Throughout the course, you've seen me refer to strings ... variables declared like this:
x = 'Hello'  # x is a string

It's time to start thinking about what we can do with strings.
You can compare two strings with ==.
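As a small illustration of that comparison (the variable names below are made up for the example), == checks the strings character by character, so it is case-sensitive:

```python
# Hypothetical variables, used only for this illustration
first = 'Hello'
second = 'Hello'
third = 'hello'

same = (first == second)                          # identical characters
different_case = (first == third)                 # 'H' and 'h' are different characters
ignoring_case = (first.lower() == third.lower())  # compare after lowercasing both

print(same)
print(different_case)
print(ignoring_case)
```

Lowercasing both sides first is one simple way to compare without caring about case.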
We should start by not thinking about a string as just a simple variable.
A string is an array of characters. In Python a string is an object.
An array is a data structure that holds a series of variables with an index. If we think of a variable like a jar with a label and something inside it, it might look like this:
But what about arrays? Think of a series of jars, with one label and an index. Each jar is holding a value.
A string is an array of characters. The string 'HELLO' is actually a series of individual characters 'H','E','L','L','O'.
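Here is a minimal sketch of that idea. The names word and letters are only for illustration; the point is that each character of the string can be reached by its index or visited in a loop:

```python
word = 'HELLO'

print(word[0])    # the first character, at index 0
print(len(word))  # how many characters the string holds

letters = list(word)  # break the string into a list of single characters
print(letters)

# Visit every character together with its index
for position, character in enumerate(word):
    print(position, character)
```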
Once we begin to think of a word, or a sentence, or a paragraph as a series of individual characters, we can begin to have fun with strings.
A substring is a string that occurs inside another string. A common task is to check whether one string is a substring of another.
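A quick sketch of checking for a substring, assuming a made-up sentence. The in operator answers yes or no, and find() reports where the substring starts:

```python
sentence = 'the quick brown fox'  # hypothetical example text

has_quick = 'quick' in sentence    # True: 'quick' appears in the sentence
has_slow = 'slow' in sentence      # False: 'slow' does not appear
position = sentence.find('quick')  # index where 'quick' begins
missing = sentence.find('slow')    # find() returns -1 when nothing is found

print(has_quick, has_slow, position, missing)
```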
This is the best point in the course to read the Python documentation. Specifically I mean The Python Standard Library, Chapter 4. It contains all the string functions you need to know for string manipulation.
I've gone through the Python string library and picked out some functions that may interest you. Just be aware that I am using most of these functions in their simplest form, and optional parameters are not being used. Please read the documentation before you reinvent the wheel.
In the following examples, I'm dumping the output of the string function to a print() so that you can see it.
# Build strings for later use
mystr = 'abcbdefghija'
mynum = '1234'
mytitle = 'This Is My Title'
dot = '.'
letters = ('A', 'B', 'C')
mess = ' asdasdas dd '
intab = "aeiou"
outtab = "12345"
sample = "this is string example....wow!!!"  # named sample so the built-in str is not shadowed
messy = 'Change The Case'
commaland = 'a,b,c,d,e,f'

print(mystr.capitalize())     # Capitalize first character
print(mystr.center(20, '$'))  # Centre text. Pad with the provided character
print(mystr.count('a'))       # Count occurrences of a substring
print(mystr.endswith('ija'))  # Check suffix (ending), return Boolean
print(mystr.find('b'))        # Find and return first occurrence of substring, if substring not found return -1
print('12' in '1234')         # Use instead of find() to verify the substring
print(mynum.isalnum())        # Return true if all characters in the string are alphanumeric and there is at least one character, false otherwise.
print(mystr.isalpha())        # Return true if all characters in the string are alphabetic and there is at least one character, false otherwise.
print(mynum.isdigit())        # Return true if all characters in the string are digits and there is at least one character, false otherwise.
print(mytitle.istitle())      # Return true if the string is a titlecased string and there is at least one character, false otherwise.
print(mytitle.isupper())      # Return true if all cased characters in the string are uppercase and there is at least one cased character,
                              # false otherwise.
print(dot.join(letters))      # Return a string, which is the concatenation of the strings in the sequence.
print(mytitle.lower())        # Return a copy of the string with all the cased characters converted to lowercase.
print(mess.lstrip())          # Return a copy of the string with leading characters removed.
                              # The chars argument is a string specifying the set of characters to be removed.
                              # If omitted or None, the chars argument defaults to removing whitespace.
                              # The chars argument is not a prefix; rather, all combinations of its values are stripped

# maketrans() builds the translation table, translate() applies the table
print(sample.translate(str.maketrans(intab, outtab)))
print(mystr.replace('a', 'Q'))  # Return a copy of the string with the substrings replaced
print(mystr.rfind('b'))         # Find and return last occurrence of substring, if substring not found return -1
print(mess.rstrip())            # Return a copy of the string with trailing characters removed.
                                # The chars argument is a string specifying the set of characters to be removed.
                                # If omitted or None, the chars argument defaults to removing whitespace.
                                # The chars argument is not a suffix; rather, all combinations of its values are stripped
print(commaland.split(','))     # Return a list of the words in the string, split on the delimiter if one is given
print(mystr.startswith('abc'))  # Check prefix (beginning), return Boolean
print(mystr.strip('ab'))        # Return a copy of the string with the leading and trailing characters removed
print(messy.swapcase())         # Return a copy of the string with uppercase characters converted to lowercase and vice versa.
print(sample.title())           # Return a titlecased version of the string where words start with an uppercase character
                                # and the remaining characters are lowercase.
print(sample.upper())           # Return a copy of the string with all the cased characters converted to uppercase.

# FUN STRING STUFF NOT DIRECTLY PART OF THE STRING LIBRARY
print(len(mystr))        # len() returns the length of an object. Strings in Python are objects.
                         # It is a built-in function, not a string method.
print('*' * 10)          # use the multiply operator for printing repeated characters
print(messy + mystr)     # use the concatenation operator for joining two strings
print(":".join(mystr))   # provide join() a sequence and separate it with the string
                         # calling join. In this case a ':' string called join()

This is just a glimpse of the Python string library.
Knowing it will help you to not reinvent the wheel when you try to solve string based problems.
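The examples in this lesson call each method in its simplest form. As a rough sketch of what the optional parameters mentioned earlier can do, here are three methods from the list with one extra argument each (the result variable names are made up for the illustration):

```python
mystr = 'abcbdefghija'
commaland = 'a,b,c,d,e,f'

second_b = mystr.find('b', 2)          # start searching at index 2, skipping the first 'b'
limited = commaland.split(',', 2)      # split at most 2 times, leave the rest together
one_swap = mystr.replace('a', 'Q', 1)  # replace only the first 'a'

print(second_b)
print(limited)
print(one_swap)
```

The documentation lists the optional parameters for every method, so it is worth a look before writing your own workaround.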
Python can slice a string (or any other sequence) if you tell it where to start and where to stop.
word = 'sesquipedalian'
i = 0
print(word[i])     # subscript (indexing) of the string
print(word[1])     # positive moves to the right
print(word[-1])    # negative moves to the left, supercool
print(word[i:10])  # slice
print(word[0:2])   # slice characters from position 0 (included) to 2 (excluded)
print(word[2:5])   # slice characters from position 2 (included) to 5 (excluded)
print(word[:2])    # slice characters from the beginning to position 2 (excluded)
print(word[4:])    # slice characters from position 4 (included) to the end
print(word[-2:])   # characters from the second-last (included) to the end

One way to remember how slices work is to think of the indices as pointing between characters, with the left edge of the first character numbered 0. Then the right edge of the last character of a string of n characters has index n, for example:
 +---+---+---+---+---+---+
 | P | y | t | h | o | n |
 +---+---+---+---+---+---+
 0   1   2   3   4   5   6
-6  -5  -4  -3  -2  -1

The first row of numbers gives the position of the indices 0...6 in the string; the second row gives the corresponding negative indices. The slice from i to j consists of all characters between the edges labeled i and j, respectively.
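To check the "edges" picture, here is a short sketch using the same word, 'Python'. The variable names are only for illustration. Cutting the string at any edge i and gluing the two pieces back together gives the original string:

```python
word = 'Python'
i = 2

left = word[:i]   # everything before edge 2
right = word[i:]  # everything from edge 2 onward
rejoined = left + right

middle = word[1:4]          # between edges 1 and 4
near_the_end = word[-3:-1]  # between edges -3 and -1
past_the_end = word[4:42]   # a slice end beyond the string is handled gracefully

print(left, right, rejoined)
print(middle, near_the_end, past_the_end)
```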
Python deals with data structures in several ways: lists, tuples, and dictionaries.
We said that strings were really an array (or collection) of characters.
The term array in computer science means a collection of like datatypes which are indexed. Python doesn't care about the 'like datatypes' bit!
We looked at arrays when we talked about strings.
In Python, the concept of arrays is subdivided into several parts. Python has several data structures to deal with arrays.
caf_specials = ['breakfast sandwich', 'milk', 'rice', 'pizza', 'salad', 'fish and chips']

Well, now you can begin to worry about it!
Just kidding ... lists are easy to understand. They hold a series of data, and they have an index so you can access the data. In Python, you can mix the datatypes in a list.
list1 = ['dogs', 'cats', 2, 1]  # a mixed list of strings and numbers
list2 = [1, 2, 3, 4, 5]         # a list of numbers
list3 = ['a', 'b', 'c', 'd']    # a list of strings

print(list1[0])       # read from the list using a hard-coded value
i = 2
print(list1[i])       # read from the list using a variable
list1[3] = 'hamster'  # write to the list using a hard-coded value
list1[i] = 'bird'     # write to the list using a variable
print(list1)          # this gives us proof of output, but it might not be in a user friendly format.
                      # it still looks like a list.

Please refer to the slicing note!
Lists start at index 0. They support indexing, slicing, + (concatenation), * (repetition), and negative number indexing.
list3 = ['a', 'b', 'c', 'd']
list2 = [1, 2, 3, 4, 5]
list3 = list3 * 2      # repetition
print(list3)
list3 = list3 + list2  # concatenation
print(list3)
print(list3[0:5])      # slicing
print(list2[-1])       # indexing, specifically: negative number indexing

list3 = ['a', 'b', 'c', 'd']
list2 = [1, 2, 3, 4, 5]
del list3[0]           # Remove an item or a slice from a list
print(list3)
print(len(list2))      # length of list
print('b' in list3)    # check for membership
for i in list3:        # use a for loop to visit each element
    print(i)

list3 = ['a', 'b', 'c', 'd']
list2 = [1, 2, 3, 4, 5]
list4 = ['z', '2', 'g', '1', 'a']
list3.append('qq')           # add an item to the end of the list
print(list3)
list3.extend(['ww', 'tt'])   # add a series of elements to the list
print(list3)
list3.insert(4, 'yyyyy')     # Insert an item at a given position.
print(list3)                 # The first argument is the index of the
                             # element before which to insert
list3.remove('ww')           # Remove the first item from the list that matched
print(list3)
list3.pop(0)                 # Remove the item at the given position in the list, and return it.
print(list3)                 # If no index is specified, a.pop() removes and returns the last item in the list.
list2.clear()                # Remove all items. Equivalent to del list2[:]
print(list2)
print(list3.index('qq'))     # Return the index of the first item that matches
print(list3.count('qq'))     # Return the number of times item appears
list4.sort()                 # Sort the items of the list. Items must be of like type
print(list4)
list4.reverse()              # Reverse the elements of the list
print(list4)
list2 = list4.copy()         # shallow copy of list ... yes, there's a deepcopy() too
print(list2)

# This looks like a copy, and it works, but not the way you might think
foo = [1, 2, 3]
boo = foo
boo.pop()
print(boo)
print(foo)
# If you manipulate either foo or boo, they'll both change because they are referencing the same (one, singular, 1) list

# use copy() to make a shallow copy of your list
original = [1, 2, 3]
clone = original.copy()
clone.pop()
print(clone)
print(original)

Understanding Lists in Python 3 is also a good resource.
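The list examples mention that there is a deepcopy() too. As a small sketch of when it matters (the nested list here is a made-up example), a shallow copy of a list that contains other lists still shares those inner lists, while copy.deepcopy() from the standard copy module copies them as well:

```python
import copy

nested = [[1, 2], [3, 4]]     # a list that holds other lists
shallow = nested.copy()       # new outer list, same inner lists
deep = copy.deepcopy(nested)  # new outer list and new inner lists

nested[0].append(99)          # change one of the inner lists

print(shallow[0])  # the shallow copy sees the change
print(deep[0])     # the deep copy does not
```

For a flat list of numbers or strings, copy() is usually all you need.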
Tuples are a data structure that cannot be changed. Think of a tuple as a permanent (immutable) list. Tuples are covered in the Python Documentation.
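To see what "cannot be changed" means in practice, here is a minimal sketch with a made-up tuple called point. Trying to assign to an element raises a TypeError, which the example catches so the program keeps running. Reading from a tuple works just like reading from a list:

```python
point = (22, 4, 99)  # hypothetical tuple for illustration

try:
    point[0] = 5     # tuples do not support item assignment
    changed = True
except TypeError as error:
    changed = False
    print('Cannot change a tuple:', error)

print(point[0])                # reading by index is fine
first, second, third = point   # unpack the tuple into three variables
print(first, second, third)
```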
myList = [22, 4, 99]   # build a list, elements can change
myTuple = (22, 4, 99)  # build a tuple, elements CANNOT change

# I want to get input, and hold it in a list.
myList[0] = input('Change the value in the first element of the list: ')
print(myList)  # see, it changed

# YOU CANNOT CHANGE A TUPLE: myTuple[0] = input('change a value: ')  # CRASH!!

myNewTuple = tuple(myList)   # CAST myList INTO a TUPLE
print(myNewTuple)            # print the Tuple
someList = list(myNewTuple)  # CAST myNewTuple INTO a LIST
print(someList)              # print the LIST

Dictionaries are a data structure that stores values in pairs. Think of dictionaries as data with a relationship, like a country and its capital ... or a friend and their phone number. Dictionaries are covered in the Python Documentation.
contacts = {'bob': 6471234567, 'sally': 6477778888}  # build a dictionary
# you can also build dictionaries using the dict() constructor. Check out the API for details
print(contacts)  # print the dictionary, notice the unfriendly output

# use a for loop to make the output more readable
for name, tel in contacts.items():  # use items() to access the key and value
    print(name, tel)

contacts['jojo'] = 6475556666  # add a key and value to the dictionary
print(contacts)
aListofKeysOnly = list(contacts.keys())  # Just the keys, put in a list
print(aListofKeysOnly)
aListofValuesOnly = list(contacts.values())  # Just the values, put in a list
print(aListofValuesOnly)
del contacts['bob']  # where did bob go?
print(contacts)

There's more functionality within the short tutorial in the documentation. Check it out.
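As a short sketch of a few of those extras, here is the same contact data built with the dict() constructor, followed by two safe ways to look up a key that might not exist. The name 'zed' and the fallback text are made up for the example:

```python
contacts = dict(bob=6471234567, sally=6477778888)  # keyword arguments become keys

has_bob = 'bob' in contacts                   # membership test checks the keys
sally_tel = contacts['sally']                 # direct lookup by key
zed_tel = contacts.get('zed', 'not found')    # get() returns a fallback instead of crashing

print(has_bob, sally_tel, zed_tel)

# Print the contacts in alphabetical order of name
for name in sorted(contacts):
    print(name, contacts[name])
```

Using contacts['zed'] directly would raise a KeyError, so get() is handy when a key may be missing.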