Strings in Python (continued)#

Example#

Suppose a problem asks you to calculate the length of list_example (using len) and store it in a variable called length.

list_example = [5, 6, 10]
### This is the wrong way to write your answer
length = 3
### This is the right way to write your answer
length = len(list_example)
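To see why computing the answer matters, here is a small sketch that extends the example above. The appended value 42 is made up for illustration: once the list changes, a hard-coded 3 would be wrong, while len still gives the correct result.

```python
list_example = [5, 6, 10]
length = len(list_example)
print(length)  # 3

# Hypothetical change to the list: a hard-coded 3 would now be out of date,
# but recomputing with len() stays correct.
list_example.append(42)
length = len(list_example)
print(length)  # 4
```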

Review: strings#

A string is a sequence of characters. It belongs to the str type in Python.

A string stores characters as text, and is created using single ('') or double ("") quotes; multi-line strings can be created with three quotes (""" """).

print("This is a string that I'm printing.")
This is a string that I'm printing.
print("This is also a string, even though it has numbers like 2, 4, and 100 * 2.")
This is also a string, even though it has numbers like 2, 4, and 100 * 2.

Goals of this lecture#

Today, we’ll cover a variety of advanced operations we can use with strings. These are not exhaustive (Python's documentation lists many more string methods), but they will highlight the flexibility and utility of strings.

  • Modifying case (.upper(), .lower(), and .title())

  • Replacing characters (.replace())

  • Concatenating strings using +.

  • Splitting a string (.split()).

Modifying case#

Often, you’ll need to modify the case of a str (i.e., make it either upper or lower case).

  • One use-case for this is needing to compare two strings, but not caring about whether they have identical case.

  • E.g., “APplE” is the same word as “apple”, but these strings wouldn’t evaluate as equal.

"APPLE" == "apple"
False
"apple" == "apple"
True

upper and lower#

As the names imply, upper and lower are both methods that you can call on a str. Each returns a new string in the requested case.

"APPLE".lower()
'apple'
"apple".upper()
'APPLE'
"APPLE".lower() == "apple"
True

title#

The title method is a variant of upper/lower: it capitalizes the first letter of each word and makes the remaining letters lowercase.

og_string = "introduction to programming for computational social science"
og_string.title()
'Introduction To Programming For Computational Social Science'

Note that if you have capital letters after the first letter of a word, these will now become lowercase!

og_string = "CSS"
og_string.title()
'Css'
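Here is a short sketch of two places where this behavior can surprise you. The example strings are made up for illustration. Note that title treats an apostrophe as a word boundary, so the letter after it gets capitalized too.

```python
# An acronym inside a longer string loses its capitalization
course = "intro to CSS"
titled = course.title()
print(titled)  # Intro To Css

# The letter after an apostrophe is treated as the start of a new word
contraction = "they're here"
titled_contraction = contraction.title()
print(titled_contraction)  # They'Re Here
```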

Evaluating case#

Just as you can modify the case of these strings, you can also evaluate it:

  • isupper()

  • islower()

  • istitle()

These functions all check whether a string conforms to those patterns.

"CSS".isupper()
True
"CSS".islower()
False
"I Love Programming".istitle()
True

Check-in#

If you called istitle() on the following string, would it evaluate to True or False?

test_str = "I love CSS"
### Your answer/code here

Other helpful evaluation methods#

There are a few other helpful methods for evaluating properties of a string:

  • isdigit: checks if the characters are entirely digits (e.g., \(0, 1, ..., 9\))

  • isalpha: checks if the characters are entirely alphabetic characters (e.g., abcd...).

  • isspace: checks if the string consists entirely of whitespace characters (e.g., spaces, tabs, or newlines).
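As a quick illustration of these three methods, here is a sketch with made-up example strings. Note that a decimal point is not a digit, and that an empty string is not considered whitespace.

```python
# isdigit: every character must be a digit
all_digits = "2022".isdigit()
has_decimal_point = "20.22".isdigit()
print(all_digits, has_decimal_point)  # True False

# isalpha: every character must be a letter
all_letters = "apple".isalpha()
letters_and_number = "apple2".isalpha()
print(all_letters, letters_and_number)  # True False

# isspace: every character must be whitespace (and there must be at least one)
only_spaces = "   ".isspace()
empty_string = "".isspace()
print(only_spaces, empty_string)  # True False
```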

Replacing characters#

Another common operation is replacing elements of a string.

Examples:

  • In a list of filenames, replacing every - with a _.

  • Removing certain words or characters, e.g., replacing every instance of a word with an empty string ("").

This can be done with the replace function.

## Replace "-" with "_"
og_filename = "css-lab-1"
og_filename.replace("-", "_")
'css_lab_1'
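Two details are worth seeing in code. First, replace returns a new string and leaves the original unchanged, so you need to store the result if you want to keep it. Second, replacing with an empty string removes the characters entirely. The variable names new_filename and no_dashes are just illustrative.

```python
og_filename = "css-lab-1"

# replace returns a new string; og_filename itself is not modified
new_filename = og_filename.replace("-", "_")
print(new_filename)  # css_lab_1
print(og_filename)   # css-lab-1

# Replacing with an empty string removes the character
no_dashes = og_filename.replace("-", "")
print(no_dashes)  # csslab1
```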

Replacing the first \(N\) instances#

replace can also be used to replace only the first \(N\) instances of a string.

## Replace only the first instance of "bananas"
og_string = "bananas, apples, bananas, grapes"
og_string.replace("bananas", "oranges", 1)
'oranges, apples, bananas, grapes'

Check-in#

Use the replace function to replace the first 2 instances of - with _.

original_filename = "css-ps1-fa22-test.py"
### Your code here

replace is case-sensitive#

Note that replace attempts an exact match of the str you’re looking to replace.

  • This includes exact case match.

  • "apple" != "APPLE".

case_mismatch = "I like Apples"
### replace won't do anything here
case_mismatch.replace("apples", "bananas")
'I like Apples'
case_mismatch = "I like Apples"
### replace will replace it here
case_mismatch.replace("Apples", "bananas")
'I like bananas'
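One possible workaround, if you don't care about preserving the original case, is to convert the whole string to lowercase before calling replace. This is only a sketch of one option: the side effect is that every other character in the string is lowercased as well.

```python
case_mismatch = "I like Apples"

# Lowercase first so that "apples" matches regardless of the original case
lowered = case_mismatch.lower()
result = lowered.replace("apples", "bananas")
print(result)  # i like bananas
```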

Concatenating strings#

String concatenation simply means combining multiple strings.

Often, you’ll need to combine the characters in multiple strings.

  • Combining the directory path and a filename to get the full path of a file.

  • Combining parts of strings to get a valid URL.

  • Combining the first and last name of a client to print out the full name.

Approach 1: the + operator#

The + operator can be used to combine multiple str objects.

"Comput" + "ational"
'Computational'
"css1/" + "lab1/" + "file.py"
'css1/lab1/file.py'

Check-in#

What do you notice about how these strings are combined? Is a space added between each constituent str or not?

Watch out for spaces (and lack thereof)!#

By default, + will just combine two different string objects directly.

That is, "Hello" + "World" will become "HelloWorld".

If you want to add a space between these objects, make sure to add a space character in your concatenation operation.

p1 = "Hello"
p2 = "World"
p1 + " " + p2
'Hello World'

Check-in#

Why does the code below throw an error?

Bonus: What would you need to do to make it not throw an error?

2 + " cats"
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [25], in <cell line: 1>()
----> 1 2 + " cats"

TypeError: unsupported operand type(s) for +: 'int' and 'str'

Concatenating an int to a str#

The + operator only concatenates when both operands are str objects; Python will not automatically convert an int into a str. Thus, trying to combine an int with a str this way will throw an error.

However, you can use type-casting to turn the int into a str, and then combine them.

str(2) + " cats"
'2 cats'
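The same idea works when the numbers are stored in variables, which is the more common situation. In this sketch, the variable names n_cats and n_dogs are made up for illustration.

```python
n_cats = 2
n_dogs = 3

# Each int must be cast to str before it can be concatenated
message = "I have " + str(n_cats) + " cats and " + str(n_dogs) + " dogs"
print(message)  # I have 2 cats and 3 dogs

# Arithmetic happens first (on ints), then the result is cast to str
total = "Total pets: " + str(n_cats + n_dogs)
print(total)  # Total pets: 5
```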

Check-in#

Use the + operator to combine the variables below into a single string (in order, i.e., var1 followed by var2, etc.).

  • Add a space between each variable.

  • Watch out for conflicting types!

var1 = "This"
var2 = "Is"
var3 = "CSS"
var4 = 1
#### Your code here

Approach 2: using format#

The format method can also be used to merge multiple strings together.

  • This approach is less intuitive at first, but is very flexible.

  • I use this approach when I’m printing out lots of custom variable values, e.g., as in an output message.

With format, you can declare “variables” within a str using the {x} syntax.

first = "Sean"
last = "Trott"
print("Hello, {f} {l}".format(f = first, l = last))
Hello, Sean Trott
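Two further features of format are sketched below. First, format converts non-str values (like an int) for you, so no type-casting is needed. Second, if you leave the braces empty, the arguments are filled in by position. The variable n_cats is made up for illustration.

```python
n_cats = 2

# format handles the int-to-str conversion automatically
named = "I have {n} cats".format(n=n_cats)
print(named)  # I have 2 cats

# Empty braces are filled in order, left to right
positional = "Hello, {} {}".format("Sean", "Trott")
print(positional)  # Hello, Sean Trott
```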

Check-in#

Use format to print out a message that reads:

"Welcome to CSS 1".

department = "CSS"
number = "1"
#### Your code here

Approach 3: using join#

Another somewhat common use-case is joining strings that are currently stored as elements of a list.

The join syntax starts with the character (or characters) you’ll be using to join each str together.

  • This could be a space character, an underscore, or anything you want.

  • It then makes a call to .join(list_name).

separate_str = ['The', 'quick', 'brown', 'fox', 'jumped']
separate_str
['The', 'quick', 'brown', 'fox', 'jumped']
" ".join(separate_str)
'The quick brown fox jumped'
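The separator does not have to be a space. Here is a sketch using an underscore and an empty string as separators; the list parts is made up for illustration. Keep in mind that join expects every element of the list to be a str, so a list containing an int would raise a TypeError.

```python
parts = ["css", "lab", "1"]

# Join with an underscore between each element
underscored = "_".join(parts)
print(underscored)  # css_lab_1

# Join with nothing between each element
squashed = "".join(parts)
print(squashed)  # csslab1
```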

Check-in#

Use join to turn the following list of directory and sub-directory names into a full file path, connected by the "\" symbol (a backslash, which must be written as "\\" inside a Python string).

dirs = ["css", "1", "labs", "lab1"]
#### Your code here

Other approaches#

There are a number of other approaches to concatenating strings.

Personally, I primarily use:

  • The format method when I’m printing out complicated strings.

  • The + operator for everything else.
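One of those other approaches, shown here only as a brief sketch, is the f-string (available in Python 3.6 and later). Prefixing a string with f lets you put variable names directly inside the braces, with no separate call to format.

```python
first = "Sean"
last = "Trott"

# The f prefix tells Python to fill in {first} and {last} from the variables
greeting = f"Hello, {first} {last}"
print(greeting)  # Hello, Sean Trott
```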

Splitting a string#

Just as you can join parts of a list into a str, you can also split a str into a list!

Common use cases:

  • Extracting directories and sub-directories of a file path.

  • Tokenizing a sentence, i.e., retrieving all the distinct words (e.g., in English, written words are typically separated by spaces).

  • Extracting different hash-tags from a tweet (e.g., "#CSS#Programming").

example_sentence = "The quick brown fox jumped over the lazy dog"
example_sentence.split(" ")
['The', 'quick', 'brown', 'fox', 'jumped', 'over', 'the', 'lazy', 'dog']
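The other use cases work the same way with a different separator. The sketch below uses made-up example strings and highlights two details: splitting on a separator that appears at the very start of the string produces an empty first element, and calling split with no argument splits on any run of whitespace.

```python
# Splitting a file path on "/"
path = "css1/lab1/file.py"
path_parts = path.split("/")
print(path_parts)  # ['css1', 'lab1', 'file.py']

# The leading "#" produces an empty string as the first element
tags = "#CSS#Programming".split("#")
print(tags)  # ['', 'CSS', 'Programming']

# With two spaces in a row, split(" ") yields an empty string between them,
# while split() with no argument treats the run of spaces as one separator
double_space = "The  quick fox"
with_sep = double_space.split(" ")
no_sep = double_space.split()
print(with_sep)  # ['The', '', 'quick', 'fox']
print(no_sep)    # ['The', 'quick', 'fox']
```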

Check-in#

How many words (i.e., character-sequences separated by spaces) are in the sentence below?

Hint: use a combination of split and len to solve this question.

test_sentence = "This sentence has a number of different words and your goal is to count them"
### Your code here

Conclusion#

Strings are ubiquitous in Python. Now that you’ve had a brief introduction, you should have a better understanding of:

  • Indexing and looping through strings.

  • Checking str objects for various features, e.g., whether they are upper or lower-case.

  • Using split and join to convert str to and from list objects, respectively.

Coming up next, we’ll explore lists in more depth.