In today’s post, we’ll discuss the count() function in Python. We’ll also discuss alternatives to the count() function, which are more efficient if we want to count the occurrences of multiple items.
For our practice question, we’ll work on a function that returns the most frequently occurring character in a string.
Here are some topics we’ll cover in today’s post:
- What does the count() function do?
- How to use the count() function with lists
- How to use it with strings
- How to use a dictionary to count the occurrences of all characters in a string
- How to use the Counter class
Key Concepts
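The count() function is a built-in method of Python lists and strings (you call it on the object itself, as in my_list.count(item)). It returns the number of times a certain element appears in a list, or the number of times a substring appears in a string. The function behaves a bit differently for lists vs strings.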
count() for Python Lists
Let’s start with the count() function for lists. This function accepts an argument – the item to count – and returns the number of times the item appears in the list.
Let’s look at some examples:
list1 = [1, 2, 3, 2, 3, 5, 1, 8, 9]
print(list1.count(3))
print(list1.count(5))
print(list1.count(4))
print()
list2 = [[1, 2], "Hello", (2, 5), [1, 2], (2, 3)]
print(list2.count([1, 2]))
print(list2.count('Hello'))
print(list2.count((2, 3)))
Here, we first declare and initialize a list called list1 with 9 numbers. Next, we use the count() function 3 times to count the number of occurrences of 3, 5 and 4 in list1.
After working with list1, we declare and initialize another list called list2 with 5 elements – [1, 2], "Hello", (2, 5), [1, 2] and (2, 3).
We then use the count() function to count the number of occurrences of [1, 2], "Hello" and (2, 3) in list2.
If you run the code above, you’ll get the following output:
2
1
0

2
1
1
This output indicates that the number 3 appeared two times in list1, the number 5 appeared once while the number 4 appeared zero times.
For list2, the list [1, 2] appeared twice while the word 'Hello' and the tuple (2, 3) both appeared once.
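One detail worth knowing here: for lists, count() compares items using ==, so values that compare equal are counted together even if their types differ. The short sketch below illustrates this; the list name mixed is made up for this example.

```python
# List count() compares with ==, so 1, 1.0 and True all match the integer 1.
mixed = [1, 1.0, True, '1', 2]

print(mixed.count(1))    # 3 -> 1, 1.0 and True all compare equal to 1
print(mixed.count('1'))  # 1 -> the string '1' only equals itself
```

This is also why list2.count([1, 2]) finds both [1, 2] lists above: they are separate objects, but they compare equal.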
count() for Python Strings
Next, let’s look at how the count() function works for strings. When used with strings, the count() function returns the number of times a substring appears in a string. The function is case-sensitive and takes up to three arguments – the substring to count, the start position and the end position.
The start and end positions are optional. If omitted, the function looks for the substring in the entire string. Positions start from 0. For instance, for the string ‘Python’, ‘P’ is at position 0, ‘y’ is at position 1 and so on.
Let’s look at some examples:
msg = "Ho ho ho, how are you?"
print(msg.count('ho'))
print(msg.count('Ho'))
print(msg.count('e'))
If you run the code above, you’ll get the following output:
3
1
1
This output indicates that the substring 'ho' appeared three times in msg (at positions 3, 6 and 10).
The substring 'Ho', on the other hand, only appeared once (at the start of the string).
Last but not least, the letter 'e' appeared once.
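A related point that is easy to miss: for strings, count() only counts non-overlapping occurrences. The sketch below shows the difference. The helper count_overlapping() is a hypothetical function written for this illustration, not something built into Python.

```python
text = 'aaaa'

# count() moves past each match before searching again,
# so it finds 'aa' at positions 0 and 2 only.
print(text.count('aa'))  # 2


def count_overlapping(text, sub):
    """Hypothetical helper: count occurrences of sub, allowing overlaps."""
    total = 0
    pos = text.find(sub)
    while pos != -1:
        total += 1
        # Resume one character after the last match so overlaps are found.
        pos = text.find(sub, pos + 1)
    return total


print(count_overlapping(text, 'aa'))  # 3 -> matches at positions 0, 1 and 2
```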
Next, let’s look at how to use the start and end arguments. The start argument refers to the first position to be included in the count while the end argument refers to the first position to be excluded.
Let’s look at some examples:
msg = "Ho ho ho, how are you?"
print(msg.count('h'))
print(msg.count('h', 5))
print(msg.count('h', 5, 10))
Here, we first ask the count() function to count the number of occurrences of 'h' in msg (on line 2).
Next, we ask the count() function to count the number of occurrences of 'h' in msg, starting from position 5.
Finally, we ask the count() function to count the number of occurrences of 'h' in msg, starting from position 5 and ending at position 9 (not position 10). Position 10 refers to the first position that should be excluded from the count.
If you run the code above, you’ll get the following output:
3
2
1
We get 3 for the first example as 'h' appeared three times in msg, at positions 3, 6 and 10.
Next, we get 2 for the second example as this time, we ask the count() function to start counting from position 5. Hence, the first 'h' at position 3 is omitted.
Finally, we get 1 for the last example as we ask the count() function to count from positions 5 to 9. Hence, the first and last 'h' are omitted.
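The start and end arguments follow the same rules as slicing, so one way to sanity-check a call is to compare it with counting inside the equivalent slice. A small sketch using the same msg as above:

```python
msg = "Ho ho ho, how are you?"

# Counting between positions 5 and 10 (10 excluded)...
print(msg.count('h', 5, 10))   # 1

# ...gives the same result as slicing first and then counting.
print(msg[5:10].count('h'))    # 1

# As with slicing, negative positions count from the end of the string.
# The last 12 characters are 'how are you?'.
print(msg.count('h', -12))     # 1
```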
Practice Question
Now that we are familiar with the count() function, let’s work on the practice question for today.
Today, our task is to write a function called mostFreqChar() that has one parameter – msg.
This function is case sensitive and returns a tuple with the following information – the most frequently occurring character in msg (excluding spaces) and the number of times the character occurs.
For instance, if msg equals 'abbbcc', the most frequently occurring character is 'b' and it appeared three times. Hence, the function should return the tuple ('b', 3).
If there is more than one character that occurs most frequently, the function returns the first character.
For instance, if msg equals 'abbbccddd', the function returns the tuple ('b', 3).
Expected Results
To test your function, you can use the statements below:
print(mostFreqChar('abbbcc'))
print(mostFreqChar('abbbccddd'))
print(mostFreqChar('abbbccdddDD'))
print(mostFreqChar('Good morning'))
print(mostFreqChar('Max the blue eyes cat'))
You should get the following output:
('b', 3)
('b', 3)
('b', 3)
('o', 3)
('e', 4)
Line 2 in the output above shows that the function returns the first character if there is more than one most frequently occurring character.
Line 3 shows that the function is case-sensitive. Although 'd' and 'D' together occur five times, which is more than 'b', they are counted as separate characters. Hence, the function returns ('b', 3).
Suggested Solutions
There are many ways to complete the task for today.
We’ll first look at how we can complete the task using the count() function. Next, we’ll cover two other techniques that involve some new concepts.
Solution 1
def mostFreqChar(msg):
    max = 0
    result = ''

    for i in msg:
        if msg.count(i) > max and i != ' ':
            max = msg.count(i)
            result = i
    return (result, max)
This solution uses a for loop (lines 5 to 8) to iterate through msg.
For each character in msg, we apply the count() function to get the number of occurrences for that character. If the number of occurrences is greater than the current maximum occurrence (stored in max) and the character is not a space (' '), the if condition on line 6 evaluates to True and we update the value of max on line 7 and result on line 8.
For instance, suppose msg equals 'hello'. The first time the loop runs, msg.count('h') gives us 1, which is greater than the current value of max.
Hence, we update max to 1 on line 7 and result to 'h' on line 8.
The second time the loop runs, msg.count('e') equals 1, which is not greater than max. Hence, we do not execute lines 7 and 8.
The third time the loop runs, msg.count('l') equals 2, which is greater than max. Hence, we update max to 2 on line 7 and result to 'l' on line 8.
This keeps repeating until we finish iterating through the entire string.
When that happens, we exit the for loop and return the tuple (result, max) on line 9.
With that, Solution 1 is complete.
This solution should be quite easy to understand. We simply use a for loop to iterate through msg and apply the count() function to each character in msg.
However, the solution is not very efficient. This is because each time the count() function is called, Python needs to iterate through msg to do the counting.
For instance, suppose the length of msg is 20. Using the for loop in Solution 1, we need to call the count() function at least 20 times, once for each character.
Each time the count() function is called, Python scans all 20 characters of msg to do the counting, so the work adds up to roughly 20 × 20 = 400 character comparisons. This cost grows with the square of the length of msg, which can lead to a significant performance lag if msg is very long.
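One way to soften this cost while still using count() is to call it once per distinct character instead of once per position. The sketch below is a hypothetical variant of Solution 1 (the name most_freq_char_unique is made up for this example). It relies on dict.fromkeys() to keep each character once, in order of first appearance, so ties are still resolved in favour of the first character.

```python
def most_freq_char_unique(msg):
    """Hypothetical variant of Solution 1: one count() call per distinct character."""
    best_char, best_count = '', 0

    # dict.fromkeys(msg) keeps each character once, in order of first appearance.
    for ch in dict.fromkeys(msg):
        if ch == ' ':
            continue  # spaces are excluded, as the question requires
        n = msg.count(ch)
        if n > best_count:  # strict '>' keeps the first character on ties
            best_char, best_count = ch, n
    return (best_char, best_count)


print(most_freq_char_unique('abbbccddd'))  # ('b', 3)
```

This still scans msg once for every distinct character, so it helps most when msg is long but uses few different characters.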
Let’s look at another solution that is more efficient. This solution does not use the count() function.
Solution 2
def mostFreqChar(msg):
    msg = msg.replace(' ', '')
    result = {}

    for i in msg:
        if i not in result:
            result[i] = 1
        else:
            result[i] += 1

    max1 = max(result, key=lambda x:result[x])
    return (max1, result[max1])
Here, we first use the built-in replace() function to replace all the spaces in msg with ''. This results in all the spaces in msg being removed. We need to remove spaces as the practice question requires us to exclude spaces when finding the most frequently occurring character in msg.
After removing the spaces, we declare an empty dictionary called result.
Next, we use a for loop to iterate through msg. Inside the for loop, we check if the current character is already in the result dictionary (on line 6). If it is not, we add the character as a key to the dictionary, with a value of 1.
If the character is already in the dictionary, we increment the value of the character by 1 (on line 9).
For instance, suppose the string is 'aab'. The first time the loop runs, the character 'a' is not in the result dictionary (since result is an empty dictionary at this point).
Hence, the if condition on line 6 evaluates to True and we add the pair 'a':1 to the result dictionary. As a result, result equals {'a': 1}.
The second time the loop runs, the character 'a' is already in result. Hence, the if condition on line 6 evaluates to False and the else block is executed. As a result, result becomes {'a':2}.
Finally, the last time the loop runs, the character 'b' is not in result and we add the pair 'b':1 to the dictionary. Hence, result becomes {'a':2, 'b':1}.
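After we finish looping through msg, we use the built-in max() function to get the key with the largest value in the dictionary. The key argument tells max() to compare the keys by their counts (result[x]) rather than by the characters themselves. If several keys share the largest count, max() returns the first one it encounters. Since dictionaries preserve insertion order (in Python 3.7 and later), that is the character that appears first in msg. You can refer to the previous post for details on how the max() function works.
Finally, we return a tuple with the key (max1) and value (result[max1]) of the most frequently occurring character.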
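The if/else inside the loop can also be written with the dictionary's get() method, which returns a default value when the key is missing. The sketch below is one possible variation, not a replacement for Solution 2. The function name most_freq_char_get and the choice to return ('', 0) for an empty or all-space msg are assumptions made for this example (max() raises a ValueError when given an empty dictionary).

```python
def most_freq_char_get(msg):
    """Hypothetical variant of Solution 2 that counts with dict.get()."""
    counts = {}
    for ch in msg.replace(' ', ''):
        # get(ch, 0) returns the current count, or 0 if ch is not a key yet.
        counts[ch] = counts.get(ch, 0) + 1

    if not counts:
        # Assumption: mirror Solution 1, which returns ('', 0) in this case.
        return ('', 0)

    # counts.get plays the same role as lambda x: counts[x].
    top = max(counts, key=counts.get)
    return (top, counts[top])


print(most_freq_char_get('Good morning'))  # ('o', 3)
```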
Solution 3
Finally, let’s look at one last solution for today’s question. This solution uses a built-in Python class called Counter, which is in the collections module.
The Counter class does something very similar to the for loop in Solution 2.
For instance, if you write
from collections import Counter
result = Counter('aab')
print(result)
you’ll get the following output:
Counter({'a': 2, 'b': 1})
From the output above, you can see that result is a Counter object, which is actually a dictionary. This is because the Counter class is a subclass of the dict class.
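To make the dictionary-like behaviour concrete, here is a short sketch. One difference from a plain dictionary is that looking up a missing key in a Counter gives 0 instead of raising a KeyError.

```python
from collections import Counter

result = Counter('aab')

print(isinstance(result, dict))  # True -> Counter is a subclass of dict
print(result['a'])               # 2
print(result['z'])               # 0 -> missing keys count as zero, no KeyError
print(list(result.items()))      # [('a', 2), ('b', 1)]
```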
We can pass this Counter object to the max() function, similar to what we did in Solution 2.
Here’s how the Counter class can be used to solve today’s question:
from collections import Counter

def mostFreqChar(msg):
    msg = msg.replace(' ', '')
    result = Counter(msg)
    max1 = max(result, key=lambda x:result[x])
    return (max1, result[max1])
On line 1, we first import the Counter class.
Next, we have the mostFreqChar() function, which is very similar to the function in Solution 2.
The only difference is we replaced lines 3 to 9 in Solution 2 with a single line (line 5) in Solution 3. Other than that, the two solutions are identical.
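As a possible further shortcut, the Counter class also has a most_common() method that returns (item, count) pairs sorted from most to least frequent. The sketch below is a hypothetical variant (the name most_freq_char_common is made up here). It assumes Python 3.7 or later, where items with equal counts are ordered by first appearance, and it assumes we want ('', 0) when msg contains no countable characters.

```python
from collections import Counter


def most_freq_char_common(msg):
    """Hypothetical variant of Solution 3 using Counter.most_common()."""
    counts = Counter(msg.replace(' ', ''))

    # most_common(1) returns a list with at most one (character, count) tuple.
    top = counts.most_common(1)

    # Assumption: return ('', 0) when there is nothing to count.
    return top[0] if top else ('', 0)


print(most_freq_char_common('Max the blue eyes cat'))  # ('e', 4)
```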