How to Find Intersection of Two Lists in Python


There are three main methods to find the intersection of two lists in Python:

  1. Using sets
  2. Using list comprehension
  3. Using the built-in filter() function

This post explains the code for each method and discusses various factors to consider when deciding which method to use.

For our practice question, we’ll work on a function that finds the intersection of two lists with duplicate items in each list. For instance, if we pass the lists [1, 1, 1, 2, 3, 3, 5] and [1, 1, 3, 6, 7] to the function, it should return the list [1, 1, 3].

Using sets to find intersection of two lists in Python

The first method to find the intersection of two lists in Python is to convert the lists to sets and use either the & operator or the built-in intersection() method.

But first, what is a set?

A set is similar to a list, in that it can be used to store a collection of items (such as a collection of numbers).

For instance, we can create a set to store the numbers 1, 22, 43, 64 and 57.

To do that, we use curly braces, as shown in the example below:

mySet = {1, 22, 43, 64, 57}

As you can see, creating a set is very similar to creating a list, except that the former uses curly braces while the latter uses square brackets.

However, there are a few key differences between a set and a list, as explained in the table below:

Key differences between a set and a list

| List | Set |
| --- | --- |
| Items are ordered (i.e. the order of the items is important). Hence, [1, 2, 3, 4] and [2, 3, 1, 4] are considered to be two different lists. | Items are not ordered (i.e. the order of the items is disregarded). Hence, {1, 2, 3, 4} and {2, 3, 1, 4} are considered to be the same set. |
| Items need not be distinct. [1, 1, 1, 2, 3] is all right. | Items must be distinct. Writing {1, 1, 1, 2, 3} does not raise an error; Python silently drops the duplicates and stores {1, 2, 3}. |
| Items need not be hashable*. | Items must be hashable*. Examples of items that are hashable include integers, floating-point numbers and strings. Examples of items that are not hashable include nested lists and dictionaries. |

  * Hashable items are items whose hash value never changes during their lifetime. A full discussion of this is beyond the scope of this tutorial. If you are interested, you can check out the official documentation at https://docs.python.org/3/glossary.html#term-hashable.
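The differences between a list and a set can be checked directly in the interpreter. The snippet below is a minimal sketch; the variable names are made up for illustration. Note that duplicates in a set literal are dropped silently, while an unhashable item raises a TypeError.

```python
# Order matters for lists but not for sets.
lists_equal = [1, 2, 3, 4] == [2, 3, 1, 4]
sets_equal = {1, 2, 3, 4} == {2, 3, 1, 4}
print(lists_equal, sets_equal)  # False True

# Duplicates are kept in a list but collapsed in a set.
dup_list = [1, 1, 1, 2, 3]
dup_set = {1, 1, 1, 2, 3}
print(len(dup_list), len(dup_set))  # 5 3

# Unhashable items (such as a nested list) cannot be stored in a set.
try:
    bad_set = {1, 2, [3, 4]}
except TypeError as err:
    error_message = str(err)
    print(error_message)  # the message mentions: unhashable type: 'list'
```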

To find the intersection of two lists in Python, we can convert the lists to sets and use the built-in & operator or the intersection() method.

Let’s look at some examples:

Converting lists to sets and using the & operator

list1 = [1, 2, 3, 4, 5]
list2 = [3, 4, 5, 6, 7]

set1 = set(list1)
set2 = set(list2)

intersect = list(set1 & set2)

print(intersect)

Here, we first declare two lists – list1 and list2 – on lines 1 and 2.

Next, on lines 4 and 5, we use the set() function to convert the lists to sets.

On line 7, we use the & operator to find the intersection between the two sets. This operator returns a new set with elements common to both set1 and set2. We pass this resulting set to the list() function to convert it back to a list and assign the resulting list to intersect.

Finally, we print the value of intersect on line 9.

If you run the code above, you’ll get the following output:

[3, 4, 5]
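Because sets are unordered, Python does not guarantee the order of the items in list(set1 & set2). If a predictable order matters, one option is to sort the result. This is a sketch rather than part of the original example, and it assumes the items can be compared with one another (as integers can).

```python
list1 = [1, 2, 3, 4, 5]
list2 = [3, 4, 5, 6, 7]

# sorted() accepts any iterable and always returns a new list,
# so a separate call to list() is not needed here.
intersect_sorted = sorted(set(list1) & set(list2))

print(intersect_sorted)  # [3, 4, 5]
```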

Converting lists to sets and using the intersection() method

Next, let’s look at an example that uses the intersection() method.

While the & operator can only be used on two sets, the intersection() method also accepts other types of iterables, such as a list or tuple, as arguments.

Its syntax is as follows:

<name of set>.intersection(<names of iterables>)

Let’s look at an example:

list1 = [1, 2, 3, 4, 5]
list2 = [3, 4, 5, 6, 7]

set1 = set(list1)

intersect = list(set1.intersection(list2))

print(intersect)

Here, we only convert list1 to a set. This is because we can pass a list (e.g. list2) directly to the intersection() method, without having to convert the list to a set.

We do that on line 6, where we use set1 to call the intersection() method and pass list2 as an argument to the method.

This method returns a set. We convert the set back to a list using the list() function, and assign the result to intersect.

Finally, we print the value of intersect on line 8.

If you run the code above, you’ll get the following output:

[3, 4, 5]
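To illustrate the difference between the & operator and the intersection() method, the sketch below tries both with a list on the right-hand side. The name tuple3 is a hypothetical third collection added for illustration; it also shows that intersection() can take more than one iterable.

```python
set1 = {1, 2, 3, 4, 5}
list2 = [3, 4, 5, 6, 7]
tuple3 = (4, 5, 8)  # hypothetical third collection, for illustration only

# The & operator needs a set on both sides, so set & list raises TypeError.
try:
    set1 & list2
except TypeError:
    operator_failed = True
else:
    operator_failed = False

# intersection() accepts one or more iterables of any type.
common = set1.intersection(list2, tuple3)

print(operator_failed)  # True
print(sorted(common))   # [4, 5]
```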

Error in converting lists to sets

Last but not least, let’s look at an example where we are unable to convert a list to a set. That happens when one or more of the items in the list is not hashable. For instance, if the list has a nested list (which is not hashable), we’ll get an error when we try to convert the list to a set.

list1 = [1, 2, 3, 4, 5, [7, 8, 9]]

set1 = set(list1)

If you run the code above, you’ll get the following error:

Traceback (most recent call last):
  File "...", line ..., in <module>
    set1 = set(list1)
TypeError: unhashable type: 'list'
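One possible workaround, if the nested lists are only one level deep, is to convert each inner list to a tuple (which is hashable) before building the set. The sketch below works under that assumption; the helper name to_hashable and the second list are made up for this example. The list comprehension approach in the next section avoids the conversion altogether.

```python
def to_hashable(item):
    # Hypothetical helper: turn a flat inner list into a tuple,
    # and leave every other item unchanged.
    return tuple(item) if isinstance(item, list) else item

list1 = [1, 2, 3, 4, 5, [7, 8, 9]]
list2 = [5, [7, 8, 9], 10]

set1 = {to_hashable(x) for x in list1}
set2 = {to_hashable(x) for x in list2}

common = set1 & set2
print(common == {5, (7, 8, 9)})  # True
```

Note that the inner list comes back as the tuple (7, 8, 9), not as a list.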

Using list comprehension

In the previous section, we saw how we can convert lists to sets and use the & operator or intersection() method to find the intersection of two lists in Python.

However, we also saw how that can fail when our lists contain unhashable items (such as a nested list).

In cases like that, we can use list comprehension to find the intersection of two lists. This does not require us to convert any list to a set.

To do that, we first need to understand how list comprehension works. If you are unfamiliar with list comprehension, you may want to review it before continuing.

To recap, one possible way to use list comprehension is to use the following syntax:

[<item to add to new list> <for loop to iterate through existing list> <if condition>]

Using list comprehension to create a new intersection list

Suppose we have two lists – list1 and list2.

If we want to add items in list2 to a new list called intersect, only when the item in list2 is also in list1, we can use the code below:

list1 = [1, 2, 3, 4]
list2 = [3, 4, 5, 6]

intersect = [x for x in list2 if x in list1]
print(intersect)

Line 4 in the code above says that for each item in list2 (for x in list2), add the item (x) to intersect if the item is in list1 (if x in list1).

If you run the code above, you’ll get the following output:

[3, 4]

This gives us the intersection of the two lists – list1 and list2.
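For comparison, the same list comprehension can be written out as an ordinary for loop. This is only an illustrative expansion; the name intersect_loop is made up for this sketch.

```python
list1 = [1, 2, 3, 4]
list2 = [3, 4, 5, 6]

intersect_loop = []
for x in list2:                   # for x in list2
    if x in list1:                # if x in list1
        intersect_loop.append(x)  # add x to the new list

print(intersect_loop)  # [3, 4]
```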

Using list comprehension to find intersection of two lists with nested lists

Next, let’s look at another example where the lists contain nested lists.

list1 = [1, 2, [12, 5], [6, 7], 9, 10]
list2 = [2, [6, 7], 15, 17]

intersect2 = [x for x in list2 if x in list1]

print(intersect2)

If you run the code above, you’ll get the following output:

[2, [6, 7]]

This shows that list comprehension works even with nested lists.
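The same idea should carry over to other unhashable items, such as dictionaries, because the in operator relies on equality rather than on hashing. The lists below are made up for illustration.

```python
list1 = [{'id': 1}, {'id': 2}, 'a', 3]
list2 = [3, {'id': 2}, 'b']

# No set is created, so unhashable dictionaries cause no error.
intersect_dicts = [x for x in list2 if x in list1]

print(intersect_dicts)  # [3, {'id': 2}]
```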

Using the built-in filter() function

Last but not least, let’s look at the third method to find the intersection of two lists in Python – using the filter() function.

The filter() function is a built-in function in Python that accepts two arguments – a function that defines the criteria for filtering and an iterable to be filtered.

Suppose we have a list called numbers, defined as follows:

numbers = [1, 2, 3, 4, 5, 6, 7, 8]

If we want to keep only the even numbers in the list, we can define a function that returns True when it is passed an even number.

We can then pass this function and the numbers list to the filter() function.

Let’s look at an example of how this works.

numbers = [1, 2, 3, 4, 5, 6, 7, 8]

def isEven(x):
    if x%2 == 0:
        return True

even_numbers = list(filter(isEven, numbers))

print(even_numbers)
print(numbers)

Here, we first define the numbers list on line 1.

Next, we define a function called isEven() that accepts one argument x and returns True if x is even (i.e. if x gives a remainder of zero when divided by 2).

Next, we call the filter() function on line 7, passing the function name (isEven) and the iterable name (numbers) to the function.

The filter() function returns a filter object which can be converted to a list, using the built-in list() function.

We do that on line 7 and assign the resulting list to a variable called even_numbers.

Finally, we print the values of even_numbers and numbers on lines 9 and 10.

If you run the code above, you’ll get the following output:

[2, 4, 6, 8]
[1, 2, 3, 4, 5, 6, 7, 8]

even_numbers only contains even numbers as these are the items in numbers that “passed” the filter criteria, as defined by the isEven() function.

numbers, on the other hand, is not changed after we pass it to the filter() function.
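One detail worth knowing is that the filter object is a lazy iterator: it produces its items only when something (such as list()) asks for them, and it can be consumed only once. The sketch below illustrates this; is_even is a variant of isEven() written for this example that returns True or False explicitly.

```python
numbers = [1, 2, 3, 4, 5, 6, 7, 8]

def is_even(x):
    # Returns True or False explicitly for every input.
    return x % 2 == 0

evens = filter(is_even, numbers)  # nothing has been filtered yet

first_pass = list(evens)   # consumes the iterator
second_pass = list(evens)  # the iterator is already exhausted

print(first_pass)   # [2, 4, 6, 8]
print(second_pass)  # []
print(numbers)      # [1, 2, 3, 4, 5, 6, 7, 8]
```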

As mentioned previously, we can use the filter() function to find the intersection of two lists in Python. Let’s look at some examples.

Using the filter() function to find intersection of two lists in Python

list1 = [1, 2, 3, 4]
list2 = [3, 4, 5, 6]

def intersectList1(x):
    if x in list1:
        return True
    
intersect = list(filter(intersectList1, list2))

print(intersect)

Here, we first define a function called intersectList1() on lines 4 to 6. This function returns True if x is in list1.

Next, we use the function to filter list2 on line 8. We also pass the resulting filter object to the list() function to convert it to a list.

Finally, we print the value of the resulting list on line 10.

If you run the code above, you’ll get the following output:

[3, 4]

Using the filter() function with a lambda function

Next, let’s look at one more example of using the filter() function to find the intersection of two lists in Python. This time, instead of passing a named function to filter(), we’ll pass a lambda function. If you are unfamiliar with lambda functions, you may want to review how they work before continuing.

list1 = [1, 2, 3, 4]
list2 = [3, 4, 5, 6]

intersect = list(filter(lambda x: x in list1, list2))

print(intersect)

Here, the lambda function is

lambda x: x in list1

The function evaluates the expression x in list1 for each input x and returns True if x is in list1 (and False otherwise).

We pass this lambda function to the filter() function on line 4 to filter the items in list2.

If you run the code above, you’ll get the following output:

[3, 4]

Not surprisingly, this example gives the same output as the preceding example as its lambda function does the same thing as the intersectList1() function in the preceding example.
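As a quick check, the filter() version and the list comprehension from the earlier section should give the same result for the same inputs. The variable names below are chosen for this sketch.

```python
list1 = [1, 2, 3, 4]
list2 = [3, 4, 5, 6]

with_filter = list(filter(lambda x: x in list1, list2))
with_comprehension = [x for x in list2 if x in list1]

print(with_filter == with_comprehension)  # True
```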

Which Method to Use?

Now that we’ve covered three different methods to find the intersection of two lists in Python, let’s discuss some factors to consider when choosing which method to use.

The first factor to consider is whether your lists contain unhashable items (such as a nested list).

If it does, you’ll have to use either list comprehension or the filter() function.

The second factor to consider is whether your lists have duplicate items. If they do and you want the intersection to account for these duplicates, none of the methods above is ideal.

Let’s look at some examples:

list1 = [1, 2, 3, 3, 5, 5, 6, 7, 8]
list2 = [1, 3, 3, 3, 5, 6, 6, 9]

# Example 1: Using sets

set1 = set(list1)
set2 = set(list2)

intersect_1 = list(set1 & set2)
intersect_2 = list(set1.intersection(list2))
print('Using sets')
print(intersect_1)
print(intersect_2)

# Example 2: Using list comprehension

intersect_3 = [x for x in list1 if x in list2]
intersect_4 = [x for x in list2 if x in list1]

print('Using list comprehension')
print(intersect_3)
print(intersect_4)

# Example 3: Using filter()

intersect_5 = list(filter(lambda x : x in list1, list2))
intersect_6 = list(filter(lambda x : x in list2, list1))

print('Using filter function')
print(intersect_5)
print(intersect_6)

If you run the code above, you’ll get the following output:

Using sets
[1, 3, 5, 6]
[1, 3, 5, 6]
Using list comprehension
[1, 3, 3, 5, 5, 6]
[1, 3, 3, 3, 5, 6, 6]
Using filter function
[1, 3, 3, 3, 5, 6, 6]
[1, 3, 3, 5, 5, 6]

When we convert a list to a set, duplicate items are removed. For instance, if myList = [1, 1, 1, 2, 3], set(myList) gives us the set {1, 2, 3}.

Therefore, in the first example above, when we convert list1 and list2 to sets, and use the & operator or the intersection() method to find the intersection of the two sets, the resulting set does not contain any duplicate items.

Hence, we get [1, 3, 5, 6] as the output for both intersect_1  and intersect_2.

Next, for the second example, notice that we get different answers for intersect_3 and intersect_4.

This is because for intersect_3, we iterate through list1 to check if the item is also in list2. On the other hand, for intersect_4, we iterate through list2 to check if the item is also in list1.

This gives us different results as the result for intersect_3 is based on list1. Hence, the number 3 appears twice in intersect_3 as there are two 3s in list1. Similarly, the number 5 appears twice as there are two 5s in list1.

In contrast, the result for intersect_4 is based on list2. Hence, the number 3 appears thrice in intersect_4 while the number 5 appears once.

The same difference in answers can be found when we use the filter() function to find the intersection of list1 and list2 (refer to example 3).

If we do not want such discrepancy in our answers, we’ll need to write our own function.

This brings us to the practice question for today.

Practice Question

The practice question for today is to write a function called intersectWithDup() that accepts two lists as arguments. We can assume that the two lists only contain hashable items.

The function finds the intersection of the two lists, based on the number of times an item appears in both lists.

For instance, if we have the lists [1, 1, 2, 2, 2, 3] and [1, 1, 1, 2, 3, 3, 4, 4], our function should return [1, 1, 2, 3] as the result.

Although 1 appears three times in the second list, it only appears twice in the first. Hence, 1 appears twice in the intersection.

Similarly, although 2 appears three times in the first list, it only appears once in the second. Hence, 2 appears once in the intersection.

As you can see, the intersection is not based on the first or the second list. Instead, it is based on the number of times an item appears in both lists.

Expected Results

To test your function, you can run the following statements:

list1 = [1, 1, 2, 2, 2, 3]
list2 = [1, 1, 1, 2, 3, 3, 4, 4]
print(intersectWithDup(list1, list2))

list1 = [1, 2, 3, 4, 5, 5, 5]
list2 = [2, 2, 2, 2, 3, 4, 7, 7, 8]
print(intersectWithDup(list1, list2))

If you run the code above, you should get the following output:

[1, 1, 2, 3]
[2, 3, 4]

Hints

There is more than one way to complete the practice question for today. The suggested solution uses the Counter class and the extend() method for lists.


Suggested Solution

Here’s the suggested solution for today’s question:

from collections import Counter

def intersectWithDup(list1, list2):

    a = Counter(list1)
    b = Counter(list2)

    set1 = set(list1)
    set2 = set(list2)

    list3 = list(set1 & set2)
    list4 = []

    for i in list3:
        list4.extend([i]*min(a[i], b[i]))

    return list4

Here, we first import the Counter class on line 1.

Next, we define the intersectWithDup() function from lines 3 to 17.

Within the function, we pass list1 and list2 to the Counter() constructor and assign the resulting Counter objects to a and b.

A Counter object gives the number of times an item appears in an iterable.

For instance, Counter(['p', 'p', 'q']) gives us the Counter object Counter({'p': 2, 'q': 1}), as 'p' appears twice ('p': 2) in ['p', 'p', 'q'] while 'q' appears once ('q': 1).

If we assign this object to a variable, say myCounter, we can access the values in myCounter like how we access values in a dictionary.

For instance, myCounter['p'] gives us the value 2.
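The short sketch below shows this behaviour. It also shows one way a Counter differs from a plain dictionary: asking for an item that was never counted gives 0 instead of raising a KeyError. The key 'z' is made up for illustration.

```python
from collections import Counter

myCounter = Counter(['p', 'p', 'q'])

print(myCounter['p'])   # 2
print(myCounter['q'])   # 1
print(myCounter['z'])   # 0 (a missing item counts as zero)
print(dict(myCounter))  # {'p': 2, 'q': 1}
```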

In our suggested solution, a and b give us the frequency of each item in list1 and list2.

After we get this frequency, we are ready to find the intersection of the two lists. We first find the intersection without duplicates. To do that, we convert list1 and list2 to sets and use the & operator to get the intersection.

Next, we convert this intersection to a list and assign the resulting list to list3 (on line 11).

Next, we initialize an empty list called list4.

We then use a for loop to iterate through list3 (lines 14 and 15).

On line 15, a[i] and b[i] give the number of times i appears in list1 and list2 respectively.

For instance, suppose

list1 = [1, 1, 2, 3]
list2 = [1, 1, 1, 2, 2, 3]

list3 equals [1, 2, 3].

The first time the for loop runs, i equals 1.

a[1] gives us 2 while b[1] gives us 3.

min(a[1], b[1]) gives us the lower of the two numbers, which is 2.

[1]*min(a[1], b[1]) gives us [1]*2, which equals [1, 1].

We then pass this list to the extend() method to add it to list4.

We keep doing this until we finish iterating through all the items in list3.

When that happens, list4 contains all the duplicate items that we need for the intersection. Hence, we simply return list4 on line 17.

With that, the function is complete.
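As an alternative to the suggested solution (not a replacement for it), the Counter class supports the & operator directly: the intersection of two Counter objects keeps each common item with the smaller of its two counts. A minimal sketch, using the hypothetical function name intersect_with_dup_alt, could look like this:

```python
from collections import Counter

def intersect_with_dup_alt(list1, list2):
    # Counter & Counter keeps each common item with the smaller of its two counts.
    common = Counter(list1) & Counter(list2)
    # elements() repeats each item as many times as its count.
    return list(common.elements())

result1 = intersect_with_dup_alt([1, 1, 2, 2, 2, 3], [1, 1, 1, 2, 3, 3, 4, 4])
result2 = intersect_with_dup_alt([1, 2, 3, 4, 5, 5, 5], [2, 2, 2, 2, 3, 4, 7, 7, 8])

print(result1)  # [1, 1, 2, 3]
print(result2)  # [2, 3, 4]
```

One difference to keep in mind: the suggested solution iterates through a set, so the order of the items in its result is not guaranteed, whereas this sketch does not go through a set at all.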

Written by Jamie | Last Updated November 3, 2020
