Sets
Sets are another mutable type. Python sets are unordered collections of objects. Like the sets you’ve seen in your mathematics course, each element in a Python set must be distinct. Like dictionary keys, the elements of a Python set must be hashable.
We write set literals with curly braces (like dictionaries), but without the colon, since sets aren’t composed of key/value pairs.
>>> s1 = {1, 2, 3}
>>> s2 = {'red', 'green', 'blue'}
Just like lists and tuples (and other types), there’s a set constructor, set(), which takes some iterable and returns a set constructed from the iterable.
>>> set([1, 2, 3])
{1, 2, 3}
>>> set(('red', 'green', 'blue')) # notice order isn't preserved
{'red', 'blue', 'green'}
>>> set(range(1, 6))
{1, 2, 3, 4, 5}
>>> set({'a': 'b', 'c': 'd'}) # iterates over keys by default
{'a', 'c'}
Sets have abundant applications, not the least of which is removing duplicates from a list.
>>> list_with_duplicates = [1, 1, 2, 2, 5, 3, 3, 4]
>>> list(set(list_with_duplicates))
[1, 2, 3, 4, 5]
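Note that round-tripping through a set makes no promise about keeping the list's original order. If order matters, one common alternative (a sketch, not the only approach) relies on the fact that dictionaries preserve insertion order in Python 3.7 and later:

```python
list_with_duplicates = [1, 1, 2, 2, 5, 3, 3, 4]

# dict.fromkeys() keeps the first occurrence of each key, in order
# (dictionaries preserve insertion order in Python 3.7+).
deduplicated = list(dict.fromkeys(list_with_duplicates))
print(deduplicated)  # [1, 2, 5, 3, 4]
```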
If you try to construct a set with any iterable containing duplicates, or with a set literal containing any duplicates, the duplicates will be removed.
>>> {1, 2, 2, 3}
{1, 2, 3}
>>> set((0, 0, 0, 0))
{0}
As noted, the elements of a set must be hashable, so these will fail:
>>> {1, 2, [3, 4]}
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'
>>> {{1: 2}}
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'dict'
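By contrast, immutable built-in values such as numbers, strings, and tuples of hashable values work fine as set elements. A small sketch (the names points and row are just for illustration):

```python
# Tuples of hashable values are themselves hashable.
points = {(0, 0), (1, 2), (1, 2)}
print(len(points))  # 2 -- the duplicate tuple is dropped

# One way to store list data in a set is to convert it to a tuple first.
row = [3, 4]
points.add(tuple(row))
print((3, 4) in points)  # True
```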
You can create an empty set with the set constructor and no arguments: set(). You cannot create an empty set with {}; this creates an empty dictionary.
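If you are ever unsure which type a literal produces, asking type() is a quick check. A minimal illustration:

```python
empty_dict = {}
empty_set = set()

print(type(empty_dict))  # <class 'dict'>
print(type(empty_set))   # <class 'set'>
print(len(empty_set))    # 0
```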
Python’s built-in len() will give you the number of elements in a set, and the keyword in can be used to test membership.
>>> s = {'apple', 'bear', 'candy', 'duckling'}
>>> len(s)
4
>>> 'apple' in s
True
>>> 'wombat' in s
False
Mutating sets
We can mutate sets with the set methods .add(), .remove(), .discard(), and .pop() (there are other set methods that mutate sets, but these will suffice for now).
>>> s = {'apple', 'bear', 'candy', 'duckling'}
>>> s.add('echidna')
>>> s
{'candy', 'duckling', 'echidna', 'apple', 'bear'}
If we try to add an element that’s already in the set, nothing happens.
>>> s = {'apple', 'bear', 'candy', 'duckling'}
>>> s.add('echidna')
>>> s
{'candy', 'duckling', 'echidna', 'apple', 'bear'}
>>> s.add('echidna')
>>> s
{'candy', 'duckling', 'echidna', 'apple', 'bear'}
This does not cause an error; the set is simply left unchanged.
We can use .remove() to remove an element from a set if it exists. If the element does not exist in the set, .remove() raises a KeyError. If we wish to discard an element if it exists, and do nothing if it does not, we can use .discard().
>>> s.remove('candy')
>>> s
{'duckling', 'echidna', 'apple', 'bear'}
>>> s.remove('wombat')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyError: 'wombat'
>>> s.discard('wombat')
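Which of the two to use depends on whether a missing element is worth knowing about. The sketch below shows both styles side by side; the flag name removed is only for illustration.

```python
s = {'apple', 'bear', 'candy', 'duckling'}

# Option 1: .discard() never raises, whether or not the element is present.
s.discard('wombat')

# Option 2: .remove() with explicit handling, for when a missing element
# is something we want to detect.
try:
    s.remove('wombat')
    removed = True
except KeyError:
    removed = False

print(removed)  # False
print(len(s))   # 4
```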
Because sets are unordered, .pop() removes and returns an arbitrary element from the set.
>>> s = {'apple', 'bear', 'candy', 'duckling'}
>>> s.pop()
'bear'
>>> s.pop()
'candy'
As with dictionaries, we cannot pop from an empty set.
>>> s = set()
>>> s.pop()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyError: 'pop from an empty set'
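One way to avoid this error when emptying a set is to test the set before each pop. The sketch below relies on the fact that an empty set is falsy; the list popped is just there to record what came out.

```python
s = {'apple', 'bear', 'candy', 'duckling'}
popped = []

# An empty set is falsy, so the loop stops before .pop() can raise KeyError.
while s:
    popped.append(s.pop())

print(sorted(popped))  # ['apple', 'bear', 'candy', 'duckling']
print(s)               # set()
```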
Sets are unordered. Can we iterate over a set?
Yes, we can, but we don’t always get elements in the order we might expect.
for element in some_set: # this is just fine
    print(element)
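If a predictable order is needed, one option is to iterate over a sorted copy of the set. This is a sketch that assumes the elements can be compared with one another (here, all strings):

```python
some_set = {'candy', 'apple', 'duckling', 'bear'}

# sorted() accepts any iterable and returns a new list in ascending order;
# the set itself is left unchanged.
for element in sorted(some_set):
    print(element)
```

This prints apple, bear, candy, duckling, one per line, regardless of how the set happens to be stored internally.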
Comprehension check
Is the order of elements always preserved when constructing a set?
Can a tuple containing a list of strings be an element within a set?
Write a set literal listing your three favorite movie titles.
Set operations
Now here’s where sets get interesting. Python supports basic set operations such as subset, union, intersection, difference, and others.
In mathematics, set A is a subset of set B if all the elements of A are also elements of B. We write this A \subseteq B. We write A \subsetneq B when we mean A is a strict subset of B: all the elements of A are also elements of B, but there’s at least one element in B not in A. (You may have seen the notation A \subset B. Alas, this is ambiguous and usage varies by author. Sometimes it is equivalent to \subseteq, sometimes to \subsetneq. I’m going to stick with the unambiguous forms.)
The union of two sets, A and B, is the set containing all elements of A and all elements of B. We write this A \cup B.
The intersection of two sets, A and B, is the set containing only those elements which are in both A and B. We write this A \cap B.
The difference of two sets, A and B, written A \setminus B is all the elements in A that are not also in B.
.issubset() is used to determine if one set is a subset of another. .union() takes the union of two sets. .intersection() takes the intersection of two sets. .difference() takes the difference of two sets.
>>> s1 = {'red', 'green', 'blue'}
>>> s2 = {'cyan', 'blue', 'magenta', 'yellow'}
>>> s3 = {'red', 'green'}
>>> s1.issubset(s2) # is s1 a subset of s2?
False
>>> s3.issubset(s1) # is s3 a subset of s1?
True
>>> s1.union(s2) # take the union of s1 and s2
{'green', 'magenta', 'cyan', 'red', 'yellow', 'blue'}
>>> s2.union(s1) # same
{'green', 'magenta', 'cyan', 'red', 'yellow', 'blue'}
>>> s1.intersection(s2) # take the intersection of s1 and s2
{'blue'}
>>> s2.intersection(s3) # this gets us the empty set
set()
>>> s1.difference(s2) # set difference: s1 \ s2
{'red', 'green'}
Notice that in the case of .issubset() and .difference(), order is significant, but in the case of .union() and .intersection(), it is not.
>>> s1 = {'red', 'green', 'blue'}
>>> s2 = {'cyan', 'blue', 'magenta', 'yellow'}
>>> s3 = {'red', 'green'}
>>> s1.issubset(s3)
False
>>> s3.issubset(s1)
True
>>> s1.difference(s3)
{'blue'}
>>> s3.difference(s1)
set()
>>> s1.union(s2)
{'green', 'magenta', 'cyan', 'red', 'yellow', 'blue'}
>>> s2.union(s1)
{'green', 'magenta', 'cyan', 'red', 'yellow', 'blue'}
>>> s1.intersection(s2)
{'blue'}
>>> s2.intersection(s1)
{'blue'}
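Python also provides operator spellings for these same operations on sets, which some readers find closer to the mathematical notation. The following sketch pairs each operator with the method used above; the variable names on the left are only for illustration.

```python
s1 = {'red', 'green', 'blue'}
s2 = {'cyan', 'blue', 'magenta', 'yellow'}
s3 = {'red', 'green'}

is_subset = s3 <= s1          # same as s3.issubset(s1)
is_strict_subset = s3 < s1    # subset, and not equal
union = s1 | s2               # same as s1.union(s2)
common = s1 & s2              # same as s1.intersection(s2)
only_in_s1 = s1 - s2          # same as s1.difference(s2)

print(is_subset, is_strict_subset)  # True True
print(s1 < s1)                      # False -- a set is not a strict subset of itself
print(common)                       # {'blue'}
```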
We can test for equivalence of two sets with ==, just like you’d expect. Notice that unlike comparing lists and tuples, when comparing sets the order of elements is irrelevant. This is because lists and tuples are ordered, but sets are not.
>>> {1, 2, 3} == {3, 2, 1}
True
All that matters is whether the two sets contain the same elements, regardless of order.
The empty set is a subset of every set!
In mathematics, the empty set is a subset of any set. Why? Because all the elements in the empty set are elements of any set. Why? Because there aren’t any. We call this vacuously true. The same holds true in Python—an empty set is a subset of any set.
>>> s1 = {'red', 'green', 'blue'}
>>> s2 = {'cyan', 'blue', 'magenta', 'yellow'}
>>> s3 = {'red', 'green'}
>>> empty_set = set()
>>> empty_set.issubset(s1)
True
>>> empty_set.issubset(s2)
True
>>> empty_set.issubset(s3)
True
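The same vacuous truth shows up if we spell out the subset test by hand. The sketch below uses Python's built-in all(), which returns True for an empty iterable because there is no element that could serve as a counterexample:

```python
s1 = {'red', 'green', 'blue'}
empty_set = set()

# "Every element of empty_set is in s1" -- there are no elements to check.
vacuous = all(x in s1 for x in empty_set)
print(vacuous)  # True

# With a non-empty set, a single counterexample makes the test fail.
not_subset = all(x in s1 for x in {'red', 'pink'})
print(not_subset)  # False
```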
Differences between Python sets and sets in mathematics
In mathematics, there is exactly one empty set—the empty set, notated \emptyset. In Python, we can create multiple empty sets.
>>> s1 = set()
>>> s2 = set()
>>> s1 == s2 # equivalent---they have the same elements
True
>>> s1 is s2 # but not the same object
False
The big difference between Python sets and sets in mathematics is that Python sets must be finite. Unlike mathematical sets, they cannot have an infinite number of elements. For example, the set of all natural numbers, \mathbb{N}, is infinite, as are the sets of all integers (\mathbb{Z}), rational numbers (\mathbb{Q}), and real numbers (\mathbb{R}). For reasons I hope are obvious, we cannot have a set which contains an infinite number of elements in Python (is your computer infinite? No.)
We can represent infinite sets in Python, but not as concrete sets as we’ve shown here. Representing infinite sets on a finite machine is interesting, but it’s outside the scope of this book.
Comprehension check
Another way of thinking of set equivalence is that two sets A and B are equivalent if A \subseteq B and B \subseteq A. Use this fact to write a Python expression which compares two sets for equivalence without using == or !=.
The symmetric difference of two sets A and B consists of all the elements in A or B but not in both. Python provides a method for taking the symmetric difference of two sets, but you don’t strictly need it. Write a Python expression that evaluates to the symmetric difference of two sets.
Some practical applications of sets
Let’s say we wanted to find the favorite songs two people have in common. Assume we have two sets of song titles: Bibi’s 100 favorite songs, and Liv’s 100 favorite songs. To find what they have in common, we could construct a third set in a loop:
common_favorites = set()
for s1 in bibis_favorites:
    for s2 in livs_favorites:
        if s1 == s2:
            common_favorites.add(s1)
or perhaps a little better:
common_favorites = []
for s1 in bibis_favorites:
    if s1 in livs_favorites:
        common_favorites.append(s1)
Here’s a better way with sets:
common_favorites = bibis_favorites.intersection(livs_favorites)
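To see the set-based version run end to end, here is a small sketch. The song titles are placeholders invented for illustration, and the sets are much smaller than 100 titles each.

```python
# Placeholder titles, invented for this example.
bibis_favorites = {'Song A', 'Song B', 'Song C', 'Song D'}
livs_favorites = {'Song C', 'Song D', 'Song E'}

# One method call replaces the loops entirely.
common_favorites = bibis_favorites.intersection(livs_favorites)
print(sorted(common_favorites))  # ['Song C', 'Song D']
```

Besides being shorter, this avoids comparing every title in one collection against every title in the other, as the nested-loop version does.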
Or say you had a card game and your program needed to know which ranks don’t appear in either player’s hand. Assume ingrid is the set of all ranks in Ingrid’s hand, and karen is the set of all ranks in Karen’s hand.
all_ranks = set("A123456789JQK")
not_in_either = all_ranks.difference(ingrid.union(karen))
Copyright © 2023–2025 Clayton Cafiero
No generative AI was used in producing this material. This was written the old-fashioned way.