Introduction

A set in Python is an unordered collection that is iterable, mutable, and contains no duplicate elements. The major advantage of a set over a list is that checking whether a specific element is contained in it is highly optimized, because sets are built on a data structure known as a hash table. The set type has the following characteristics:

  • Sets are unordered.
  • Set elements are unique, i.e. a set cannot contain duplicate elements.
  • A set itself may be modified, but its elements must be hashable, which in practice means immutable types such as numbers, strings, and tuples of immutable values.
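
As a minimal sketch of the membership test described above, the snippet below builds a set from a list and checks for elements with the in operator. The names allowed_list and allowed_set and the sample data are illustrative only.

```python
# Illustrative names; the data is made up for this sketch.
allowed_list = ["red", "green", "blue", "green"]
allowed_set = set(allowed_list)  # the duplicate "green" collapses

# 'in' works on both lists and sets, but a set looks the element up
# by its hash instead of scanning element by element as a list does.
has_green = "green" in allowed_set
has_pink = "pink" in allowed_set

print(has_green)         # True
print(has_pink)          # False
print(len(allowed_set))  # 3
```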

Creating a set

A set is created either by placing all the items (elements) inside curly braces {}, separated by commas, or by passing an iterable to the built-in function set(). It can have any number of items, and they may be of different types (integer, float, tuple, string, etc.).

# Set of integers
my_set = {1, 2, 3}

# Set of mixed datatypes
my_set = {1.0, "Hello", (1, 2, 3)}

# Sets do not have duplicates
my_set = {1,2,3,4,3,2}

# Output: {1, 2, 3, 4}
print(my_set)

# Set cannot have mutable items. [3, 4] is a mutable list
# This will cause an error. TypeError: unhashable type: 'list'
#my_set = {1, 2, [3, 4]}

# Set from a list
my_set = set([1,2,3,2])

Empty curly braces {} create an empty dictionary in Python, not an empty set. To make a set without any elements, call the set() function without any argument.

# Initialize a with {}
a = {}

# Output: <class 'dict'>
print(type(a))

# Initialize a with set()
a = set()

# Output: <class 'set'>
print(type(a))

Adding elements

Sets are mutable. But since they are unordered, indexing has no meaning: we cannot access or change an element of a set using indexing or slicing. We can add a single element using the add() method and multiple elements using the update() method, which accepts one or more iterables.

my_set = {1,3}

# Add an element
my_set.add(2)

# Output: {1, 2, 3}
print(my_set)

# Add multiple elements
my_set.update([2,3,4])

# Output: {1, 2, 3, 4}
print(my_set)

# Add list and set
my_set.update([4,5], {1,6,8})

# Output: {1, 2, 3, 4, 5, 6, 8}
print(my_set)
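
To illustrate the point that indexing has no meaning for sets, here is a small sketch of what happens when an index is used, along with one possible workaround (sorting into a list first). The variable names index_error and ordered are illustrative.

```python
my_set = {1, 2, 3}

index_error = None
try:
    my_set[0]  # sets do not support indexing
except TypeError as exc:
    index_error = str(exc)
    print(index_error)

# If a defined order is needed, build a sorted list from the set first.
ordered = sorted(my_set)
print(ordered[0])  # 1
```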

Removing elements

A particular item can be removed from a set using the methods discard() and remove(). The only difference between the two is that discard() leaves the set unchanged if the item does not exist in it, whereas remove() raises a KeyError in that case. Similarly, we can remove and return an arbitrary item using the pop() method. We can also remove all items from a set using clear().

my_set = {1, 3, 4, 5, 6}

# Discard an element
my_set.discard(4)

# Output: {1, 3, 5, 6}
print(my_set)

# remove an element
my_set.remove(6)

# Output: {1, 3, 5}
print(my_set)

# Discard an element not present in my_set
my_set.discard(2)

# Output: {1, 3, 5}
print(my_set)

# Output: an arbitrary element (which one is not guaranteed)
print(my_set.pop())

# clear my_set
my_set.clear()
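
The difference between discard() and remove() on a missing element can be sketched as follows. The flag names remove_failed and pop_failed are illustrative; the sketch also shows that pop() on an empty set raises KeyError.

```python
my_set = {1, 3, 5}

# discard() on a missing element does nothing.
my_set.discard(2)
print(my_set == {1, 3, 5})  # True

# remove() on a missing element raises KeyError.
remove_failed = False
try:
    my_set.remove(2)
except KeyError:
    remove_failed = True
print(remove_failed)  # True

# pop() on an empty set also raises KeyError.
pop_failed = False
try:
    set().pop()
except KeyError:
    pop_failed = True
print(pop_failed)  # True
```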

Operating on sets

Sets can be used to carry out mathematical set operations like union, intersection, difference and symmetric difference. We can do this with operators or methods.

  • Union of A and B is the set of all elements from both sets. Union is performed using the | operator. The same can be accomplished using the method union().
  • Intersection of A and B is the set of elements that are common to both sets. Intersection is performed using the & operator. The same can be accomplished using the method intersection().
  • Difference of A and B (A - B) is the set of elements that are in A but not in B. Difference is performed using the - operator. The same can be accomplished using the method difference().
  • Symmetric difference of A and B is the set of elements that are in either A or B, but not in both. Symmetric difference is performed using the ^ operator. The same can be accomplished using the method symmetric_difference().

A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7, 8}

# Union: method form, then operator form
# Output (both): {1, 2, 3, 4, 5, 6, 7, 8}
print(A.union(B))
print(A | B)

# Intersection: method form, then operator form
# Output (both): {4, 5}
print(A.intersection(B))
print(A & B)

# Difference: method form, then operator form
# Output (both): {1, 2, 3}
print(A.difference(B))
print(A - B)

# Symmetric difference: method form, then operator form
# Output (both): {1, 2, 3, 6, 7, 8}
print(A.symmetric_difference(B))
print(A ^ B)
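
One practical difference between the method and operator forms is worth a short sketch: the methods accept any iterable as an argument, while the operators require both operands to be sets. The names union_with_list and operator_rejected_list are illustrative.

```python
A = {1, 2, 3, 4, 5}

# The method form accepts any iterable, such as a list.
union_with_list = A.union([4, 5, 6])
print(union_with_list == {1, 2, 3, 4, 5, 6})  # True

# The operator form requires both operands to be sets,
# so combining a set with a list raises TypeError.
operator_rejected_list = False
try:
    A | [4, 5, 6]
except TypeError:
    operator_rejected_list = True
print(operator_rejected_list)  # True
```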

Frozenset

frozenset is a built-in class that has the characteristics of a set, except that elements cannot be added or removed once it is created. Just as tuples are immutable lists, frozensets are immutable sets. A set is unhashable, so it cannot be used as a dictionary key. Frozensets, on the other hand, are hashable and can be used as keys in a dictionary. Frozensets are created using the function frozenset().

# Initialize A and B
A = frozenset([1, 2, 3, 4])
B = frozenset([3, 4, 5, 6])
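
Continuing with the same A and B, the following sketch shows three things stated above: set operations still work on frozensets, a frozenset can serve as a dictionary key, and mutating methods such as add() are not available. The names common, labels, and add_missing are illustrative.

```python
A = frozenset([1, 2, 3, 4])
B = frozenset([3, 4, 5, 6])

# Set operations work and return new frozensets.
common = A & B
print(common == frozenset({3, 4}))  # True

# A frozenset is hashable, so it can be used as a dictionary key.
labels = {A: "first", B: "second"}
print(labels[frozenset([4, 3, 2, 1])])  # first

# Mutating methods such as add() do not exist on frozenset.
add_missing = False
try:
    A.add(5)
except AttributeError:
    add_missing = True
print(add_missing)  # True
```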