Set

Understand Python sets - unordered collections of unique elements with powerful set operations

Overview

Sets are unordered collections of unique, hashable elements. They are mutable (elements can be added and removed after creation) and are particularly useful for membership testing, removing duplicates, and mathematical operations such as union and intersection.
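
As a quick illustration of these properties, the sketch below shows duplicates collapsing, a set being modified in place, and an unhashable value being rejected. The variable names are illustrative only and do not come from any library.

```python
# Duplicates collapse: a set keeps one copy of each value
tags = {'python', 'sets', 'python'}
tag_count = len(tags)  # 2

# Sets are mutable: elements can be added after creation
tags.add('tutorial')

# Elements must be hashable, so a list cannot be stored in a set
try:
    invalid = {[1, 2], [3, 4]}
    list_allowed = True
except TypeError:
    list_allowed = False  # lists are unhashable
```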

Creating Sets

PYTHON
# Empty set (note: {} creates an empty dict, not a set)
empty_set = set()

# Set with initial values
numbers = {1, 2, 3, 4, 5}
fruits = {'apple', 'banana', 'cherry'}

# From list (removes duplicates)
my_list = [1, 2, 2, 3, 3, 4]
unique_numbers = set(my_list)  # {1, 2, 3, 4}

# Set comprehension
squares = {x**2 for x in range(10)}

# From string (creates set of characters)
chars = set('hello')  # {'h', 'e', 'l', 'o'}
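
The empty-braces pitfall mentioned above is easy to check directly. A minimal sketch, using illustrative variable names:

```python
empty_braces = {}      # this is a dict
empty_set = set()      # this is a set

braces_type = type(empty_braces).__name__   # 'dict'
set_type = type(empty_set).__name__         # 'set'

# Braces that contain bare values (no key: value pairs) do create a set
singleton = {42}
singleton_type = type(singleton).__name__   # 'set'
```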

Basic Operations

Adding Elements

PYTHON
fruits = {'apple', 'banana'}

# Add single element
fruits.add('cherry')  # {'apple', 'banana', 'cherry'}

# Add multiple elements
fruits.update(['orange', 'grape'])
fruits.update({'kiwi', 'mango'})
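
One detail worth keeping in mind: update() iterates over its argument, so passing a bare string adds its characters rather than the string itself. The sketch below contrasts this with add(); the names are illustrative.

```python
fruits = {'apple', 'banana'}

# add() inserts the string as one element
fruits.add('kiwi')

# update() iterates over its argument, so a bare string adds each character
letters = set()
letters.update('kiwi')  # {'k', 'i', 'w'}

# Wrap the string in a list to add it as a single element via update()
fruits.update(['mango'])
```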

Removing Elements

PYTHON
fruits = {'apple', 'banana', 'cherry', 'orange'}

# Remove element (raises KeyError if not found)
fruits.remove('banana')

# Remove element (no error if not found)
fruits.discard('grape')

# Remove and return arbitrary element
item = fruits.pop()

# Clear all elements
fruits.clear()
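
The difference between remove() and discard() matters mostly when the element might be absent. A small sketch of both behaviours, plus a guard for pop() on an empty set (which also raises KeyError):

```python
fruits = {'apple', 'banana', 'cherry'}

# remove() raises KeyError for a missing element
try:
    fruits.remove('grape')
    removed = True
except KeyError:
    removed = False

# discard() does nothing when the element is missing
fruits.discard('grape')

# pop() raises KeyError on an empty set, so check for emptiness first
empty = set()
popped = empty.pop() if empty else None
```

Use remove() when a missing element indicates a bug, and discard() when absence is an expected case.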

Set Operations

Mathematical Operations

PYTHON
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}

# Union (all unique elements from both sets)
union = A | B  # {1, 2, 3, 4, 5, 6}
union = A.union(B)

# Intersection (common elements)
intersection = A & B  # {3, 4}
intersection = A.intersection(B)

# Difference (elements in A but not in B)
difference = A - B  # {1, 2}
difference = A.difference(B)

# Symmetric difference (elements in either A or B, but not both)
sym_diff = A ^ B  # {1, 2, 5, 6}
sym_diff = A.symmetric_difference(B)
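
The operator and method forms are not fully interchangeable: the methods accept any iterable (and several of them at once), while the operators require both operands to be sets. A brief sketch, with illustrative names:

```python
A = {1, 2, 3, 4}
evens = [2, 4, 6]  # a list, not a set

# Methods accept any iterable
from_method = A.intersection(evens)         # {2, 4}

# Methods also accept several iterables in one call
multi_union = A.union(evens, range(8, 10))  # {1, 2, 3, 4, 6, 8, 9}

# Operators require sets on both sides
try:
    A & evens
    operator_ok = True
except TypeError:
    operator_ok = False
```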

Set Comparisons

PYTHON
A = {1, 2, 3}
B = {1, 2, 3, 4, 5}
C = {1, 2, 3}

# Subset
A.issubset(B)  # True (A ⊆ B)
A <= B  # True

# Superset
B.issuperset(A)  # True (B ⊇ A)
B >= A  # True

# Equality
A == C  # True

# Disjoint (no common elements)
X = {1, 2}
Y = {3, 4}
X.isdisjoint(Y)  # True
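
The <= and >= operators treat equal sets as subsets and supersets of each other. When equality should be excluded, the strict operators < and > test for a proper subset or superset, as this short sketch shows:

```python
A = {1, 2, 3}
B = {1, 2, 3, 4, 5}
C = {1, 2, 3}

subset_of_equal = A <= C         # True: equal sets are subsets of each other
proper_subset_of_equal = A < C   # False: a proper subset must be smaller
proper_subset = A < B            # True
proper_superset = B > A          # True
```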

Frozen Sets

A frozenset is an immutable, hashable version of a set, so it can be used as a dictionary key or as an element of another set.

PYTHON
# Create frozen set
frozen = frozenset([1, 2, 3, 4])

# Can be used as dictionary key
my_dict = {frozen: 'value'}

# Can be element of another set
set_of_sets = {frozenset([1, 2]), frozenset([3, 4])}

# Operations work the same (but return new frozen sets)
fs1 = frozenset([1, 2, 3])
fs2 = frozenset([3, 4, 5])
union = fs1 | fs2  # frozenset({1, 2, 3, 4, 5})
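
To see why frozenset is needed here, the sketch below tries the same thing with a regular set and then shows that a frozenset exposes no mutating methods. The names are illustrative.

```python
# A regular set is unhashable, so it cannot be a dictionary key
try:
    bad_dict = {{1, 2}: 'value'}
    set_as_key_ok = True
except TypeError:
    set_as_key_ok = False

# Equal frozensets hash the same, regardless of construction order
lookup = {frozenset([1, 2]): 'value'}
found = lookup[frozenset([2, 1])]  # 'value'

# frozenset has no mutating methods such as add()
has_add = hasattr(frozenset(), 'add')  # False
```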

Common Patterns

Removing Duplicates

PYTHON
numbers = [1, 2, 2, 3, 3, 3, 4, 4, 5]
unique = list(set(numbers))  # order is arbitrary; use sorted(set(numbers)) for [1, 2, 3, 4, 5]

# Preserve order (Python 3.7+)
unique = list(dict.fromkeys(numbers))
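
An alternative way to preserve order is to track already-seen values in a set while building the result list. The helper below, unique_in_order, is a hypothetical name used for illustration; it assumes the items are hashable.

```python
def unique_in_order(items):
    """Return the items with duplicates removed, keeping first occurrences."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:  # O(1) average-case membership test
            seen.add(item)
            result.append(item)
    return result


numbers = [3, 1, 3, 2, 1, 5]
deduplicated = unique_in_order(numbers)  # [3, 1, 2, 5]
```

This form is handy when the loop also needs to do other work per item; for plain deduplication, dict.fromkeys is shorter.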

Membership Testing

PYTHON
valid_users = {'alice', 'bob', 'charlie'}

# Fast membership check
if 'alice' in valid_users:
    print("Valid user")

# Check that every user in a group is valid
users_to_check = {'alice', 'david'}
valid = users_to_check.issubset(valid_users)  # False ('david' is not valid)
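
When a group check fails, it is often useful to know which members failed or whether any passed. A minimal sketch using set difference and isdisjoint(), with illustrative variable names:

```python
valid_users = {'alice', 'bob', 'charlie'}
users_to_check = {'alice', 'david'}

# Are all of them valid?
all_valid = users_to_check <= valid_users               # False

# Which ones are not valid?
unknown_users = users_to_check - valid_users            # {'david'}

# Is at least one of them valid?
any_valid = not users_to_check.isdisjoint(valid_users)  # True
```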

Finding Common Elements

PYTHON
# Common skills between employees
employee1_skills = {'Python', 'SQL', 'Git'}
employee2_skills = {'Python', 'Docker', 'Git'}
common_skills = employee1_skills & employee2_skills  # {'Python', 'Git'}
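
The same idea extends to more than two sets. The sketch below assumes a hypothetical list of skill sets, one per employee, and unpacks it into set.intersection and set().union. Note that set.intersection(*team_skills) raises TypeError if the list is empty.

```python
# Hypothetical data: one set of skills per employee
team_skills = [
    {'Python', 'SQL', 'Git'},
    {'Python', 'Docker', 'Git'},
    {'Python', 'Git', 'Rust'},
]

# Skills that every employee has
shared_by_all = set.intersection(*team_skills)  # {'Python', 'Git'}

# Skills that at least one employee has
all_skills = set().union(*team_skills)
```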

Performance Characteristics

  • Add: O(1) average case
  • Remove: O(1) average case
  • Membership test: O(1) average case
  • Union (s | t): O(len(s) + len(t))
  • Intersection (s & t): O(min(len(s), len(t))) average case
  • Difference (s - t): O(len(s))
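
The membership-test figure can be checked informally with timeit. This is a rough sketch rather than a rigorous benchmark: the size n and the repeat count are arbitrary choices, and absolute timings depend on the machine.

```python
import timeit

n = 100_000
as_list = list(range(n))
as_set = set(as_list)
missing = -1  # worst case for the list: every element is scanned

# Time 100 membership tests against each container
list_time = timeit.timeit(lambda: missing in as_list, number=100)
set_time = timeit.timeit(lambda: missing in as_set, number=100)

print(f"list: {list_time:.6f}s, set: {set_time:.6f}s")
```

The list lookup scans all n elements, while the set lookup is a single hash probe on average, so the gap widens as n grows.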

Set Methods Summary

PYTHON
s = {1, 2, 3}
other = {3, 4, 5}         # second set used by the methods below

# Modifying methods
s.add(4)                  # Add single element
s.update([5, 6])          # Add multiple elements
s.remove(2)               # Remove (raises KeyError if not found)
s.discard(7)              # Remove (no error if not found)
s.pop()                   # Remove and return arbitrary element
s.clear()                 # Remove all elements

# Non-modifying methods
s.copy()                  # Shallow copy
s.union(other)            # Union
s.intersection(other)     # Intersection
s.difference(other)       # Difference
s.symmetric_difference(other)  # Symmetric difference

# Comparison methods
s.issubset(other)         # Check if subset
s.issuperset(other)       # Check if superset
s.isdisjoint(other)       # Check if no common elements

Best Practices

1. Use sets for membership testing instead of lists
2. Use sets to remove duplicates from sequences
3. Leverage set operations for data analysis
4. Use frozen sets when you need immutable sets
5. Remember sets are unordered - don't rely on element order

Common Use Cases

  • Removing duplicates from data
  • Fast membership testing
  • Finding unique elements
  • Mathematical set operations
  • Graph algorithms (visited nodes)
  • Permission and role management
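
As an illustration of the graph use case, the sketch below runs a breadth-first search that records visited nodes in a set. The function name reachable and the adjacency-dict graph format are assumptions made for this example.

```python
from collections import deque


def reachable(graph, start):
    """Return the set of nodes reachable from start (breadth-first search)."""
    visited = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in visited:  # O(1) average-case check
                visited.add(neighbor)
                queue.append(neighbor)
    return visited


# Hypothetical directed graph as an adjacency dict
graph = {
    'a': ['b', 'c'],
    'b': ['d'],
    'c': ['d'],
    'd': [],
    'e': ['a'],
}
nodes_from_a = reachable(graph, 'a')  # {'a', 'b', 'c', 'd'}
```

Using a set for visited keeps each "already seen?" check cheap and prevents node 'd' from being queued twice.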