Dictionaries & Sets

Lesson 7 · CS 111: Programming I — Python · OKSTEM College

Dictionaries

A dictionary stores key-value pairs. Keys must be unique and immutable (strings, numbers, tuples). Values can be anything.

student = { "name": "Alice", "age": 20, "gpa": 3.8 } # Access student["name"] # "Alice" student.get("grade", "N/A") # "N/A" - safe default # Add / update student["major"] = "CS" # adds new key student["age"] = 21 # updates existing # Delete del student["gpa"] popped = student.pop("major") # Iteration for key in student: print(key, student[key]) for key, value in student.items(): print(f"{key}: {value}") # Useful methods student.keys() # dict_keys view student.values() # dict_values view "name" in student # True - checks keys

Sets

A set is an unordered collection of unique items. Duplicates are automatically removed.

a = {1, 2, 3, 4} b = {3, 4, 5, 6} a | b # {1,2,3,4,5,6} union a & b # {3,4} intersection a - b # {1,2} difference (in a, not b) a ^ b # {1,2,5,6} symmetric difference # Membership is O(1) - much faster than list search 3 in a # True # Remove duplicates from a list instantly nums = [1, 2, 2, 3, 3, 3] unique = list(set(nums)) # [1, 2, 3]

When to use what: List for ordered data you'll iterate. Dict for labeled data (key-value). Set for fast membership tests or eliminating duplicates. Tuple for fixed, immutable data.

Worked Example - Word frequency counter

Start with a sentence string:
text = "the cat sat on the mat the cat"

Split into words: words = text.split() gives a list of words.

Count with a dict:

freq = {} for word in words: freq[word] = freq.get(word, 0) + 1

freq.get(word, 0) returns the current count or 0 if the word isn't in the dict yet.

Result: {"the": 3, "cat": 2, "sat": 1, "on": 1, "mat": 1}

Find most common: max(freq, key=freq.get) returns "the".

Practice Problems

Problem 1 - Phone book lookup

Create a dict mapping names to phone numbers. Write code to look up a name and print the number, or "Not found" if the name isn't in the book.

phonebook = { "Alice": "555-1234", "Bob": "555-5678", "Carol": "555-9012" } name = input("Look up: ") print(phonebook.get(name, "Not found"))

.get(key, default) is safer than [key] because it returns the default instead of raising KeyError when the key doesn't exist.

Problem 2 - Find common elements

Given two lists, find all elements that appear in both. Use sets for O(1) lookup.

a = [1, 2, 3, 4, 5] b = [3, 4, 5, 6, 7] common = list(set(a) & set(b)) print(common) # [3, 4, 5]

Set intersection (&) is O(min(len(a), len(b))) vs the naive O(n*m) of nested loops. For large lists this is dramatically faster.

Knowledge Check

What does d.get('key', 'default') do if 'key' is not in d?

That is what d['key'] does, not d.get().

Only if you call d.get('key') with no default.

Correct - .get() never raises KeyError.

get() does not modify the dict. Use setdefault() for that.

Quick Recap

d['key'] raises KeyError for missing keys. d.get('key', 'default') safely returns the default value instead.

Quick Recap

d.get(key) with no default returns None. d.get(key, 'default') returns 'default' when key is missing.

Quick Recap

.get() is read-only. To create a key with a default if missing, use d.setdefault('key', 'default').

Why are sets faster than lists for membership testing (x in s)?

Sets have no order or sorting.

Correct.

Sets don't inherently store less data.

Both lists and sets are interpreted the same way.

Quick Recap

Sets are unordered. Their speed comes from hashing, not sorting.

Quick Recap

A set may use more memory than a list due to hash table overhead. The advantage is O(1) lookup time.

Quick Recap

Speed comes from the data structure (hash table), not compilation. List search is O(n); set search is O(1).

What does {1, 2, 3} & {2, 3, 4} return?

That is the union (|), not intersection (&).

Correct - & is intersection, elements in both sets.

That is the symmetric difference (^).

The sets share elements 2 and 3.

Quick Recap

| is union (all elements from both). & is intersection (only shared elements). {1,2,3} & {2,3,4} = {2,3}.

Quick Recap

^ gives elements in one set or the other but not both. & gives elements in both sets.

Quick Recap

{} would mean no common elements. But 2 and 3 appear in both sets, so intersection = {2,3}.

Which collection maintains insertion order and allows duplicates?

Sets are unordered and have no duplicates.

Dicts maintain insertion order (Python 3.7+) but keys must be unique.

Correct - lists are ordered and allow duplicates.

Tuples maintain order and allow duplicates, but are immutable.

Quick Recap

Sets: unordered, unique elements, O(1) lookup. Lists: ordered, allow duplicates.

Quick Recap

Dict keys are unique, but values can repeat. Lists allow duplicate elements.

Quick Recap

Tuples do maintain order and allow duplicates, but they're immutable. Lists are the mutable ordered sequence.

What is the output of len({1, 1, 2, 2, 3})?

Sets automatically remove duplicates.

Correct - sets store only unique values, so 3 elements.

2 would be the count of duplicates removed, not the set size.

This is valid Python.

Quick Recap

The set {1,1,2,2,3} stores only unique values: {1,2,3}. len({1,2,3}) = 3.

Quick Recap

The set ends up with 3 unique elements: 1, 2, 3. len = 3.

Quick Recap

Sets of integers are valid. The set literal removes duplicates automatically. len returns 3.