Lesson 10: Regular Expressions

⏱ ~35 min Lesson 10 of 14 💚 Free

Regular expressions (regex) are patterns for searching and manipulating text. Common uses include validating emails, extracting data from web pages, parsing log files, and finding patterns in DNA sequences.

Key Concepts

Basic Patterns

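. = any character except a newline
^ = start of the string
$ = end of the string
* = zero or more of the preceding item
+ = one or more of the preceding item
? = zero or one of the preceding item
[abc] = a, b, or c
\d = digit
\w = word character (letter, digit, or underscore)
\s = whitespace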
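A minimal sketch of these metacharacters in action, using Python's re module. The sample strings and variable names are made up for illustration.

```python
import re

# . stands for any single character (except a newline)
dot = re.findall(r"c.t", "cat cot ct")              # ['cat', 'cot']

# ^ and $ anchor the pattern to the start and end of the string
starts = bool(re.search(r"^Order", "Order 42"))     # True
ends = bool(re.search(r"42$", "Order 42"))          # True

# * allows zero or more, + requires at least one, ? makes an item optional
star = re.findall(r"ab*", "a ab abbb")              # ['a', 'ab', 'abbb']
plus = re.findall(r"ab+", "a ab abbb")              # ['ab', 'abbb']
optional = re.findall(r"colou?r", "color colour colr")  # ['color', 'colour']

# [abc] matches exactly one of the listed characters
char_class = re.findall(r"[abc]", "cabbage")        # ['c', 'a', 'b', 'b', 'a']

# \d, \w, and \s are shorthand character classes
digits = re.findall(r"\d+", "Order 42 shipped to bay 7")  # ['42', '7']
words = re.findall(r"\w+", "a_b c-d")               # ['a_b', 'c', 'd']
pieces = re.split(r"\s+", "a  b\tc")                # ['a', 'b', 'c']
```

Writing patterns as raw strings (the r prefix) keeps Python from interpreting the backslashes before the regex engine sees them.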

re Module

import re
re.search(pattern, string)            # first match anywhere, or None
re.findall(pattern, string)           # list of all matches
re.sub(pattern, replacement, string)  # replace every match
re.match(pattern, string)             # match only at the start of the string
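To see how the four functions differ, here is a short sketch run against one hypothetical log line. The string and variable names are assumptions chosen for the example.

```python
import re

log = "error at 10:42, warning at 11:05, error at 12:30"
time_pattern = r"\d{2}:\d{2}"

# search returns a match object for the first hit, or None if nothing matches
first = re.search(time_pattern, log)
first_time = first.group() if first else None    # '10:42'

# findall returns every non-overlapping match as a list of strings
all_times = re.findall(time_pattern, log)        # ['10:42', '11:05', '12:30']

# sub returns a new string with every match replaced
masked = re.sub(time_pattern, "HH:MM", log)

# match only succeeds when the pattern matches at position 0
at_start = re.match(r"error", log)               # a match object
not_at_start = re.match(r"warning", log)         # None
anywhere = re.search(r"warning", log)            # a match object
```

Because search and match return None on failure, it is safest to check the result before calling .group() on it.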

Groups & Capturing

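(\d{3})-(\d{4}) matches 555-1234 and captures 555 and 1234 separately. If match is the returned match object, match.group(1) is '555' and match.group(2) is '1234'. Named groups use the form (?P<name>\d{3}), where name is a label you choose, and are read back with match.group('name').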
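The following sketch shows numbered and named groups side by side. The group names prefix and line are hypothetical labels picked for this example; Python's named-group syntax is (?P<name>...).

```python
import re

text = "Call 555-1234 today"

# Numbered groups: each pair of parentheses captures one piece
match = re.search(r"(\d{3})-(\d{4})", text)
if match:
    whole = match.group(0)    # the entire match: '555-1234'
    first = match.group(1)    # '555'
    second = match.group(2)   # '1234'

# Named groups: the same pattern, with a label attached to each group
named = re.search(r"(?P<prefix>\d{3})-(?P<line>\d{4})", text)
if named:
    prefix = named.group("prefix")   # '555'
    parts = named.groupdict()        # {'prefix': '555', 'line': '1234'}
```

Names tend to pay off once a pattern has more than two or three groups, since adding a group no longer shifts the numbers of the ones after it.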

Practical Uses

Email validation (simplified): r'^[\w.-]+@[\w.-]+\.\w{2,}$'
Phone extraction: r'\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}'
URL finding: r'https?://[^\s]+'
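As a rough sketch, the three patterns above can be compiled once and reused. The helper name is_plausible_email and the sample text are assumptions made for illustration, and the email pattern is a loose sanity check, not a full validator.

```python
import re

# The three patterns from this lesson, compiled for reuse
EMAIL = re.compile(r'^[\w.-]+@[\w.-]+\.\w{2,}$')
PHONE = re.compile(r'\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}')
URL = re.compile(r'https?://[^\s]+')


def is_plausible_email(candidate: str) -> bool:
    """Return True if the whole string looks like name@domain.tld."""
    return EMAIL.search(candidate) is not None


# Hypothetical sample text
text = ("Reach us at (555) 123-4567 or 555.987.6543, "
        "or visit https://example.com/help today.")

phones = PHONE.findall(text)   # ['(555) 123-4567', '555.987.6543']
urls = URL.findall(text)       # ['https://example.com/help']

ok = is_plausible_email("ada@example.com")    # True
missing_tld = is_plausible_email("ada@example")   # False
```

The URL pattern stops only at whitespace, so a URL followed directly by a period or comma will include that punctuation in the match.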

✅ Check Your Understanding

1. In regex, \d matches:

2. re.findall(pattern, text) returns:

3. Which pattern matches a valid email address?