Regex (Regular Expression)

A mini-language for finding patterns in text—powerful but cryptic.

What is Regex?

A regular expression (regex or regexp) is a pattern that describes a set of strings. It's like a search query on steroids—instead of searching for exact text, you search for patterns.

/^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/

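That mess above? It matches most everyday email addresses, though not every address the email standards allow. Regex is powerful, but it has a reputation for being write-only code: easy to write, hard to read back later.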
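
To make the pattern above less abstract, here is a minimal sketch that runs it against a few inputs. The sample addresses and the names `emailPattern` and `emailSamples` are made up for illustration.

```javascript
// The email pattern from above, stored as a RegExp literal.
const emailPattern = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;

// Hypothetical sample inputs, chosen only for illustration.
const emailSamples = ["ada@example.com", "no-at-sign.example.com", "ada@example"];

// test() returns true when the whole string fits the pattern
// (the ^ and $ anchors force a full-string match).
const emailResults = emailSamples.map((s) => emailPattern.test(s));

console.log(emailResults); // [ true, false, false ]
```

The second input fails because it has no `@`, the third because nothing after the `@` ends in a dot followed by at least two letters.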

Basic Syntax

| Pattern | Matches | Example |
| --- | --- | --- |
| hello | Literal text | "hello" in "hello world" |
| . | Any single character (except a newline, by default) | "h.t" matches "hat", "hit", "hot" |
| \d | Any digit (0-9) | "\d\d\d" matches "123" |
| \w | Word character (a-z, A-Z, 0-9, _) | "\w+" matches "hello_123" |
| \s | Whitespace (space, tab, newline) | "hello\sworld" matches "hello world" |
| ^ | Start of string | "^hello" matches "hello world" |
| $ | End of string | "world$" matches "hello world" |
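
The basic building blocks can be tried directly in JavaScript. This is a small sketch, and the object name `basicChecks` and the test strings are illustrative only.

```javascript
// Each entry exercises one piece of basic syntax.
const basicChecks = {
  // Literal text: found anywhere in the string.
  literal: /hello/.test("hello world"),
  // Dot: exactly one character must sit between "h" and "t".
  anyChar: ["hat", "hit", "hot", "ht"].filter((w) => /h.t/.test(w)),
  // \d: three digits in a row.
  digits: /\d\d\d/.test("123"),
  // \w+: a run of letters, digits, or underscores.
  word: "hello_123".match(/\w+/)[0],
  // \s: a tab counts as whitespace too.
  whitespace: /hello\sworld/.test("hello\tworld"),
  // ^: "hello" is present but not at the start, so this fails.
  startAnchor: /^hello/.test("say hello"),
  // $: "world" sits at the very end.
  endAnchor: /world$/.test("hello world"),
};

console.log(basicChecks);
```

Note that `startAnchor` comes out `false`: without the `^`, the same pattern would match.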

Quantifiers

| Pattern | Meaning | Example |
| --- | --- | --- |
| * | 0 or more | a* matches "", "a", "aaa" |
| + | 1 or more | a+ matches "a", "aaa" (not "") |
| ? | 0 or 1 | colou?r matches "color", "colour" |
| {3} | Exactly 3 | \d{3} matches "123" |
| {2,4} | 2 to 4 | \d{2,4} matches "12", "123", "1234" |
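
Quantifiers are easiest to see when the pattern is anchored with `^` and `$`, so that the whole string has to fit. The sketch below does that; the name `quantifierChecks` is made up for this example.

```javascript
// Anchored patterns: the entire input must match, not just a part of it.
const quantifierChecks = {
  starEmpty: /^a*$/.test(""),           // * allows zero occurrences
  plusEmpty: /^a+$/.test(""),           // + needs at least one
  optionalU: ["color", "colour", "colouur"].filter((w) => /^colou?r$/.test(w)),
  exactlyThree: /^\d{3}$/.test("1234"), // four digits is one too many
  twoToFour: ["1", "12", "1234", "12345"].filter((n) => /^\d{2,4}$/.test(n)),
};

console.log(quantifierChecks);
```

Without the anchors, `\d{3}` would still find "123" inside "1234", which is a common source of surprise in validation code.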

Character Classes

regex
[abc]     # Matches a, b, or c
[^abc]    # Matches anything except a, b, or c
[a-z]     # Matches any lowercase letter
[A-Za-z]  # Matches any letter
[0-9]     # Matches any digit (same as \d)
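
As a rough illustration, the classes above can be applied to one short string with the global flag to list every character each class picks out. The input `"Cab 42"` and the variable names are arbitrary.

```javascript
const classInput = "Cab 42";

// match() with the g flag returns every match as an array of strings.
const classMatches = {
  abc: classInput.match(/[abc]/g),        // only lowercase a, b, c
  notAbc: classInput.match(/[^abc]/g),    // everything else, including the space
  lower: classInput.match(/[a-z]/g),
  letters: classInput.match(/[A-Za-z]/g),
  digits: classInput.match(/[0-9]/g),
};

console.log(classMatches);
```

Classes are case-sensitive, so the capital "C" is not matched by `[abc]` and shows up under `[^abc]` instead.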

Common Patterns

regex
# Email (simplified)
[\w.-]+@[\w.-]+\.\w+

# URL
https?://[\w.-]+(/[\w.-]*)*

# Phone (US)
\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}

# Date (YYYY-MM-DD)
\d{4}-\d{2}-\d{2}

# IP Address (simplified)
\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}

# Hex Color
#[0-9A-Fa-f]{6}
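
These patterns check the shape of the text, not whether the value makes sense. The sketch below shows that with the date, IP, and hex color patterns (anchored here with `^` and `$`). The helper `isPlausibleIPv4` is a hypothetical name, and its range check is one possible way to tighten the simplified pattern.

```javascript
const datePattern = /^\d{4}-\d{2}-\d{2}$/;
const ipPattern = /^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$/;
const hexColorPattern = /^#[0-9A-Fa-f]{6}$/;

const shapeChecks = {
  impossibleDate: datePattern.test("2024-02-30"), // right shape, not a real date
  outOfRangeIp: ipPattern.test("999.1.1.1"),      // right shape, octet too large
  hexColor: hexColorPattern.test("#1a2B3c"),
  shortHex: hexColorPattern.test("#fff"),         // 3-digit form is not covered
};

// Hypothetical helper: regex for the shape, plain code for the value range.
function isPlausibleIPv4(text) {
  return ipPattern.test(text) && text.split(".").every((part) => Number(part) <= 255);
}

console.log(shapeChecks);
console.log(isPlausibleIPv4("192.168.0.1"), isPlausibleIPv4("999.1.1.1"));
```

Splitting the job this way usually reads better than trying to encode numeric ranges in the pattern itself.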

Groups and Capturing

Parentheses create groups that capture matched text:

regex
# Capture area code from phone number
\((\d{3})\) \d{3}-\d{4}
# Input: "(415) 555-1234"
# Group 1 captures: "415"

# Named groups (JavaScript ES2018+, PCRE, .NET; Python spells it (?P<year>...))
(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})
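
In JavaScript, numbered groups are read by index from the match result and named groups from its `groups` property. A minimal sketch, using the two patterns above and a made-up date string:

```javascript
// Numbered group: index 0 is the whole match, index 1 is the first group.
const phoneMatch = "(415) 555-1234".match(/\((\d{3})\) \d{3}-\d{4}/);
const areaCode = phoneMatch[1];

// Named groups: available on match.groups (ES2018 and later).
const isoDate = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const { year, month, day } = "2024-03-15".match(isoDate).groups;

// Named groups can also be referenced in a replacement string with $<name>.
const reordered = "2024-03-15".replace(isoDate, "$<day>/$<month>/$<year>");

console.log(areaCode);         // "415"
console.log(year, month, day); // "2024" "03" "15"
console.log(reordered);        // "15/03/2024"
```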

Where You'll See This

  • Form validation - Email, phone, credit card patterns
  • Search and replace - IDE find/replace, sed, grep
  • Log parsing - Extracting timestamps, IPs, errors
  • Web scraping - Pulling data from HTML
  • URL routing - Express, Django, Rails routes
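
As one concrete case from the list above, here is a sketch of log parsing. The log format and its contents are invented for this example, so a real log would need its own pattern.

```javascript
// Hypothetical log lines: date, time, level, client IP, message.
const logText = [
  "2024-03-15 10:42:01 ERROR 192.168.0.7 Disk full",
  "2024-03-15 10:42:05 INFO 10.0.0.3 Backup started",
  "2024-03-15 10:43:10 ERROR 10.0.0.3 Backup failed",
].join("\n");

// g: find every match; m: let ^ and $ match at each line boundary.
const errorLine =
  /^(?<date>\d{4}-\d{2}-\d{2}) (?<time>\d{2}:\d{2}:\d{2}) ERROR (?<ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}) (?<message>.*)$/gm;

// matchAll() yields one match object per ERROR line; copy out the named groups.
const errorEntries = [...logText.matchAll(errorLine)].map((m) => ({ ...m.groups }));

console.log(errorEntries);
```

Only the two ERROR lines are returned, each as an object with `date`, `time`, `ip`, and `message` fields.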

Common Gotchas

⚠️ Escape Special Characters

Characters like . * + ? ^ $ [ ] ( ) { } | \ have special meaning. To match them literally, escape with backslash: \. matches a period.
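
Escaping matters most when a pattern is built from text you did not write yourself, such as user input. The sketch below uses a small helper for that. The name `escapeRegExp` is chosen here for illustration and is not a built-in function.

```javascript
// Hypothetical helper: put a backslash in front of every special character
// so the text is matched literally. "$&" inserts the matched character.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const price = "3.50";

// Unescaped: the dot means "any character", so "3x50" matches too.
const looseMatch = new RegExp(price).test("3x50");

// Escaped: only a literal period is accepted.
const strictPattern = new RegExp(escapeRegExp(price));
const strictWrong = strictPattern.test("3x50");
const strictRight = strictPattern.test("3.50");

console.log(escapeRegExp(price));                 // 3\.50
console.log(looseMatch, strictWrong, strictRight); // true false true
```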

  • Greedy by default - .* matches as much as possible. Use .*? for non-greedy.
  • Different flavors - JavaScript, Python, and PCRE regex have subtle differences.
  • Backtracking - Nested quantifiers can be catastrophically slow. ^(a+)+$ on "aaaaaaaaaaaaaaaaaaaaaa!" makes a backtracking engine try exponentially many ways to split the a's before it gives up.
  • Not for HTML - Don't parse HTML with regex. Use a proper parser.
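
The greedy and backtracking points can be sketched in a few lines. The input strings are made up, and the nested pattern is deliberately only run on an input that matches, since a long non-matching input can make a backtracking engine appear to hang.

```javascript
const quotedText = 'say "hi" and "bye"';

// Greedy: .* runs to the end, then backs up to the LAST closing quote.
const greedyMatch = quotedText.match(/".*"/)[0];

// Lazy: .*? stops at the FIRST closing quote it can.
const lazyMatch = quotedText.match(/".*?"/)[0];

console.log(greedyMatch); // "hi" and "bye"
console.log(lazyMatch);   // "hi"

// Nested quantifier versus a flat rewrite that accepts the same strings.
const nestedPattern = /^(a+)+$/; // risky on long inputs that almost match
const flatPattern = /^a+$/;      // same language, no nested repetition

const bothAccept = nestedPattern.test("aaaa") && flatPattern.test("aaaa");
// Only the flat pattern is run on the failing input; it rejects it quickly.
const flatRejects = flatPattern.test("a".repeat(22) + "!");

console.log(bothAccept, flatRejects); // true false
```

Flattening a nested quantifier like this is one common way to avoid the slowdown when the inner group is not actually needed.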

In Code

javascript
// Test if string matches
/\d+/.test("abc123")  // true

// Find first match
"abc123".match(/\d+/)  // ["123"]

// Find all matches
"a1b2c3".match(/\d/g)  // ["1", "2", "3"]

// Replace
"hello world".replace(/world/, "regex")  // "hello regex"

"Some people, when confronted with a problem, think 'I know, I'll use regular expressions.' Now they have two problems." — Jamie Zawinski