Regular Expressions at a Glance
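A regular expression (regex) is a sequence of characters that defines a search pattern. The core syntax is shared across JavaScript, Python, Go, Java, PHP, Ruby, and most modern languages, but dialects differ in the details: Python writes named groups as (?P<name>...), and Go's RE2-based engine does not support lookarounds or backreferences.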
Learning regex once gives you a tool that works everywhere: in your editor's find-and-replace, command-line tools like grep and sed, log analysis pipelines, form validation, and data extraction scripts.
Anchors
Anchors don't match characters — they match positions.
| Pattern | Matches |
|---|---|
| `^` | Start of string (or line in multiline mode) |
| `$` | End of string (or line in multiline mode) |
| `\b` | Word boundary (between \w and \W) |
| `\B` | Non-word boundary |
^hello matches "hello world" but not "say hello"
world$ matches "hello world" but not "worldwide"
\bcat\b matches "cat" but not "concatenate"
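The three examples above can be checked directly. This is a minimal JavaScript sketch; the variable names are illustrative, not part of any API.

```javascript
// Each pattern mirrors one of the anchor examples above.
const startsWithHello = /^hello/;
const endsWithWorld = /world$/;
const wholeWordCat = /\bcat\b/;

console.log(startsWithHello.test("hello world")); // true
console.log(startsWithHello.test("say hello"));   // false
console.log(endsWithWorld.test("hello world"));   // true
console.log(endsWithWorld.test("worldwide"));     // false
console.log(wholeWordCat.test("the cat sat"));    // true
console.log(wholeWordCat.test("concatenate"));    // false
```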
Character Classes
| Pattern | Matches |
|---|---|
| `.` | Any character except newline |
| `\d` | Digit: [0-9] |
| `\D` | Non-digit: [^0-9] |
| `\w` | Word char: [a-zA-Z0-9_] |
| `\W` | Non-word char |
| `\s` | Whitespace: space, tab, newline |
| `\S` | Non-whitespace |
| `[abc]` | a, b, or c |
| `[^abc]` | Anything except a, b, or c |
| `[a-z]` | Any lowercase letter |
| `[A-Za-z0-9]` | Alphanumeric |
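A short JavaScript sketch of a few of these classes in use. The sample string and variable names are made up for illustration.

```javascript
const text = "Order #A-42 shipped on 2026-05-21";

// \d+ collects each run of digits.
const digitRuns = text.match(/\d+/g);
console.log(digitRuns); // ["42", "2026", "05", "21"]

// \w+ collects each run of word characters; "#", "-" and spaces split the runs.
const wordRuns = text.match(/\w+/g);
console.log(wordRuns); // ["Order", "A", "42", "shipped", "on", "2026", "05", "21"]

// A negated class keeps everything that is NOT listed.
const consonants = "regex".match(/[^aeiou]/g);
console.log(consonants); // ["r", "g", "x"]
```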
Quantifiers
| Pattern | Meaning |
|---|---|
| `*` | 0 or more |
| `+` | 1 or more |
| `?` | 0 or 1 (optional) |
| `{n}` | Exactly n times |
| `{n,}` | n or more times |
| `{n,m}` | Between n and m times |
| `*?` | 0 or more, lazy (non-greedy) |
| `+?` | 1 or more, lazy |
Greedy vs lazy: By default, quantifiers are greedy — they match as much as possible. Adding ? makes them lazy — they match as little as possible.
Input: <b>bold</b> and <i>italic</i>
Greedy <.*> matches entire string
Lazy <.*?> matches <b>, </b>, <i>, </i> separately
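The same comparison as runnable JavaScript, as a small sketch. The `g` flag is added so that `match` returns every match rather than only the first.

```javascript
const html = "<b>bold</b> and <i>italic</i>";

// Greedy: .* runs to the end, then backs up to the LAST ">".
const greedy = html.match(/<.*>/g);
console.log(greedy); // ["<b>bold</b> and <i>italic</i>"]

// Lazy: .*? stops at the FIRST ">" it can reach.
const lazy = html.match(/<.*?>/g);
console.log(lazy); // ["<b>", "</b>", "<i>", "</i>"]
```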
Groups and Alternation
| Pattern | Meaning |
|---|---|
| `(abc)` | Capturing group |
| `(?:abc)` | Non-capturing group |
| `(?<name>abc)` | Named capturing group |
| `a\|b` | Alternation: a or b |
| `\1` | Backreference to group 1 |
Named groups make complex patterns readable:
(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})
Matches 2026-05-21. In JavaScript, the captured parts are then available as match.groups.year, match.groups.month, and match.groups.day.
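A minimal JavaScript sketch of the named-group pattern, plus one backreference example. The variable names and the sample sentences are illustrative. Python's `re` module spells named groups `(?P<name>...)` instead.

```javascript
const isoDate = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;

const match = "2026-05-21".match(isoDate);
const { year, month, day } = match.groups;
console.log(year, month, day); // 2026 05 21

// Backreference: \1 must repeat exactly what group 1 captured,
// so this pattern finds a word typed twice in a row.
const doubledWord = /\b(\w+) \1\b/;
console.log(doubledWord.test("it is is fine")); // true
console.log(doubledWord.test("this is fine"));  // false
```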
Lookaheads and Lookbehinds
These assert what comes before or after a position without including it in the match:
| Pattern | Meaning |
|---|---|
| `(?=abc)` | Positive lookahead: followed by abc |
| `(?!abc)` | Negative lookahead: NOT followed by abc |
| `(?<=abc)` | Positive lookbehind: preceded by abc |
| `(?<!abc)` | Negative lookbehind: NOT preceded by abc |
\d+(?= dollars) matches "100" in "100 dollars" but not "100 euros"
(?<=\$)\d+ matches digits after a dollar sign
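Both examples as a JavaScript sketch. It assumes an engine with lookbehind support (part of the language since ES2018); the sample strings are illustrative.

```javascript
const amountBeforeDollars = /\d+(?= dollars)/;
console.log("100 dollars".match(amountBeforeDollars)[0]); // "100" (" dollars" is not part of the match)
console.log("100 euros".match(amountBeforeDollars));      // null

const amountAfterDollarSign = /(?<=\$)\d+/;
console.log("Total: $250".match(amountAfterDollarSign)[0]); // "250" (the "$" is not part of the match)
console.log("Total: 250".match(amountAfterDollarSign));     // null
```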
Flags
| Flag | Effect |
|---|---|
| `g` | Global — find all matches, not just first |
| `i` | Case-insensitive |
| `m` | Multiline — ^ and $ match line boundaries |
| `s` | Dotall — dot matches newline too |
| `u` | Unicode mode |
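The flag letters in this table are the ones JavaScript uses, so the sketch below is JavaScript too. The log text and variable names are made up for illustration.

```javascript
const log = "ERROR disk full\ninfo retrying\nerror timeout";

// i alone: case-insensitive, but only the first match is returned.
const firstError = log.match(/error/i)[0];
console.log(firstError); // "ERROR"

// g + i: every match, ignoring case.
const allErrors = log.match(/error/gi);
console.log(allErrors); // ["ERROR", "error"]

// m: ^ matches at the start of each line, not just the start of the string.
const firstWords = log.match(/^\w+/gm);
console.log(firstWords); // ["ERROR", "info", "error"]

// s: without it, "." refuses to cross the newline between "full" and "info".
console.log(/full.info/.test(log));  // false
console.log(/full.info/s.test(log)); // true

// u: "." consumes a whole code point instead of half a surrogate pair.
const emoji = "\u{1F600}";
console.log(/^.$/.test(emoji));  // false
console.log(/^.$/u.test(emoji)); // true
```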
Real-World Patterns
Email Address (practical)
^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$
Covers the vast majority of real email addresses. Note: the full RFC 5321 spec allows characters most real-world emails never use. This pattern rejects edge cases intentionally.
URL
https?://[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,}(/[\w\-\./\?\=\&\#\%]*)?
IPv4 Address
^((25[0-5]|2[0-4]\d|1\d{2}|[1-9]?\d)\.){3}(25[0-5]|2[0-4]\d|1\d{2}|[1-9]?\d)$
Validates each octet is 0–255, rejecting 999.0.0.1.
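A quick JavaScript check of the IPv4 pattern against a few inputs, as a sketch. Note that the pattern also rejects octets written with a leading zero.

```javascript
const ipv4 = /^((25[0-5]|2[0-4]\d|1\d{2}|[1-9]?\d)\.){3}(25[0-5]|2[0-4]\d|1\d{2}|[1-9]?\d)$/;

console.log(ipv4.test("192.168.1.1"));     // true
console.log(ipv4.test("255.255.255.255")); // true
console.log(ipv4.test("999.0.0.1"));       // false: 999 is out of range
console.log(ipv4.test("256.1.1.1"));       // false: 256 is out of range
console.log(ipv4.test("1.2.3"));           // false: only three octets
console.log(ipv4.test("01.2.3.4"));        // false: leading zero
```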
ISO Date (YYYY-MM-DD)
^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$
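This pattern checks the format only: it limits the month to 01-12 and the day to 01-31, so it still accepts 2026-02-31. One way to close that gap is a calendar check after the regex, sketched below in JavaScript. `daysInMonth` and `isRealDate` are hypothetical helper names, and the leap-year rule assumes the Gregorian calendar.

```javascript
const isoDate = /^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$/;

// Hypothetical helper: days in a month under Gregorian leap-year rules.
function daysInMonth(year, month) {
  const isLeap = (year % 4 === 0 && year % 100 !== 0) || year % 400 === 0;
  return [31, isLeap ? 29 : 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31][month - 1];
}

// Hypothetical helper: format check first, then the calendar check.
function isRealDate(text) {
  if (!isoDate.test(text)) return false;
  const [year, month, day] = text.split("-").map(Number);
  return day <= daysInMonth(year, month);
}

console.log(isoDate.test("2026-02-31")); // true: the format is fine
console.log(isRealDate("2026-02-31"));   // false: February has no day 31
console.log(isRealDate("2024-02-29"));   // true: 2024 is a leap year
console.log(isRealDate("2026-05-21"));   // true
```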
Hex Color Code
^#([a-fA-F0-9]{6}|[a-fA-F0-9]{3})$
Credit Card Number (basic format check)
^(?:4\d{12}(?:\d{3})?|5[1-5]\d{14}|3[47]\d{13}|6(?:011|5\d{2})\d{12})$
Matches Visa, Mastercard (51-55 prefixes only), Amex, and Discover number formats. It does not verify the Luhn checksum, so a well-formed number can still be invalid.
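A Luhn checksum can run after the format check. The JavaScript sketch below uses hypothetical helper names (`passesLuhn`, `looksLikeCard`) and assumes the input has already been stripped of spaces and dashes. 4111111111111111 is a widely used Visa test number.

```javascript
const cardFormat = /^(?:4\d{12}(?:\d{3})?|5[1-5]\d{14}|3[47]\d{13}|6(?:011|5\d{2})\d{12})$/;

// Hypothetical helper: Luhn checksum over a string of digits.
// Walking right to left, every second digit is doubled; if doubling
// gives more than 9, subtract 9. The total must be a multiple of 10.
function passesLuhn(digits) {
  let sum = 0;
  let doubleNext = false;
  for (let i = digits.length - 1; i >= 0; i--) {
    let value = digits.charCodeAt(i) - 48; // "0" is char code 48
    if (doubleNext) {
      value *= 2;
      if (value > 9) value -= 9;
    }
    sum += value;
    doubleNext = !doubleNext;
  }
  return sum % 10 === 0;
}

// Hypothetical helper: format check, then checksum.
function looksLikeCard(number) {
  return cardFormat.test(number) && passesLuhn(number);
}

console.log(looksLikeCard("4111111111111111")); // true
console.log(cardFormat.test("4111111111111112")); // true: format alone accepts it
console.log(looksLikeCard("4111111111111112")); // false: checksum fails
```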
Password (min 8 chars, requires letter + digit)
^(?=.*[A-Za-z])(?=.*\d)[A-Za-z\d@$!%*#?&]{8,}$
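A few inputs run through the password pattern, as a JavaScript sketch with made-up sample strings. The last case shows a side effect worth knowing: the character class also acts as an allow-list, so any character outside letters, digits, and `@$!%*#?&` (a space, for example) is rejected.

```javascript
const password = /^(?=.*[A-Za-z])(?=.*\d)[A-Za-z\d@$!%*#?&]{8,}$/;

console.log(password.test("hunter42x"));  // true: 9 chars, has a letter and a digit
console.log(password.test("abcdefgh"));   // false: no digit
console.log(password.test("12345678"));   // false: no letter
console.log(password.test("abc123"));     // false: shorter than 8
console.log(password.test("pass word1")); // false: space is not in the allowed set
```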
Slug (URL-safe string)
^[a-z0-9]+(?:-[a-z0-9]+)*$
Common Mistakes
Forgetting to escape dots: In regex, . matches any character. To match a literal period, write \.. The pattern myfile.txt matches myfileXtxt — usually not what you want.
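The difference is easy to see in a JavaScript sketch. `escapeRegex` is a hypothetical helper, not a built-in, for the case where the literal text comes from a variable; it backslash-escapes every character that has special meaning in a pattern.

```javascript
const unescaped = /^myfile.txt$/;
const escaped = /^myfile\.txt$/;

console.log(unescaped.test("myfileXtxt")); // true: "." matched "X"
console.log(escaped.test("myfileXtxt"));   // false
console.log(escaped.test("myfile.txt"));   // true

// Hypothetical helper: escape regex metacharacters in literal text.
function escapeRegex(literal) {
  return literal.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const fromLiteral = new RegExp("^" + escapeRegex("myfile.txt") + "$");
console.log(fromLiteral.test("myfileXtxt")); // false
console.log(fromLiteral.test("myfile.txt")); // true
```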
Catastrophic backtracking: Nested quantifiers like (a+)+ can cause exponential time on certain inputs. Avoid patterns where multiple quantifiers can match the same characters.
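A small JavaScript sketch of the usual fix: collapse the nested quantifier into a single one that accepts the same strings. The input here is kept deliberately short. On a backtracking engine, the nested form has many ways to split a run of "a" characters and may try them all before rejecting, and that number grows quickly with input length.

```javascript
const nested = /^(a+)+$/; // nested quantifiers: many ways to split a run of "a"
const flat = /^a+$/;      // one quantifier, same set of accepted strings

const valid = "a".repeat(18);
const invalid = "a".repeat(18) + "!";

console.log(nested.test(valid), flat.test(valid));     // true true
console.log(nested.test(invalid), flat.test(invalid)); // false false
```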
Not anchoring validation patterns: Without ^ and $, a pattern like \d{4} matches any four-digit run inside the input, so it accepts abc1234def instead of rejecting it.
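A minimal JavaScript sketch of the difference, using illustrative variable names:

```javascript
const unanchored = /\d{4}/;
const anchored = /^\d{4}$/;

console.log(unanchored.test("abc1234def")); // true: finds "1234" inside the input
console.log(anchored.test("abc1234def"));   // false: the whole input must be four digits
console.log(anchored.test("1234"));         // true
```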
Using greedy match when lazy is needed: When extracting content between tags, greedy <.*> grabs everything from first open to last close tag. Use <.*?> instead.
Testing Your Patterns
→ Use the Regex Tester to write and test patterns with live highlighting, capture group inspection, and match details.
→ The Regex Cheat Sheet provides a quick-reference card for the most common patterns, all in one scrollable view.