Regex Cheat Sheet: Every Regular Expression Pattern You Actually Need

A practical reference for regular expressions — anchors, quantifiers, character classes, lookaheads, and real-world patterns for emails, URLs, IPs, and dates.

Regular Expressions at a Glance

A regular expression (regex) is a sequence of characters that defines a search pattern. The same syntax works across JavaScript, Python, Go, Java, PHP, Ruby, and most modern languages — with minor dialect differences.

Learning regex once gives you a tool that works everywhere: in your editor's find-and-replace, command-line tools like grep and sed, log analysis pipelines, form validation, and data extraction scripts.

Anchors

Anchors don't match characters — they match positions.

Pattern	Matches
`^`	Start of string (or line in multiline mode)
`$`	End of string (or line in multiline mode)
`\b`	Word boundary (between \w and \W)
`\B`	Non-word boundary

^hello        matches "hello world" but not "say hello"
world$        matches "hello world" but not "worldwide"
\bcat\b      matches "cat" but not "concatenate"

Character Classes

Pattern	Matches
`.`	Any character except newline
`\d`	Digit: `[0-9]`
`\D`	Non-digit: `[^0-9]`
`\w`	Word char: `[a-zA-Z0-9_]`
`\W`	Non-word char
`\s`	Whitespace: space, tab, newline
`\S`	Non-whitespace
`[abc]`	a, b, or c
`[^abc]`	Anything except a, b, or c
`[a-z]`	Any lowercase letter
`[A-Za-z0-9]`	Alphanumeric

Quantifiers

Pattern	Meaning
`*`	0 or more
`+`	1 or more
`?`	0 or 1 (optional)
`{n}`	Exactly n times
`{n,}`	n or more times
`{n,m}`	Between n and m times
`*?`	0 or more, lazy (non-greedy)
`+?`	1 or more, lazy

Greedy vs lazy: By default, quantifiers are greedy — they match as much as possible. Adding ? makes them lazy — they match as little as possible.

Input: <b>bold</b> and <i>italic</i>

Greedy  <.*>   matches entire string
Lazy    <.*?>  matches <b>, </b>, <i>, </i> separately

Groups and Alternation

Pattern	Meaning
`(abc)`	Capturing group
`(?:abc)`	Non-capturing group
`(?<name>abc)`	Named capturing group
`a	b`
`\1`	Backreference to group 1

Named groups make complex patterns readable:

(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})

Matches 2026-05-21 and lets you access match.groups.year, match.groups.month, match.groups.day.

Lookaheads and Lookbehinds

These assert what comes before or after a position without including it in the match:

Pattern	Meaning
`(?=abc)`	Positive lookahead: followed by abc
`(?!abc)`	Negative lookahead: NOT followed by abc
`(?<=abc)`	Positive lookbehind: preceded by abc
`(?<!abc)`	Negative lookbehind: NOT preceded by abc

\d+(?= dollars)   matches "100" in "100 dollars" but not "100 euros"
(?<=\$)\d+        matches digits after a dollar sign

Flags

Flag	Effect
`g`	Global — find all matches, not just first
`i`	Case-insensitive
`m`	Multiline — ^ and $ match line boundaries
`s`	Dotall — dot matches newline too
`u`	Unicode mode

Real-World Patterns

Email Address (practical)

^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$

Covers the vast majority of real email addresses. Note: the full RFC 5321 spec allows characters most real-world emails never use. This pattern rejects edge cases intentionally.

URL

https?://[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,}(/[\w\-\./\?\=\&\#\%]*)?

IPv4 Address

^((25[0-5]|2[0-4]\d|1\d{2}|[1-9]?\d)\.){3}(25[0-5]|2[0-4]\d|1\d{2}|[1-9]?\d)$

Validates each octet is 0–255, rejecting 999.0.0.1.

ISO Date (YYYY-MM-DD)

^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$

Hex Color Code

^#([a-fA-F0-9]{6}|[a-fA-F0-9]{3})$

Credit Card Number (basic format check)

^(?:4\d{12}(?:\d{3})?|5[1-5]\d{14}|3[47]\d{13}|6(?:011|5\d{2})\d{12})$

Matches Visa, Mastercard, Amex, Discover formats. Does not validate using Luhn algorithm.

Password (min 8 chars, requires letter + digit)

^(?=.*[A-Za-z])(?=.*\d)[A-Za-z\d@$!%*#?&]{8,}$

Slug (URL-safe string)

^[a-z0-9]+(?:-[a-z0-9]+)*$

Common Mistakes

Forgetting to escape dots: In regex, . matches any character. To match a literal period, write \.. The pattern myfile.txt matches myfileXtxt — usually not what you want.

Catastrophic backtracking: Nested quantifiers like (a+)+ can cause exponential time on certain inputs. Avoid patterns where multiple quantifiers can match the same characters.

Not anchoring validation patterns: Without ^ and $, a pattern like \d{4} matches the four-digit substring anywhere in the input — it won't reject abc1234def.

Using greedy match when lazy is needed: When extracting content between tags, greedy <.*> grabs everything from first open to last close tag. Use <.*?> instead.

Testing Your Patterns

→ Use the Regex Tester to write and test patterns with live highlighting, capture group inspection, and match details.

→ The Regex Cheat Sheet provides a quick-reference card for the most common patterns, all in one scrollable view.

页面加载失败