Regex Cheat Sheet: The Patterns, Syntax, and Examples You Actually Need
There is a famous joke among programmers: "A developer had a problem and decided to use regex. Now they have two problems." Regex has earned its intimidating reputation, but most real-world regex work uses maybe 20% of the syntax. Learn that 20%, and you can validate inputs, extract data, search codebases, and transform text with a handful of short patterns.
This cheat sheet is organized by practical use. Instead of listing every regex token alphabetically (which is how most cheat sheets work, and why most cheat sheets get bookmarked and never read), we start with the building blocks, then move to copy-paste patterns for common tasks. Test any pattern instantly with the Regex Tester, which highlights matches in real time.
Core Syntax Reference
Character Classes
| Pattern | Matches | Example |
|---|---|---|
| . | Any character except newline | h.t matches "hat", "hit", "hot" |
| \d | Any digit (0-9) | \d{3} matches "123", "456" |
| \D | Any non-digit | \D+ matches "abc", "hello" |
| \w | Word character (a-z, A-Z, 0-9, _) | \w+ matches "hello_world" |
| \W | Non-word character | \W matches "@", " ", "!" |
| \s | Whitespace (space, tab, newline) | \s+ matches " ", "\t\n" |
| \S | Non-whitespace | \S+ matches "hello" |
| [abc] | Any one of a, b, or c | [aeiou] matches vowels |
| [^abc] | Any character NOT a, b, or c | [^0-9] matches non-digits |
| [a-z] | Any lowercase letter | [a-zA-Z] matches any letter |
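
A quick way to get a feel for these classes is to run them against a sample string. The sketch below assumes Python's built-in re module, and the sample text is made up for illustration. One flavor difference to keep in mind: in Python 3, \w and \d also match non-ASCII letters and digits unless the re.ASCII flag is set.

```python
import re

# Made-up sample text for illustration.
sample = "Order #42: 3 hats, 1 hot_tub"

digits = re.findall(r"\d+", sample)      # runs of digits
words = re.findall(r"\w+", sample)       # runs of letters, digits, underscore
h_dot_t = re.findall(r"h.t", sample)     # "h", any one character, "t"
vowels = re.findall(r"[aeiou]", sample)  # lowercase vowels only

print(digits)   # ['42', '3', '1']
print(words)    # ['Order', '42', '3', 'hats', '1', 'hot_tub']
print(h_dot_t)  # ['hat', 'hot']
print(vowels)   # ['e', 'a', 'o', 'u']
```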
Quantifiers
| Pattern | Meaning | Example |
|---|---|---|
| * | Zero or more | ab*c matches "ac", "abc", "abbc" |
| + | One or more | ab+c matches "abc", "abbc" but not "ac" |
| ? | Zero or one | colou?r matches "color" and "colour" |
| {n} | Exactly n times | \d{4} matches "2026" |
| {n,} | n or more times | \w{3,} matches words with 3+ chars |
| {n,m} | Between n and m times | \d{2,4} matches "12", "123", "1234" |
| *? | Zero or more (lazy) | <.*?> matches shortest HTML tag |
| +? | One or more (lazy) | ".+?" matches shortest quoted string |
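
The difference between greedy and lazy quantifiers is easiest to see side by side. This is a small sketch using Python's re module (an assumption; the behavior is the same in most engines), with made-up input strings.

```python
import re

html = "<b>bold</b> and <i>italic</i>"

# Greedy: .* grabs as much as possible, so one match spans the whole string.
greedy = re.findall(r"<.*>", html)
# Lazy: .*? stops at the first ">" it can, so each tag is a separate match.
lazy = re.findall(r"<.*?>", html)

print(greedy)  # ['<b>bold</b> and <i>italic</i>']
print(lazy)    # ['<b>', '</b>', '<i>', '</i>']

# Optional character and bounded repetition.
spellings = re.findall(r"colou?r", "color colour colouur")
runs = re.findall(r"\d{2,4}", "1 12 12345")

print(spellings)  # ['color', 'colour']
print(runs)       # ['12', '1234']
```

Note that \d{2,4} takes the first four digits of "12345" and leaves the lone "5" unmatched, which is why bounded quantifiers are usually paired with anchors or word boundaries.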
Anchors and Boundaries
| Pattern | Matches | Example |
|---|---|---|
| ^ | Start of string/line | ^Hello matches "Hello" at line start |
| $ | End of string/line | end$ matches "end" in "the end" |
| \b | Word boundary | \bcat\b matches "cat" but not "catch" |
| \B | Non-word boundary | \Bcat\B matches the "cat" inside "concatenate" |
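
Anchors and boundaries match positions rather than characters, which a short experiment makes concrete. The sketch below assumes Python's re module and uses made-up text; re.MULTILINE plays the role of the m flag.

```python
import re

text = "cat catch concatenate"

whole_word = re.findall(r"\bcat\b", text)                      # standalone "cat" only
inside = [m.start() for m in re.finditer(r"\Bcat\B", text)]    # "cat" buried in a word

print(whole_word)  # ['cat']
print(inside)      # [13]  (the "cat" in "concatenate")

lines = "Hello there\nHello again"

# Without the multiline flag, ^ matches only at the start of the string.
string_start = re.findall(r"^Hello", lines)
# With it, ^ also matches at the start of every line.
line_starts = re.findall(r"^Hello", lines, flags=re.MULTILINE)

print(len(string_start), len(line_starts))  # 1 2
```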
Groups and Alternation
| Pattern | Meaning | Example |
|---|---|---|
| (abc) | Capture group | (\d{3})-(\d{4}) captures "555" and "1234" from "555-1234" |
| (?:abc) | Non-capture group | (?:Mr\|Mrs)\. matches but does not capture |
| a\|b | a OR b | cat\|dog matches either |
| \1 | Backreference to group 1 | (\w+)\s\1 matches repeated words |
| (?=abc) | Positive lookahead | \d(?=px) matches "5" in "5px" |
| (?!abc) | Negative lookahead | \d(?!px) matches "5" in "5em" but not in "5px" |
| (?<=abc) | Positive lookbehind | (?<=\$)\d+ matches "50" in "$50" |
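
Groups and lookarounds are easier to trust once you have seen what they return. The following is a minimal sketch with Python's re module (an assumption; JavaScript and PCRE behave the same for these examples), using made-up input strings.

```python
import re

# Capture groups: each pair of parentheses becomes one element of groups().
m = re.search(r"(\d{3})-(\d{4})", "Call 555-1234 today")
print(m.groups())  # ('555', '1234')

# Non-capturing group: groups the alternation without creating a capture.
titles = re.findall(r"(?:Mr|Mrs)\. \w+", "Mr. Smith and Mrs. Jones")
print(titles)  # ['Mr. Smith', 'Mrs. Jones']

# Backreference: \1 must repeat exactly what group 1 matched.
repeat = re.search(r"(\w+)\s\1", "hello hello world")
print(repeat.group(0))  # hello hello

# Lookahead and lookbehind check context without consuming it.
pixel_values = re.findall(r"\d+(?=px)", "12px 5em 30px")
dollar_amounts = re.findall(r"(?<=\$)\d+", "$50 and 20")
print(pixel_values)    # ['12', '30']
print(dollar_amounts)  # ['50']
```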
Flags
| Flag | Name | Effect |
|---|---|---|
| g | Global | Find all matches, not just the first |
| i | Case-insensitive | /hello/i matches "Hello", "HELLO" |
| m | Multiline | ^ and $ match line boundaries, not just string boundaries |
| s | Dotall | . also matches newlines |
| u | Unicode | Enables full Unicode matching |
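
The single-letter flags above follow the /pattern/flags style used by JavaScript and similar engines. If you work in Python (the language assumed for the sketches in this article), the rough equivalents are re.IGNORECASE, re.MULTILINE, and re.DOTALL. There is no g flag: re.search returns the first match, while re.findall and re.finditer return all of them.

```python
import re

text = "Hello\nhello\nHELLO"

# No flags: case-sensitive, first match only.
first = re.search(r"hello", text).group(0)

# "i" equivalent, with findall standing in for "g".
any_case = re.findall(r"hello", text, flags=re.IGNORECASE)

# "m" equivalent: ^ matches at the start of every line.
one_start = re.findall(r"^h\w+", text, flags=re.IGNORECASE)
all_starts = re.findall(r"^h\w+", text, flags=re.IGNORECASE | re.MULTILINE)

# "s" equivalent: . is allowed to match the newline.
dot_default = re.search(r"Hello.hello", text)
dot_all = re.search(r"Hello.hello", text, flags=re.DOTALL)

print(first)        # hello
print(any_case)     # ['Hello', 'hello', 'HELLO']
print(one_start)    # ['Hello']
print(all_starts)   # ['Hello', 'hello', 'HELLO']
print(dot_default)  # None
print(dot_all is not None)  # True
```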
Copy-Paste Patterns for Common Tasks
Email Validation
^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$
This covers the vast majority of real-world email addresses, though it rejects a few valid ones (such as quoted local parts) and accepts a few invalid ones (such as consecutive dots). A perfectly RFC-compliant email regex is over 6,000 characters long and is more of a theoretical exercise than a practical tool. For production use, validate format with the regex above, then send a confirmation email to verify deliverability.
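
As a rough illustration, here is how the pattern might be wrapped in a helper. This is a sketch assuming Python's re module; looks_like_email is a hypothetical name, and the addresses are made up. It uses fullmatch because in Python a bare $ also matches just before a trailing newline.

```python
import re

EMAIL_RE = re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")

def looks_like_email(address: str) -> bool:
    """Format check only; it says nothing about whether the mailbox exists."""
    # fullmatch requires the whole string to match, trailing newline included.
    return EMAIL_RE.fullmatch(address) is not None

print(looks_like_email("jane.doe+news@example.co.uk"))  # True
print(looks_like_email("user@localhost"))               # False (no dot in the domain)
print(looks_like_email("no-at-sign.example.com"))       # False
print(looks_like_email("user@example.com\n"))           # False
```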
URL Matching
https?:\/\/[^\s/$.?#].[^\s]*
Matches both HTTP and HTTPS URLs. For stricter validation that checks for valid TLDs and path structures, use a URL parsing library rather than regex.
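
One way to combine the two approaches is to use the regex to find candidates in free text and a parser to inspect them. The sketch below assumes Python's standard urllib.parse module; is_http_url is a hypothetical helper and the URLs are made up. It also shows a side effect of the pattern's greedy tail: trailing punctuation is swallowed into the match.

```python
import re
from urllib.parse import urlparse

URL_RE = re.compile(r"https?:\/\/[^\s/$.?#].[^\s]*")

text = "Docs at https://example.com/guide?page=2 and http://test.org."
found = URL_RE.findall(text)
print(found)  # ['https://example.com/guide?page=2', 'http://test.org.']

def is_http_url(candidate: str) -> bool:
    """Structural check: http(s) scheme plus a non-empty host part."""
    parts = urlparse(candidate)
    return parts.scheme in ("http", "https") and bool(parts.netloc)

parts = urlparse(found[0])
print(parts.netloc, parts.path, parts.query)  # example.com /guide page=2
print(is_http_url("ftp://example.com"))       # False
```

Stripping trailing punctuation from each candidate before parsing is usually a sensible extra step.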
Phone Number (US)
^(\+1)?[-.\s]?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}$
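Handles formats like (555) 123-4567, 555-123-4567, 555.123.4567, and +1 555 123 4567. Because each parenthesis is optional on its own, it also accepts unbalanced input such as (555 123-4567.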
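
Once a number passes the format check, it is often useful to store it in one canonical form. A minimal sketch, assuming Python's re module; normalize_us_phone is a hypothetical helper that returns the ten national digits.

```python
import re
from typing import Optional

US_PHONE_RE = re.compile(r"^(\+1)?[-.\s]?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}$")

def normalize_us_phone(raw: str) -> Optional[str]:
    """Return the 10-digit number, or None if the format is not recognized."""
    if US_PHONE_RE.fullmatch(raw) is None:
        return None
    digits = re.sub(r"\D", "", raw)  # drop everything that is not a digit
    return digits[-10:]              # discard the leading country code "1" if present

print(normalize_us_phone("(555) 123-4567"))   # 5551234567
print(normalize_us_phone("+1 555 123 4567"))  # 5551234567
print(normalize_us_phone("555-1234"))         # None
```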
Date Formats
# YYYY-MM-DD
^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$
# MM/DD/YYYY
^(0[1-9]|1[0-2])\/(0[1-9]|[12]\d|3[01])\/\d{4}$
These validate the format but not the logic (they will accept February 31st). For full date validation, parse the date with a date library after the regex check.
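
The two-step check described above might look like this. It is a sketch assuming Python's standard datetime module; parse_iso_date is a hypothetical name, and the only change to the YYYY-MM-DD pattern is an added capture group around the year so all three parts can be read back.

```python
import re
from datetime import date
from typing import Optional

# Same YYYY-MM-DD pattern as above, with the year wrapped in a capture group.
ISO_DATE_RE = re.compile(r"^(\d{4})-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$")

def parse_iso_date(text: str) -> Optional[date]:
    """Regex checks the shape; datetime.date checks that the day really exists."""
    m = ISO_DATE_RE.fullmatch(text)
    if m is None:
        return None
    year, month, day = (int(part) for part in m.groups())
    try:
        return date(year, month, day)
    except ValueError:  # e.g. February 31st, or February 29th in a non-leap year
        return None

print(parse_iso_date("2026-02-28"))  # 2026-02-28
print(parse_iso_date("2026-02-31"))  # None
print(parse_iso_date("2026-13-01"))  # None
```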
IPv4 Address
^((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)$
Validates that each octet is 0-255. Many simpler patterns accept invalid addresses like 999.999.999.999.
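
A few spot checks show where the pattern draws the line. This sketch assumes Python's re module; the re.ASCII flag is added so that \d means 0-9 only, and is_ipv4 is a hypothetical helper name.

```python
import re

# re.ASCII restricts \d to 0-9 (Python's default also accepts other Unicode digits).
IPV4_RE = re.compile(
    r"^((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)$",
    re.ASCII,
)

def is_ipv4(text: str) -> bool:
    return IPV4_RE.fullmatch(text) is not None

print(is_ipv4("192.168.1.1"))      # True
print(is_ipv4("255.255.255.255"))  # True
print(is_ipv4("256.1.1.1"))        # False (octet above 255)
print(is_ipv4("999.999.999.999"))  # False
print(is_ipv4("1.2.3"))            # False (only three octets)
print(is_ipv4("010.1.1.1"))        # True  (leading zeros are accepted)
```

The last case is worth knowing about: some tools interpret a leading zero as octal, so you may want to reject or normalize such input separately.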
HTML Tags
<([a-z][a-z0-9]*)\b[^>]*>(.*?)<\/\1>
Matches opening and closing HTML tag pairs (add the i flag if tag names may be uppercase). But here is the obligatory warning: do not parse HTML with regex for anything serious. HTML is not a regular language, and regex cannot correctly handle nested tags, self-closing tags, or malformed markup. Use a DOM parser instead.
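
To see the nesting problem concretely, compare the regex with a real parser on nested markup. This sketch assumes Python; it uses the standard library's html.parser, which is an event-based parser rather than a full DOM, and TextCollector is a hypothetical class written for this example.

```python
import re
from html.parser import HTMLParser

TAG_PAIR_RE = re.compile(r"<([a-z][a-z0-9]*)\b[^>]*>(.*?)<\/\1>")

# The lazy group stops at the FIRST closing </div>, cutting the outer element short.
m = TAG_PAIR_RE.search("<div>outer <div>inner</div> tail</div>")
print(m.group(2))  # outer <div>inner

class TextCollector(HTMLParser):
    """Records each piece of text together with its innermost open tag."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()

    def handle_data(self, data):
        if data.strip():
            parent = self.open_tags[-1] if self.open_tags else None
            self.chunks.append((parent, data.strip()))

collector = TextCollector()
collector.feed("<div><p>Hello <b>nested</b> world</p></div>")
collector.close()
print(collector.chunks)  # [('p', 'Hello'), ('b', 'nested'), ('p', 'world')]
```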
Hexadecimal Color Code
^#([0-9A-Fa-f]{3}|[0-9A-Fa-f]{6}|[0-9A-Fa-f]{8})$
Matches 3-digit (#FFF), 6-digit (#FF00FF), and 8-digit (#FF00FF80) hex color codes. It does not match the 4-digit #RGBA shorthand that CSS also allows.
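
Because the pattern captures the digits in group 1, a validated code can be converted straight to numbers. A small sketch in Python; hex_to_rgba is a hypothetical helper, and treating a missing alpha channel as fully opaque (FF) is an assumption of this example.

```python
import re
from typing import Optional, Tuple

HEX_COLOR_RE = re.compile(r"^#([0-9A-Fa-f]{3}|[0-9A-Fa-f]{6}|[0-9A-Fa-f]{8})$")

def hex_to_rgba(code: str) -> Optional[Tuple[int, ...]]:
    """Convert a validated hex color to (red, green, blue, alpha), each 0-255."""
    m = HEX_COLOR_RE.fullmatch(code)
    if m is None:
        return None
    digits = m.group(1)
    if len(digits) == 3:   # expand shorthand: "F0A" -> "FF00AA"
        digits = "".join(ch * 2 for ch in digits)
    if len(digits) == 6:   # no alpha given: assume fully opaque
        digits += "FF"
    return tuple(int(digits[i:i + 2], 16) for i in range(0, 8, 2))

print(hex_to_rgba("#FFF"))       # (255, 255, 255, 255)
print(hex_to_rgba("#FF00FF80"))  # (255, 0, 255, 128)
print(hex_to_rgba("#FFFF"))      # None (4-digit form is not matched)
```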
Password Strength
^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$
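Requires at least 8 characters with one uppercase, one lowercase, one digit, and one special character from the set @$!%*?&. Because the final character class allows only letters, digits, and those seven symbols, a password containing anything else (a space or #, for example) is rejected. Note: length is far more important than complexity for password security, so consider just checking for .{12,} (12+ characters of anything) instead.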
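
The two policies behave quite differently on real input. The comparison below is a sketch assuming Python's re module; the sample passwords are made up, and the function names are hypothetical.

```python
import re

COMPLEX_RE = re.compile(
    r"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$"
)
LONG_RE = re.compile(r".{12,}")

def passes_complexity(password: str) -> bool:
    return COMPLEX_RE.fullmatch(password) is not None

def passes_length(password: str) -> bool:
    return LONG_RE.fullmatch(password) is not None

print(passes_complexity("Passw0rd!"))   # True
print(passes_complexity("Passw0rd!#"))  # False ("#" is outside the allowed set)
print(passes_complexity("correct horse battery staple"))  # False
print(passes_length("correct horse battery staple"))      # True
print(passes_length("Passw0rd!"))                         # False (only 9 characters)
```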
Finding Duplicate Words
\b(\w+)\s+\1\b
Catches repeated words like "the the" or "is is." Useful for proofreading. The \1 backreference matches whatever the first capture group found.
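
The same pattern can both report and repair duplicates. A minimal sketch with Python's re module (an assumption; the sample sentence is made up), using the backreference again in the replacement string.

```python
import re

DUPLICATE_RE = re.compile(r"\b(\w+)\s+\1\b")

text = "This is is a test of the the pattern."

# findall returns the contents of group 1 for each match.
duplicates = DUPLICATE_RE.findall(text)
# Replacing the whole match with \1 keeps a single copy of the word.
cleaned = DUPLICATE_RE.sub(r"\1", text)

print(duplicates)  # ['is', 'the']
print(cleaned)     # This is a test of the pattern.
```

The match is case-sensitive as written, so "The the" slips through unless you add the case-insensitive flag.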
Translating Regex to Plain English
When you encounter a regex pattern you cannot parse, the Regex to English translator breaks it down into readable language. Paste in ^(?=.*[A-Z])(?=.*\d).{8,}$ and get an explanation like: "Start of string, followed by at least one uppercase letter anywhere, followed by at least one digit anywhere, then 8 or more of any character, then end of string."
This is especially helpful when inheriting code with complex validation patterns written by someone who no longer works on the project.
Working with Text Case and Regex
Regex patterns often need to work alongside text transformations. When you are cleaning up data or normalizing strings, the Case Converter handles common transformations like uppercase, lowercase, title case, camelCase, and snake_case. Combine it with regex: use a pattern to extract the relevant strings, then convert their case for consistency.
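If you prefer to do this in code rather than with a separate tool, most languages let you pass a function as the replacement so that each match is transformed as it is found. The sketch below assumes Python's re module; the simplified email pattern, the helper name to_snake_case, and the sample strings are all made up for illustration.

```python
import re

# Simplified email pattern for extraction (looser than the validation pattern above).
EMAIL_IN_TEXT_RE = re.compile(r"[\w.+-]+@[\w.-]+\.[a-zA-Z]{2,}")

text = "Contact: Alice@Example.COM, BOB@example.com"

# The replacement function receives each match and returns its lowercase form.
normalized = EMAIL_IN_TEXT_RE.sub(lambda m: m.group(0).lower(), text)
print(normalized)  # Contact: alice@example.com, bob@example.com

def to_snake_case(name: str) -> str:
    """camelCase -> snake_case: insert "_" wherever a lowercase letter or digit meets an uppercase letter."""
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).lower()

print(to_snake_case("userAccountId"))  # user_account_id
```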
Regex Performance Tips
Avoid catastrophic backtracking. Nested quantifiers like (a+)+ can cause exponential processing time when the rest of the pattern fails to match. A pattern such as ^(a+)+$ looks harmless, but on a string like "aaaaaaaaaaaaaaaaab", the engine tries every possible way to partition the "a"s between the inner and outer group before concluding it cannot match. The work roughly doubles with each added "a": on a 30-character string, this can take seconds, and on a 50-character string, it can freeze your browser.
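You can watch the blow-up happen on small inputs. This is a rough timing sketch, assuming Python's backtracking re engine; the nested group is anchored as ^(a+)+$ so that the trailing "b" forces the overall match to fail, and timed_match is a hypothetical helper. Exact timings depend on your machine, so none are shown, and the input sizes are kept small on purpose.

```python
import re
import time

def timed_match(pattern: str, text: str):
    """Return (match_or_None, elapsed_seconds) for a single re.match call."""
    start = time.perf_counter()
    result = re.match(pattern, text)
    return result, time.perf_counter() - start

for n in (10, 14, 18):
    text = "a" * n + "b"  # the trailing "b" guarantees failure
    nested, slow = timed_match(r"^(a+)+$", text)
    flat, fast = timed_match(r"^a+$", text)
    # Both patterns reject the input; only the time taken differs.
    print(f"n={n}: nested={slow:.5f}s flat={fast:.5f}s")
```

The flat pattern ^a+$ accepts and rejects exactly the same strings, which is the usual fix: rewrite the expression so that only one quantifier can claim any given character.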
Be specific. .* is tempting, but it invites backtracking and can match more than you intended. If you know the characters you are looking for, use a character class. [^"]* (any character that is not a quote) is usually faster than .*? when matching the contents of a quoted string, and it can never run past the closing quote.
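The three ways of matching a quoted string can be compared directly. A small sketch with Python's re module (an assumption), using a made-up attribute string.

```python
import re

line = 'name="regex" lang="en"'

# Negated class: can only consume non-quote characters.
with_class = re.findall(r'"([^"]*)"', line)
# Lazy dot: same result here, but the engine checks for the closing quote at every step.
with_lazy = re.findall(r'"(.*?)"', line)
# Greedy dot: runs to the LAST quote on the line.
with_greedy = re.findall(r'"(.*)"', line)

print(with_class)   # ['regex', 'en']
print(with_lazy)    # ['regex', 'en']
print(with_greedy)  # ['regex" lang="en']
```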
Use non-capturing groups when you do not need the captured text. (?:abc) is slightly faster than (abc) because the engine does not need to store the matched text for later reference.
Anchor when possible. If you know the pattern should match at the start of the string, use ^. This lets the engine fail fast instead of trying every position in the string.
Test your patterns against realistic input data using the Regex Tester. It shows not just whether the pattern matches, but which groups are captured and how many matches exist, giving you immediate feedback as you refine the expression.