
Regex lookahead and lookbehind, explained

Zero-width assertions are the regex feature that unlocks patterns you can't write any other way.

Most regex syntax describes characters that a match consumes. When you match abc, the regex engine moves its cursor three characters forward. Lookahead and lookbehind are different: they assert something about the surrounding text without consuming any of it, so the cursor doesn't move. These are called "zero-width assertions," and they make possible a specific category of patterns: password validation, "word X but only when followed by Y," and precise extraction from structured text.

Every example below is tested against the JavaScript regex engine — the one that runs in browsers, Node, and anywhere you use new RegExp(). You can paste each one into DevDecoder's regex tester to watch it work.

The four assertions

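(?=...)    positive lookahead: what follows the cursor must match
(?!...)    negative lookahead: what follows the cursor must not match
(?<=...)   positive lookbehind: what precedes the cursor must match
(?<!...)   negative lookbehind: what precedes the cursor must not match

Lookaheads have been in JavaScript forever. Lookbehinds were added in ES2018 and are supported in all modern browsers.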
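As a quick orientation, here is a minimal sketch that runs one of each assertion type against the same string. The input and the variable names are made up for illustration; the negative patterns also exclude digits in the assertion so the engine can't satisfy them by backing up into the middle of a number.

```javascript
// Illustrative input: a dollar amount and a pixel size.
const input = "$100 200px";

// Positive lookahead: digits that are followed by "px".
const followedByPx = input.match(/\d+(?=px)/)[0]; // "200"

// Negative lookahead: digits not followed by "px" (or by another digit).
const notFollowedByPx = input.match(/\d+(?!px|\d)/)[0]; // "100"

// Positive lookbehind: digits that come right after "$".
const afterDollar = input.match(/(?<=\$)\d+/)[0]; // "100"

// Negative lookbehind: digits not preceded by "$" (or by another digit).
const notAfterDollar = input.match(/(?<![$\d])\d+/)[0]; // "200"

console.log({ followedByPx, notFollowedByPx, afterDollar, notAfterDollar });
```

In all four cases the "$" and "px" text is only inspected, never included in the match.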

The pattern that makes the concept click

Say you want to match a number that's followed by the word "dollars" — but you only want the number in your match, not the word. Without lookahead:

/\d+ dollars/
// matches in "500 dollars", and the matched text is the whole "500 dollars"

With positive lookahead:

/\d+(?= dollars)/
// matches in "500 dollars", but the matched text is just "500"

The (?= dollars) says "check that ' dollars' comes next." The engine looks, confirms, and leaves the cursor where it was. The dollars text is still in the target string; your match just doesn't include it.
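To see the difference concretely, this small sketch runs both patterns on a sample sentence (the sentence is an assumption, not from a real data set) and inspects what each match contains.

```javascript
const text = "It costs 500 dollars";

// Without lookahead: " dollars" is consumed and becomes part of the match.
const plain = text.match(/\d+ dollars/);

// With lookahead: " dollars" is checked but not consumed.
const withLookahead = text.match(/\d+(?= dollars)/);

console.log(plain[0]);         // "500 dollars"
console.log(withLookahead[0]); // "500"

// Both matches start at the same index; only the end differs.
const rest = text.slice(withLookahead.index + withLookahead[0].length);
console.log(rest); // " dollars" is still there, untouched
```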

Practical uses

1. Password policies in a single expression

The classic use of lookahead: requiring multiple independent conditions at the start of a string.

// At least 8 chars, at least one digit, one uppercase, one special
/^(?=.*\d)(?=.*[A-Z])(?=.*[^A-Za-z0-9]).{8,}$/

Each lookahead is a separate requirement. None of them consume characters — they all test the whole string from position zero. Then .{8,}$ is the actual matching part: at least eight characters of anything, to the end.

Write each requirement as its own lookahead. If any one of them fails, the whole match fails, and you never have to reason about the order in which the characters appear.
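A minimal sketch of how that pattern might be wrapped and exercised. The helper name isStrong and the sample passwords are illustrative assumptions; note that under this pattern a space counts as a "special" character, because the class [^A-Za-z0-9] accepts anything that isn't a letter or digit.

```javascript
// The policy pattern from above: 8+ chars, one digit, one uppercase, one special.
const strongPassword = /^(?=.*\d)(?=.*[A-Z])(?=.*[^A-Za-z0-9]).{8,}$/;

// Hypothetical helper name, for illustration only.
function isStrong(password) {
  return strongPassword.test(password);
}

console.log(isStrong("Passw0rd!"));    // true: meets every requirement
console.log(isStrong("password1!"));   // false: no uppercase letter
console.log(isStrong("Password!!"));   // false: no digit
console.log(isStrong("Password123"));  // false: no special character
console.log(isStrong("Sh0rt!"));       // false: fewer than 8 characters
```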

2. Extract a value from "key=value" without the key

// Grab the value after "token="
/(?<=token=)[^&]+/
// Input: "?user=ada&token=xyz123&env=prod"
// Match: "xyz123"

The lookbehind says "the cursor must be positioned immediately after 'token='." Then [^&]+ matches characters until the next ampersand. The key is part of the assertion, not the match.
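Here is a short sketch of the extraction in use. One thing worth knowing: the lookbehind (?<=token=) is also satisfied after a longer key that merely ends in "token=". The stricter variant below, which also requires a "?" or "&" before the key, is a suggested refinement rather than part of the original pattern, and the second query string is made up to show the difference.

```javascript
const query = "?user=ada&token=xyz123&env=prod";

// The pattern from above: match only the value, not the key.
const value = query.match(/(?<=token=)[^&]+/)?.[0];
console.log(value); // "xyz123"

// Illustrative input where another key ends in "token".
const tricky = "?csrftoken=aaa&token=bbb";

// The loose lookbehind is satisfied right after "csrftoken=" too.
const loose = tricky.match(/(?<=token=)[^&]+/)?.[0];
console.log(loose); // "aaa"

// Suggested variant: the key must start right after "?" or "&".
const strict = tricky.match(/(?<=[?&]token=)[^&]+/)?.[0];
console.log(strict); // "bbb"
```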

3. Find a word, but not when it's part of a bigger word

// Match "is" but not "this", "island", "ism"
/\bis\b/  // This is the standard way, using word boundaries
// But if word boundaries don't fit, assertions work too:
/(?<![A-Za-z])is(?![A-Za-z])/

This is how you'd do it in a regex flavor that lacks \b, or when you need to define "word character" differently than \w.
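The two patterns differ as soon as the neighbouring character is one that \w accepts but your own definition of "letter" doesn't. A small sketch with a made-up sample string, where the underscore is the deciding character:

```javascript
const sample = "this is an island; is_ok";

// \b treats "_" as a word character, so "is" in "is_ok" is not a whole word.
const standard = sample.match(/\bis\b/g);

// The assertion version only looks at ASCII letters, so "_" ends the word.
const lettersOnly = sample.match(/(?<![A-Za-z])is(?![A-Za-z])/g);

console.log(standard.length);    // 1: only the standalone "is"
console.log(lettersOnly.length); // 2: the standalone "is" and the one in "is_ok"
```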

4. Replace a character only in some contexts

// Remove commas inside numbers, but not between words
// "1,000 apples, 2,500 oranges" -> "1000 apples, 2500 oranges"
str.replace(/(?<=\d),(?=\d)/g, '')

The comma is only matched when preceded by a digit AND followed by a digit. Other commas (the one after "apples") are untouched.
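A runnable sketch of the replacement, wrapped in a helper whose name (stripDigitCommas) is an assumption for illustration. Because both assertions are zero-width, back-to-back separators such as "1,000,000" are handled in a single pass. The same property means a digit list written without spaces, such as "1,2,3", would be merged too.

```javascript
// Hypothetical helper: remove commas that sit directly between two digits.
function stripDigitCommas(str) {
  return str.replace(/(?<=\d),(?=\d)/g, "");
}

console.log(stripDigitCommas("1,000 apples, 2,500 oranges"));
// "1000 apples, 2500 oranges"

console.log(stripDigitCommas("1,000,000")); // "1000000"

// Commas followed by a space are left alone.
console.log(stripDigitCommas("rows 3, 4 and 5")); // unchanged

// Caveat: a space-free digit list is indistinguishable from a grouped number.
console.log(stripDigitCommas("1,2,3")); // "123"
```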

5. Split on a separator while keeping it

// Split on capital letters but keep them: "helloWorldFoo" -> ["hello", "World", "Foo"]
"helloWorldFoo".split(/(?=[A-Z])/)

The lookahead (?=[A-Z]) matches the empty position right before each capital letter, so split() cuts there. Because the match is zero-width, no characters are removed and the capital letters stay in the output.
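A short sketch of the split on a few illustrative inputs. The last one shows a limitation of this simple pattern: every capital starts a new piece, so runs of capitals such as acronyms are broken into single letters.

```javascript
const splitCamel = (name) => name.split(/(?=[A-Z])/);

console.log(splitCamel("helloWorldFoo")); // ["hello", "World", "Foo"]

// A capital at position 0 does not produce an empty leading element.
console.log(splitCamel("HelloWorld")); // ["Hello", "World"]

// Acronyms are split letter by letter.
console.log(splitCamel("parseHTMLString"));
// ["parse", "H", "T", "M", "L", "String"]
```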

Negative assertions in practice

Negative assertions are the tool for "match X, except when surrounded by Y." A common case:

// Match "cat" but not when it's part of "catalog" or "caterpillar"
/cat(?!alog|erpillar)/

// Match URLs, but not ones that directly follow "(" (e.g. markdown-style links)
/(?<!\()https?:\/\/\S+/

Negative assertions are especially useful when you want to exclude a specific shape without enumerating every valid case.
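The sketch below exercises both ideas on made-up inputs. The URL pattern here relies on the lookbehind alone, since a greedy \S+ runs to the next whitespace and would swallow a trailing parenthesis before any lookahead could see it.

```javascript
// "cat", unless it is the start of "catalog" or "caterpillar".
const cat = /cat(?!alog|erpillar)/;

console.log(cat.test("cat"));         // true
console.log(cat.test("category"));    // true: only the two listed suffixes are excluded
console.log(cat.test("catalog"));     // false
console.log(cat.test("caterpillar")); // false

// URLs that are not immediately preceded by "(".
const bareUrl = /(?<!\()https?:\/\/\S+/g;
const line = "Docs: https://a.example and [link](https://b.example)";

console.log(line.match(bareUrl)); // ["https://a.example"]
```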

Why lookarounds are zero-width

Consider /foo(?=bar)baz/. Could this ever match? No. After the lookahead asserts "bar comes next," the cursor hasn't moved, so baz is tested at the same position where bar has to be, and both can't be true. That is what zero-width means: an assertion checks the text at the cursor without advancing it the way a regular match does.

The right mental model: lookarounds are tests the engine performs, like an if. They produce no output; they only decide whether the match can continue.
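The claim is easy to check directly. This sketch tests the impossible pattern, a satisfiable sibling, and then uses lastIndex on a global regex to show where the cursor ends up after a lookahead succeeds.

```javascript
// The lookahead demands "bar" at the same position where "baz" must match.
const impossible = /foo(?=bar)baz/;
console.log(impossible.test("foobarbaz")); // false
console.log(impossible.test("foobaz"));    // false

// Asserting "bar" and then consuming "bar" is fine.
console.log(/foo(?=bar)bar/.test("foobar")); // true

// After a successful match, the cursor sits right after "foo", not after "bar".
const re = /foo(?=bar)/g;
const match = re.exec("foobar");
console.log(match[0]);     // "foo"
console.log(re.lastIndex); // 3
```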

Performance notes

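Lookarounds themselves are cheap, but they can combine with other regex features to produce expensive patterns. A lookahead that contains an unbounded quantifier, such as (?=.*\d), rescans the rest of the string every time the engine tries it from a new starting position, so anchor such patterns with ^ where you can.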

When you don't need a lookaround

A common beginner mistake is to reach for lookarounds too quickly. Plenty of cases can be expressed with simpler syntax, such as a capture group or a plain string method.

Reach for lookarounds when (a) you genuinely need the assertion-style behavior, or (b) the alternative is significantly harder to read.
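As one illustration of a simpler alternative, the "number before dollars" case from earlier can be written with an ordinary capture group. The sample sentence is an assumption. Which version reads better is a judgment call, though the capture version has the side benefit of working in engines that have no lookarounds at all.

```javascript
const text = "It costs 500 dollars";

// Lookahead version: the match itself is the number.
const viaLookahead = text.match(/\d+(?= dollars)/)[0];

// Capture version: the match is "500 dollars", group 1 is the number.
const viaCapture = text.match(/(\d+) dollars/)[1];

console.log(viaLookahead, viaCapture); // both are "500"
```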

Browser and language support

Feature                          JavaScript   Python (re)                        Go (RE2)
Positive / negative lookahead    Always       Always                             No
Fixed-length lookbehind          ES2018+      Always                             No
Variable-length lookbehind       ES2018+      No (third-party regex module only)  No

If you're targeting Go's regexp package (which uses RE2), lookarounds aren't available — RE2 trades those features for linear-time guarantees. In that ecosystem you'll usually restructure the pattern using captures.
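To give a feel for that restructuring, here is a sketch of two of the earlier examples rewritten without any lookarounds. It is written in JavaScript to stay consistent with the rest of the article; the function names are hypothetical, and the patterns are meant to use only constructs (character classes, anchors, groups) that RE2 also accepts. The password check is an approximation: it counts length with .length rather than .{8,}, so it behaves differently for strings that contain line breaks.

```javascript
// Password policy as independent checks instead of stacked lookaheads.
function isStrongNoLookahead(password) {
  return (
    password.length >= 8 &&
    /\d/.test(password) &&          // at least one digit
    /[A-Z]/.test(password) &&       // at least one uppercase letter
    /[^A-Za-z0-9]/.test(password)   // at least one special character
  );
}

// Token extraction with a capture group instead of a lookbehind.
function tokenNoLookbehind(query) {
  const match = /(?:^|[?&])token=([^&]+)/.exec(query);
  return match ? match[1] : null;
}

console.log(isStrongNoLookahead("Passw0rd!"));  // true
console.log(isStrongNoLookahead("password1!")); // false
console.log(tokenNoLookbehind("?user=ada&token=xyz123&env=prod")); // "xyz123"
console.log(tokenNoLookbehind("?user=ada"));    // null
```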

A debugging tip

When a regex with lookarounds isn't behaving, strip out the assertions, confirm the bare pattern matches your test string, then add them back one at a time. The position where the match should start is often not where you think, and the assertion that makes the match evaporate when you restore it is the one that's wrong.
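That incremental process can be scripted. The helper below is a hypothetical sketch, not a DevDecoder feature: it assumes the assertions all sit at the start of the pattern (as in the password example) and rebuilds the regex with zero, one, two, and so on of them, recording whether each version matches.

```javascript
// Hypothetical helper: rebuild a pattern one assertion at a time and record
// whether each intermediate version still matches the input.
function traceAssertions(prefix, assertions, body, input) {
  const steps = [];
  for (let count = 0; count <= assertions.length; count++) {
    const source = prefix + assertions.slice(0, count).join("") + body;
    steps.push({ source, matched: new RegExp(source).test(input) });
  }
  return steps;
}

const steps = traceAssertions(
  "^",
  ["(?=.*\\d)", "(?=.*[A-Z])", "(?=.*[^A-Za-z0-9])"],
  ".{8,}$",
  "password1!" // sample input with no uppercase letter
);

// The first step that stops matching points at the assertion just added.
const firstFailure = steps.find((step) => !step.matched);
console.log(firstFailure.source); // ^(?=.*\d)(?=.*[A-Z]).{8,}$
```

Because the steps are cumulative, every step after the first failure also reports no match, so only the first failing step is informative.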

DevDecoder's regex tester shows both the matched regions and the exact character positions in the input, which makes this kind of incremental debugging straightforward.