Introduction

Backreferences match the same text that a capturing group matched earlier in the pattern. Suppose you want to match a pair of opening and closing HTML tags and the text in between. By capturing the name of the opening tag in a group, we can reuse it through a backreference to match the closing tag. Capturing groups are numbered by counting their opening parentheses from left to right: the first opening parenthesis starts group number one, the second starts group number two, and so on.

A backreference in a regular expression identifies a previously matched group and looks for exactly the same text again.
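As a rough illustration of the HTML tag idea, here is a minimal Python sketch. The pattern, the helper name find_tag_pairs, and the sample string are assumptions made for this example, not part of the original text, and a regex like this is not a substitute for a real HTML parser.

```python
import re

# Group 1 captures the tag name; \1 requires the closing tag to repeat it.
# Group 2 captures the text between the tags.
TAG_PAIR = re.compile(r"<([A-Za-z][A-Za-z0-9]*)\b[^>]*>(.*?)</\1>")

def find_tag_pairs(html):
    """Return (tag name, inner text) for each matched pair of tags."""
    return [(m.group(1), m.group(2)) for m in TAG_PAIR.finditer(html)]

# <i>...</b> is skipped because the closing name differs from the opening one.
sample = "<b>bold</b> and <i>mismatched</b>"
print(find_tag_pairs(sample))  # [('b', 'bold')]

# Groups are numbered by their opening parentheses, left to right.
numbered = re.fullmatch(r"((a)(b))", "ab").groups()
print(numbered)  # ('ab', 'a', 'b')
```

The outer group in the last pattern opens first, so it becomes group one even though it closes last.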

 

Example

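To make sure that the closing quote is exactly the same as the opening one, we can wrap the opening quote in a capturing group and backreference it. The regex engine finds the first quote with (['"]) and remembers which character it matched; \1 then matches only that same character. In the example below, the apostrophe in She's therefore does not end the match.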

/(['"])(.*?)\1/g = "She's the one!"
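The same pattern can be tried in Python, whose re module also uses \1 for the first group. This is a small sketch; the variable names and the naive pattern used for contrast are assumptions added for illustration.

```python
import re

# With a backreference: the closing quote must equal the opening one.
QUOTED = re.compile(r"""(['"])(.*?)\1""")
# Without a backreference: any quote character may close the match.
NAIVE = re.compile(r"""['"](.*?)['"]""")

text = "\"She's the one!\""

m = QUOTED.search(text)
print(m.group(0))  # "She's the one!"
print(m.group(1))  # "
print(m.group(2))  # She's the one!

# The naive pattern stops at the apostrophe inside the string.
print(NAIVE.search(text).group(0))  # "She'
```

The contrast shows what the backreference buys: without it, the apostrophe in She's is accepted as a closing quote.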

 

Repetition

The regex engine does not permanently substitute backreferences in the regular expression. Each time a backreference is needed, the engine uses the most recent match saved by its group. If the capturing parentheses match again, the previously saved match is overwritten. Consider the regexes ([abc]+) and ([abc])+. Both match cab, but the first stores cab in the first backreference, while the second stores only b. That is because in the second regex the plus is outside the parentheses, so the group itself is repeated three times. The first time it stores c, the second time a, and the third time b. Each time the previous value is overwritten, so b remains.
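The difference between the two patterns can be checked with a short Python sketch. The variable names and the extra pattern ([abc])+\1 are assumptions added here to show that a backreference sees only the last stored value.

```python
import re

# The plus is inside the group: one capture holding the whole run.
whole = re.fullmatch(r"([abc]+)", "cab")
# The plus is outside the group: the capture is overwritten on each repeat.
last = re.fullmatch(r"([abc])+", "cab")

print(whole.group(1))  # cab
print(last.group(1))   # b

# A backreference after a repeated group uses the last value stored.
# "cabb": the group ends on b, and \1 matches the final b.
repeated_ok = re.fullmatch(r"([abc])+\1", "cabb")
# "cabc": no way to split the string so that the last capture repeats.
repeated_fail = re.fullmatch(r"([abc])+\1", "cabc")

print(repeated_ok is not None)  # True
print(repeated_fail is None)    # True
```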