Introduction
Quantifiers specify how many times a token should be matched by the regex engine. The common quantifiers used in regular expressions are:
- ? : Match the token zero times or exactly once
- * : Match the token zero or more times
- + : Match the token one or more times
- {m,n} : Match the token between m and n times (both inclusive), where m and n are non-negative integers and n ≥ m.
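As a minimal sketch of the four quantifiers listed above, the snippet below uses Python's standard re module. The helper name matches and the sample words are illustrative choices, not part of the original text.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Return True if the whole of text matches pattern (hypothetical helper)."""
    return re.fullmatch(pattern, text) is not None

# ? : zero times or exactly once
print(matches(r"colou?r", "color"))     # True
print(matches(r"colou?r", "colouur"))   # False

# * : zero or more times
print(matches(r"ab*c", "ac"))           # True
print(matches(r"ab*c", "abbbc"))        # True

# + : one or more times
print(matches(r"ab+c", "ac"))           # False
print(matches(r"ab+c", "abc"))          # True

# {m,n} : between m and n times, inclusive
print(matches(r"ab{2,3}c", "abbc"))     # True
print(matches(r"ab{2,3}c", "abbbbc"))   # False
```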
Greedy Quantifier
A greedy quantifier always attempts to repeat the sub-pattern as many times as possible before exploring shorter matches by backtracking. Generally, a greedy pattern will match the longest possible string. By default, all quantifiers are greedy.
Lazy Quantifier
A lazy quantifier always attempts to repeat the sub-pattern as few times as possible before exploring longer matches by expansion. Generally, a lazy pattern will match the shortest possible string. To make a quantifier lazy, append ? to it:
- ??
- *?
- +?
- {m,n}?
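To see the difference on a small input, the sketch below applies each greedy quantifier and its lazy counterpart to the string aaaa with Python's re module. The helper name first_match and the sample string are assumptions made for illustration.

```python
import re

def first_match(pattern: str, text: str) -> str:
    """Return the text matched at the start of the string (hypothetical helper)."""
    m = re.match(pattern, text)
    return m.group(0) if m else ""

text = "aaaa"

print(repr(first_match(r"a?", text)))       # 'a'
print(repr(first_match(r"a??", text)))      # ''
print(repr(first_match(r"a*", text)))       # 'aaaa'
print(repr(first_match(r"a*?", text)))      # ''
print(repr(first_match(r"a+", text)))       # 'aaaa'
print(repr(first_match(r"a+?", text)))      # 'a'
print(repr(first_match(r"a{2,3}", text)))   # 'aaa'
print(repr(first_match(r"a{2,3}?", text)))  # 'aa'
```

With nothing after the quantifier in the pattern, a lazy quantifier stops at its minimum repetition count, which is why a*? and a?? match the empty string here.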
Example
Consider the string below:
aaaaaAlazyZgreeedyAlaaazyZaaaaa
The greedy pattern A.*Z yields one match, AlazyZgreeedyAlaaazyZ, while the lazy pattern A.*?Z yields two matches: AlazyZ and AlaaazyZ.
In the greedy search, once the engine has matched the first A, the .*, being greedy, tries to match as many characters as possible.
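The match counts above can be checked with a short script. This sketch uses Python's re module; the variable names are illustrative only.

```python
import re

text = "aaaaaAlazyZgreeedyAlaaazyZaaaaa"

# Greedy: .* grabs as much as it can, so the match runs to the last Z.
greedy = re.findall(r"A.*Z", text)
print(greedy)        # ['AlazyZgreeedyAlaaazyZ']

# Lazy: .*? stops at the first Z it can reach, so two separate matches are found.
lazy = re.findall(r"A.*?Z", text)
print(lazy)          # ['AlazyZ', 'AlaaazyZ']

# The (start, end) offsets show where each match sits in the input.
greedy_spans = [m.span() for m in re.finditer(r"A.*Z", text)]
lazy_spans = [m.span() for m in re.finditer(r"A.*?Z", text)]
print(greedy_spans)  # [(5, 26)]
print(lazy_spans)    # [(5, 11), (18, 26)]
```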
// 1
aaaaaAlazyZgreeedyAlaaazyZaaaaa
     \________________________/
      A.* matched

// 2
aaaaaAlazyZgreeedyAlaaazyZaaaaa
     \_______________________/
      A.* matched, Z can't match
After step 1 the rest of the string has been consumed, but the Z is still left to match, so the engine backtracks and .* must then match one character fewer (2). Z still cannot match, so this happens a few more times until the engine reaches the state shown below. Now Z can match, hence the overall pattern matches as mentioned earlier.
aaaaaAlazyZgreeedyAlaaazyZaaaaa
     \__________________/
      A.* matched, Z can now match
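The backtracking steps described above can be imitated by hand. The function below is only a simplified sketch, not how a real regex engine is implemented: it hard-codes the pattern A.*Z, tries only the first A, and assumes the input has no newlines. The name greedy_a_dotstar_z is hypothetical.

```python
def greedy_a_dotstar_z(text: str):
    """Imitate greedy matching of A.*Z from the first A.

    Returns (matched_text_or_None, number_of_backtracking_steps).
    """
    start = text.find("A")
    if start == -1:
        return None, 0

    # dot_end is the index just past what .* has consumed.
    # Being greedy, .* first consumes everything up to the end of the input.
    dot_end = len(text)
    backtracks = 0

    # .* may shrink down to zero characters, i.e. dot_end == start + 1.
    while dot_end > start:
        # Try to match Z right after what .* consumed.
        if dot_end < len(text) and text[dot_end] == "Z":
            return text[start:dot_end + 1], backtracks
        # Z cannot match here: give back one character and retry.
        dot_end -= 1
        backtracks += 1

    return None, backtracks


match, steps = greedy_a_dotstar_z("aaaaaAlazyZgreeedyAlaaazyZaaaaa")
print(match)  # AlazyZgreeedyAlaaazyZ
print(steps)  # 6
```

In this sketch .* gives back six characters (the five trailing a characters and then the final Z) before Z can match.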
The lazy repetition in A.*?Z, by contrast, first matches as few characters as possible and then takes more only as necessary. This explains why it finds two matches in the input. Below is a visual representation of the two matches:
aaaaaAlazyZgreeedyAlaaazyZaaaaa
     \____/       \______/
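The lazy expansion can be imitated in the same spirit. Again, this is a simplified sketch under stated assumptions (pattern fixed to A.*?Z, no newlines in the input), and lazy_a_dotstar_z is a hypothetical name. The result is compared with Python's re.findall as a sanity check.

```python
import re

def lazy_a_dotstar_z(text: str):
    """Imitate lazy matching of A.*?Z, collecting all non-overlapping matches."""
    matches = []
    pos = 0
    while True:
        start = text.find("A", pos)
        if start == -1:
            break
        # Being lazy, .*? first consumes nothing, so Z is tried right after A.
        dot_end = start + 1
        # Expand .*? one character at a time until Z can match.
        while dot_end < len(text) and text[dot_end] != "Z":
            dot_end += 1
        if dot_end == len(text):
            # No Z follows this A, so no later A can match either.
            break
        matches.append(text[start:dot_end + 1])
        # Resume the search right after the match just found.
        pos = dot_end + 1
    return matches


text = "aaaaaAlazyZgreeedyAlaaazyZaaaaa"
print(lazy_a_dotstar_z(text))                              # ['AlazyZ', 'AlaaazyZ']
print(lazy_a_dotstar_z(text) == re.findall(r"A.*?Z", text))  # True
```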