
Named Capturing Groups and Backreferences

Nearly all modern regular expression engines support numbered capturing groups and numbered backreferences. Long regular expressions with lots of groups and backreferences can be hard to read. They can be particularly difficult to maintain, because adding or removing a capturing group in the middle of the regex changes the numbers of all the groups that follow it.

Python’s re module was the first to offer a solution: named capturing groups and named backreferences. (?P<name>group) captures the match of group into the backreference “name”. name must be an alphanumeric sequence starting with a letter. You can reference the contents of the group with the named backreference (?P=name). The question mark, P, angle brackets, and equals signs are all part of the syntax. Though the syntax for the named backreference uses parentheses, it’s just a backreference that doesn’t do any capturing or grouping. The HTML tags example can be written as <(?P<tag>[A-Z][A-Z0-9]*)\b[^>]*>.*?</(?P=tag)>.

.NET also supports named capture. Microsoft’s developers invented their own syntax, rather than follow the one pioneered by Python and copied by PCRE (the only two regex engines that supported named capture at that time). (?<name>group) or (?'name'group) captures the match of group into the backreference “name”. The named backreference is \k<name> or \k'name'. Compared with Python, there is no P in the syntax for named groups. The syntax for named backreferences is more similar to that of numbered backreferences than to what Python uses. You can use single quotes or angle brackets around the name. This makes absolutely no difference in the regex. The syntax using angle brackets is preferable in programming languages that use single quotes to delimit strings, while the syntax using single quotes is preferable when adding your regex to an XML file, as this minimizes the amount of escaping you have to do to format your regex as a literal string or as XML content.

Because Python and .NET introduced their own syntax, we refer to these two variants as the “Python syntax” and the “.NET syntax” for named capture and named backreferences. Today, many other regex flavors have copied this syntax.

Perl 5.10 added support for both the Python and .NET syntax for named capture and backreferences. It also adds two more syntactic variants for named backreferences: \k{name} and \g{name}. In the replacement text, you can interpolate the variable $+{name} to insert the text matched by a named capturing group.

PCRE 7.2 and later support all the syntax for named capture and backreferences that Perl 5.10 supports. Old versions of PCRE supported the Python syntax, even though that was not “Perl-compatible” at the time. Languages like PHP, Delphi, and R that implement their regex support using PCRE also support all this syntax.
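As an illustration of the Python syntax described above, here is a minimal sketch using Python's built-in re module. The names TAG_PAIR, first_tag_pair, DATE, and DATE_WITH_LABEL, as well as the sample strings, are hypothetical and chosen only for this example. The sketch shows a named group with a named backreference, and then shows that a group's name stays valid when another group is added in front of it, while its number changes.

```python
import re

# Python syntax: (?P<tag>...) names the group, (?P=tag) refers back to it.
# The backreference only matches the text the group actually captured.
TAG_PAIR = re.compile(
    r"<(?P<tag>[A-Z][A-Z0-9]*)\b[^>]*>.*?</(?P=tag)>",
    re.IGNORECASE,
)


def first_tag_pair(text):
    """Return (tag name, full match) for the first tag pair, or None."""
    m = TAG_PAIR.search(text)
    if m is None:
        return None
    return m.group("tag"), m.group(0)


# Two versions of the same pattern; the second adds a group at the front.
DATE = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})")
DATE_WITH_LABEL = re.compile(r"(?P<label>\w+): (?P<year>\d{4})-(?P<month>\d{2})")

before = DATE.search("2024-05")
after = DATE_WITH_LABEL.search("due: 2024-05")

# Group number 1 now refers to a different group...
print(before.group(1), after.group(1))            # 2024 due
# ...but the name "year" still refers to the same one.
print(before.group("year"), after.group("year"))  # 2024 2024

print(first_tag_pair('<b class="x">bold</b> <i>it</i>'))  # ('b', '<b class="x">bold</b>')
print(first_tag_pair("<b>text</i>"))                      # None
```

Note that Python's re module accepts only the Python syntax; the .NET-style (?<name>group) and \k<name> forms are not part of it, so this sketch does not cover them.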
