Table of contents
- XRegExp
- XRegExp.addToken
- XRegExp.build
- XRegExp.cache
- XRegExp.escape
- XRegExp.exec
- XRegExp.forEach
- XRegExp.globalize
- XRegExp.install
- XRegExp.isInstalled
- XRegExp.isRegExp
- XRegExp.match
- XRegExp.matchChain
- XRegExp.matchRecursive
- XRegExp.replace
- XRegExp.replaceEach
- XRegExp.split
- XRegExp.tag
- XRegExp.test
- XRegExp.uninstall
- XRegExp.union
- XRegExp.version
XRegExp instance properties
API
XRegExp(pattern, [flags])
Creates an extended regular expression object for matching text with a pattern. Differs from a native regular expression in that additional syntax and flags are supported. The returned object is in fact a native RegExp and works with all native methods.
Returns: {RegExp} Extended regular expression object.
Example

    // With named capture and flag x
    XRegExp(`(?<year>  [0-9]{4} ) [-\\s]?  # year
             (?<month> [0-9]{2} ) [-\\s]?  # month
             (?<day>   [0-9]{2} )          # day`, 'x');

    // Providing a regex object copies it. Native regexes are recompiled using native (not
    // XRegExp) syntax. Copies maintain extended data, are augmented with `XRegExp.prototype`
    // properties, and have fresh `lastIndex` properties (set to zero).
    XRegExp(/regex/);
For details about the regular expression just shown, see Syntax: Named capture and Flags: Free-spacing.
Regexes, strings, and backslashes
JavaScript string literals (as opposed to, e.g., user input or text extracted from the DOM) use a backslash as an escape character. The string literal '\\' therefore contains a single backslash, and its length property's value is 1. However, a backslash is also an escape character in regular expression syntax, where the pattern \\ matches a single backslash. When providing string literals to the RegExp or XRegExp constructor functions, four backslashes are therefore needed to match a single backslash, e.g., XRegExp('\\\\'). Only two of those backslashes are actually passed into the constructor function. The other two are used to escape the backslashes in the string before the function ever sees the string. The exception is when using ES6 raw strings via String.raw or XRegExp.tag.
The same issue is at play with the \\s sequences in the example code just shown. XRegExp is provided with the two characters \s, which it in turn recognizes as the metasequence used to match a whitespace character.
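The escaping layers described above can be checked with the native RegExp constructor, which sees string literals the same way. This is a minimal sketch; the variable names are illustrative only.

```javascript
// A string literal written with four backslashes holds two backslash characters.
const fromLiteral = '\\\\';
// String.raw keeps backslashes as typed, so two are enough.
const fromRaw = String.raw`\\`;

// Both produce the pattern \\ which matches one literal backslash.
const literalRegex = new RegExp(fromLiteral);
const rawRegex = new RegExp(fromRaw);

const subject = 'C:\\temp'; // the seven characters C:\temp

console.log(fromLiteral.length);          // 2
console.log(fromLiteral === fromRaw);     // true
console.log(literalRegex.test(subject));  // true
console.log(rawRegex.test(subject));      // true
console.log(new RegExp('\\s').test(' ')); // true: the constructor receives \s
```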
XRegExp.addToken(regex, handler, [options])
Extends XRegExp syntax and allows custom flags. This is used internally and can be used to create XRegExp addons. If more than one token can match the same string, the last added wins.
Returns: {undefined} Does not return a value.
Example

    // Basic usage: Add \a for the ALERT control code
    XRegExp.addToken(
      /\\a/,
      () => '\\x07',
      {scope: 'all'}
    );
    XRegExp('\\a[\\a-\\n]+').test('\x07\n\x07'); // -> true
XRegExp.build(pattern, subs, [flags])
Requires the XRegExp.build addon, which is bundled in xregexp-all.js.
Builds regexes using named subpatterns, for readability and pattern reuse. Backreferences in the outer pattern and provided subpatterns are automatically renumbered to work correctly. Native flags used by provided subpatterns are ignored in favor of the flags argument.
Example

    const time = XRegExp.build('(?x)^ {{hours}} ({{minutes}}) $', {
      hours: XRegExp.build('{{h12}} : | {{h24}}', {
        h12: /1[0-2]|0?[1-9]/,
        h24: /2[0-3]|[01][0-9]/
      }, 'x'),
      minutes: /^[0-5][0-9]$/
    });
    time.test('10:59'); // -> true
    XRegExp.exec('10:59', time).groups.minutes; // -> '59'
See also: Creating Grammatical Regexes Using XRegExp.build.
XRegExp.cache(pattern, [flags])
Caches and returns the result of calling XRegExp(pattern, flags). On any subsequent call with the same pattern and flag combination, the cached copy of the regex is returned.
Returns: {RegExp} Cached XRegExp object.
Example

    let match;
    while (match = XRegExp.cache('.', 'gs').exec('abc')) {
      // The regex is compiled once only
    }

    const regex1 = XRegExp.cache('.', 's');
    const regex2 = XRegExp.cache('.', 's');
    // regex1 and regex2 are references to the same regex object
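The caching behavior can be pictured as plain memoization. The following is a minimal sketch built on native RegExp, not XRegExp's own code; `cachedRegex` and `regexCache` are hypothetical names.

```javascript
// Hypothetical helper: memoizes native RegExp objects by pattern and flags.
const regexCache = new Map();

function cachedRegex(pattern, flags = '') {
  // A slash cannot appear in a flags string, so it is a safe separator.
  const key = `${flags}/${pattern}`;
  if (!regexCache.has(key)) {
    regexCache.set(key, new RegExp(pattern, flags));
  }
  return regexCache.get(key);
}

const first = cachedRegex('.', 's');
const second = cachedRegex('.', 's');
console.log(first === second); // true

// Shared objects also share lastIndex, so a global regex carries state between calls.
const shared = cachedRegex('\\d', 'g');
shared.test('1a');
console.log(cachedRegex('\\d', 'g').lastIndex); // 1
```

The shared lastIndex is what makes the while loop in the example above terminate: every iteration receives the same object, so the search position advances.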
XRegExp.escape(str)
Escapes any regular expression metacharacters, for use when matching literal strings. The result can safely be used at any position within a regex that uses any flags.
The escaped characters are [, ], {, }, (, ), -, *, +, ?, ., \, ^, $, |, ,, #, and whitespace (see free-spacing for the list of whitespace characters).
Returns: {String} String with regex metacharacters escaped.
Example

    XRegExp.escape('Escaped? <.>');
    // -> 'Escaped\?\u0020<\.>'
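To make the escaping rule concrete, here is a minimal sketch that reproduces the example output with native string methods. `escapeForRegex` is a hypothetical helper, not XRegExp's source, and the split between backslash escapes and \u escapes is inferred from the example output above. It also assumes that JavaScript's \s class is an acceptable stand-in for the whitespace list.

```javascript
// Hypothetical helper: approximates the escaping described above.
function escapeForRegex(str) {
  return String(str)
    // Backslash-escape the core syntax characters.
    .replace(/[\\\[\]{}()*+?.^$|]/g, '\\$&')
    // Write whitespace, '#', '-' and ',' as \uXXXX escapes.
    .replace(/[\s#\-,]/g, (ch) =>
      '\\u' + ch.charCodeAt(0).toString(16).padStart(4, '0'));
}

const escaped = escapeForRegex('Escaped? <.>');
console.log(escaped); // Escaped\?\u0020<\.>

// The escaped text matches only the literal input.
const literal = new RegExp('^' + escapeForRegex('1+1 = 2 (a-b)') + '$');
console.log(literal.test('1+1 = 2 (a-b)')); // true
console.log(new RegExp(escapeForRegex('a.c')).test('abc')); // false
```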
XRegExp.exec(str, regex, [pos], [sticky])
Executes a regex search in a specified string. Returns a match array or null. If the provided regex uses named capture, named capture properties are included on the match array's groups property. Optional pos and sticky arguments specify the search start position, and whether the match must start at the specified position only. The lastIndex property of the provided regex is not used, but is updated for compatibility. Also fixes browser bugs compared to the native RegExp.prototype.exec and can be used reliably cross-browser.
Returns: {Array} Match array with named capture properties on the groups object, or null. If the namespacing feature is off, named capture properties are directly on the match array.
Example

    // Basic use, with named backreference
    let match = XRegExp.exec('U+2620', XRegExp('U\\+(?<hex>[0-9A-F]{4})'));
    match.groups.hex; // -> '2620'

    // With pos and sticky, in a loop
    let pos = 3, result = [];
    while (match = XRegExp.exec('<1><2><3><4>5<6>', /<(\d)>/, pos, 'sticky')) {
      result.push(match[1]);
      pos = match.index + match[0].length;
    }
    // result -> ['2', '3', '4']
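The pos and sticky arguments correspond to what native regexes express through lastIndex and flag y. A minimal native sketch of that idea follows; `execAt` is a hypothetical helper and does not reflect how XRegExp implements it.

```javascript
// Hypothetical helper: search from `pos`, optionally anchored at `pos`.
function execAt(str, regex, pos = 0, sticky = false) {
  // Work on a copy so the caller's regex and its lastIndex stay untouched.
  const flags = regex.flags.replace(/[gy]/g, '') + (sticky ? 'y' : 'g');
  const copy = new RegExp(regex.source, flags);
  copy.lastIndex = pos;
  return copy.exec(str);
}

const subject = '<1><2><3><4>5<6>';
const digits = [];
let pos = 3;
let match;
while ((match = execAt(subject, /<(\d)>/, pos, true))) {
  digits.push(match[1]);
  pos = match.index + match[0].length;
}
console.log(digits); // ['2', '3', '4']

// Without sticky, the search scans forward from pos.
console.log(execAt(subject, /5/, 3).index); // 12
// With sticky, the match must begin exactly at pos.
console.log(execAt(subject, /<(\d)>/, 2, true)); // null
```

The loop stops at index 12 because the character there is 5, not an opening angle bracket, and a sticky search does not skip ahead.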
XRegExp.forEach(str, regex, callback)
Executes a provided function once per regex match. Searches always start at the beginning of the string and continue until the end, regardless of the state of the regex's global property and initial lastIndex.
Returns: {undefined} Does not return a value.
Example

    // Extracts every other digit from a string
    const evens = [];
    XRegExp.forEach('1a2345', /\d/, function (match, i) {
      if (i % 2) evens.push(+match[0]);
    });
    // evens -> [2, 4]
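Iterating over every match regardless of flag g and lastIndex can be sketched natively by searching with a fresh global copy. `forEachMatch` is a hypothetical helper shown only to illustrate the behavior described above.

```javascript
// Hypothetical helper: calls `callback(match, index)` once per match.
function forEachMatch(str, regex, callback) {
  const flags = regex.flags.includes('g') ? regex.flags : regex.flags + 'g';
  // A fresh copy always starts with lastIndex 0.
  const copy = new RegExp(regex.source, flags);
  let index = 0;
  for (const match of str.matchAll(copy)) {
    callback(match, index++);
  }
}

const evens = [];
const digit = /\d/;
digit.lastIndex = 4; // ignored, because the helper searches with a copy
forEachMatch('1a2345', digit, (match, i) => {
  if (i % 2) evens.push(Number(match[0]));
});
console.log(evens); // [2, 4]
```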
XRegExp.globalize(regex)
Copies a regex object and adds flag g. The copy maintains extended data, is augmented with XRegExp.prototype properties, and has a fresh lastIndex property (set to zero). Native regexes are not recompiled using XRegExp syntax.
Returns: {RegExp} Copy of the provided regex with flag g added.
Example

    const globalCopy = XRegExp.globalize(/regex/);
    globalCopy.global; // -> true

    function parse(str, regex) {
      regex = XRegExp.globalize(regex);
      let match;
      while (match = regex.exec(str)) {
        // ...
      }
    }
XRegExp.install(options)
Installs optional features according to the specified options. Can be undone using XRegExp.uninstall.
Returns: {undefined} Does not return a value.
Example

    // With an options object
    XRegExp.install({
      // Enables support for astral code points in Unicode addons (implicitly sets flag A)
      astral: true,

      // Adds named capture groups to the `groups` property of matches
      // On by default in XRegExp 5
      namespacing: true
    });

    // With an options string
    XRegExp.install('astral namespacing');
XRegExp.isInstalled(feature)
Checks whether an individual optional feature is installed.
Returns: {Boolean} Whether the feature is installed.
Example

    XRegExp.isInstalled('astral');
XRegExp.isRegExp(value)
Returns true if an object is a regex; false if it isn't. This works correctly for regexes created in another frame, when instanceof and constructor checks would fail.
Returns: {Boolean} Whether the object is a RegExp object.
Example

    XRegExp.isRegExp('string'); // -> false
    XRegExp.isRegExp(/regex/i); // -> true
    XRegExp.isRegExp(RegExp('^', 'm')); // -> true
    XRegExp.isRegExp(XRegExp('(?s).')); // -> true
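A common way to get a frame-independent check is to inspect the object's internal class tag rather than its constructor. The sketch below shows that technique; whether XRegExp uses exactly this test is an assumption, and `isRegExpLike` is a hypothetical name.

```javascript
// Hypothetical helper: does not rely on `instanceof` or `constructor`.
function isRegExpLike(value) {
  return Object.prototype.toString.call(value) === '[object RegExp]';
}

console.log(isRegExpLike(/regex/i));                  // true
console.log(isRegExpLike(new RegExp('^', 'm')));      // true
console.log(isRegExpLike('string'));                  // false
console.log(isRegExpLike({source: 'a', flags: 'g'})); // false
console.log(isRegExpLike(null));                      // false
```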
XRegExp.match(str, regex, [scope])
Returns the first matched string, or in global mode, an array containing all matched strings. This is essentially a more convenient re-implementation of String.prototype.match that gives the result types you actually want (string instead of exec-style array in match-first mode, and an empty array instead of null when no matches are found in match-all mode). It also lets you override flag g and ignore lastIndex, and fixes browser bugs.
Returns: {String|Array} In match-first mode: First match as a string, or null. In match-all mode: Array of all matched strings, or an empty array.
Example

    // Match first
    XRegExp.match('abc', /\w/); // -> 'a'
    XRegExp.match('abc', /\w/g, 'one'); // -> 'a'
    XRegExp.match('abc', /x/g, 'one'); // -> null

    // Match all
    XRegExp.match('abc', /\w/g); // -> ['a', 'b', 'c']
    XRegExp.match('abc', /\w/, 'all'); // -> ['a', 'b', 'c']
    XRegExp.match('abc', /x/, 'all'); // -> []
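The result types described above can be reproduced on top of String.prototype.match. This is a minimal sketch with a hypothetical `matchScoped` helper; it only illustrates the scope rules and does not include XRegExp's browser fixes.

```javascript
// Hypothetical helper: `scope` of 'one' or 'all' overrides flag g.
function matchScoped(str, regex, scope) {
  const all = scope ? scope === 'all' : regex.global;
  const base = regex.flags.replace('g', '');
  // A copy ignores the original lastIndex and applies the chosen scope.
  const copy = new RegExp(regex.source, all ? base + 'g' : base);
  const found = str.match(copy);
  if (all) return found || [];
  return found ? found[0] : null;
}

console.log(matchScoped('abc', /\w/));         // 'a'
console.log(matchScoped('abc', /\w/g, 'one')); // 'a'
console.log(matchScoped('abc', /x/g, 'one'));  // null
console.log(matchScoped('abc', /\w/g));        // ['a', 'b', 'c']
console.log(matchScoped('abc', /\w/, 'all'));  // ['a', 'b', 'c']
console.log(matchScoped('abc', /x/, 'all'));   // []
```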
XRegExp.matchChain(str, chain)
Retrieves the matches from searching a string using a chain of regexes that successively search within previous matches. The provided chain array can contain regexes and/or objects with regex and backref properties. When a backreference is specified, the named or numbered backreference is passed forward to the next regex or returned.
Returns: {Array} Matches by the last regex in the chain, or an empty array.
Example

    // Basic usage; matches numbers within <b> tags
    XRegExp.matchChain('1 <b>2</b> 3 <b>4 a 56</b>', [
      XRegExp('(?is)<b>.*?</b>'),
      /\d+/
    ]);
    // -> ['2', '4', '56']

    // Passing forward and returning specific backreferences
    const html = `<a href="https://xregexp.com/api/">XRegExp</a>
                  <a href="https://www.google.com/">Google</a>`;
    XRegExp.matchChain(html, [
      {regex: /<a href="([^"]+)">/i, backref: 1},
      {regex: XRegExp('(?i)^https?://(?<domain>[^/?#]+)'), backref: 'domain'}
    ]);
    // -> ['xregexp.com', 'www.google.com']
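The chaining idea, where each regex searches only inside the previous level's matches, can be sketched with native regexes. `chainMatches` is a hypothetical helper written under that reading of the description; it is not XRegExp's implementation.

```javascript
// Hypothetical helper: each chain item searches within the previous matches.
function chainMatches(str, chain) {
  let inputs = [str];
  for (const item of chain) {
    const regex = item instanceof RegExp ? item : item.regex;
    const backref = item instanceof RegExp ? 0 : item.backref;
    const flags = regex.flags.includes('g') ? regex.flags : regex.flags + 'g';
    const next = [];
    for (const input of inputs) {
      for (const m of input.matchAll(new RegExp(regex.source, flags))) {
        // Numbered backreferences index the match; named ones use `groups`.
        const value = typeof backref === 'number'
          ? m[backref]
          : (m.groups || {})[backref];
        next.push(value === undefined ? '' : value);
      }
    }
    inputs = next;
  }
  return inputs;
}

const numbers = chainMatches('1 <b>2</b> 3 <b>4 a 56</b>', [
  /<b>.*?<\/b>/is,
  /\d+/
]);
console.log(numbers); // ['2', '4', '56']

const html = '<a href="https://xregexp.com/api/">XRegExp</a> ' +
  '<a href="https://www.google.com/">Google</a>';
const domains = chainMatches(html, [
  {regex: /<a href="([^"]+)">/i, backref: 1},
  {regex: /^https?:\/\/(?<domain>[^\/?#]+)/i, backref: 'domain'}
]);
console.log(domains); // ['xregexp.com', 'www.google.com']
```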
XRegExp.matchRecursive(str, left, right, [flags], [options])
Requires the XRegExp.matchRecursive addon, which is bundled in xregexp-all.js.
Returns an array of match strings between outermost left and right delimiters, or an array of objects with detailed match parts and position data. By default, an error is thrown if delimiters are unbalanced within the subject string.
Example

    // Basic usage
    const str1 = '(t((e))s)t()(ing)';
    XRegExp.matchRecursive(str1, '\\(', '\\)', 'g');
    // -> ['t((e))s', '', 'ing']

    // Extended information mode with valueNames
    const str2 = 'Here is <div> <div>an</div></div> example';
    XRegExp.matchRecursive(str2, '<div\\s*>', '</div>', 'gi', {
      valueNames: ['between', 'left', 'match', 'right']
    });
    /* -> [
    {name: 'between', value: 'Here is ',       start: 0,  end: 8},
    {name: 'left',    value: '<div>',          start: 8,  end: 13},
    {name: 'match',   value: ' <div>an</div>', start: 13, end: 27},
    {name: 'right',   value: '</div>',         start: 27, end: 33},
    {name: 'between', value: ' example',       start: 33, end: 41}
    ] */

    // Omitting unneeded parts with null valueNames, and using escapeChar
    const str3 = '...{1}.\\{{function(x,y){return {y:x}}}';
    XRegExp.matchRecursive(str3, '{', '}', 'g', {
      valueNames: ['literal', null, 'value', null],
      escapeChar: '\\'
    });
    /* -> [
    {name: 'literal', value: '...',  start: 0, end: 3},
    {name: 'value',   value: '1',    start: 4, end: 5},
    {name: 'literal', value: '.\\{', start: 6, end: 9},
    {name: 'value',   value: 'function(x,y){return {y:x}}', start: 10, end: 37}
    ] */

    // Sticky mode via flag y
    const str4 = '<1><<<2>>><3>4<5>';
    XRegExp.matchRecursive(str4, '<', '>', 'gy');
    // -> ['1', '<<2>>', '3']

    // Skipping unbalanced delimiters instead of erroring
    const str5 = 'Here is <div> <div>an</div> unbalanced example';
    XRegExp.matchRecursive(str5, '<div\\s*>', '</div>', 'gi', {
      unbalanced: 'skip'
    });
    // -> ['an']
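The notion of outermost delimiters can be illustrated with a depth counter. The sketch below is deliberately simplified: `outermostBetween` is a hypothetical helper that handles single-character delimiters only, whereas the addon accepts regex patterns for both sides.

```javascript
// Hypothetical helper: returns the text between outermost delimiter pairs.
function outermostBetween(str, open, close) {
  const results = [];
  let depth = 0;
  let start = -1;
  for (let i = 0; i < str.length; i++) {
    if (str[i] === open) {
      if (depth === 0) start = i + 1; // content begins after the delimiter
      depth++;
    } else if (str[i] === close) {
      if (depth === 0) throw new Error(`Unbalanced delimiter at index ${i}`);
      depth--;
      if (depth === 0) results.push(str.slice(start, i));
    }
  }
  if (depth !== 0) throw new Error('Unbalanced delimiter: missing closing delimiter');
  return results;
}

const parts = outermostBetween('(t((e))s)t()(ing)', '(', ')');
console.log(parts); // ['t((e))s', '', 'ing']
```

Only the transitions between depth 0 and depth 1 produce output, which is why nested pairs stay inside their parent's match string.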
XRegExp.replace(str, search, replacement, [scope])
Returns a new string with one or all matches of a pattern replaced. The pattern can be a string or regex, and the replacement can be a string or a function to be called for each match. To perform a global search and replace, use the optional scope argument or include flag g if using a regex. Replacement strings can use $<n> or ${n} for named and numbered backreferences. Replacement functions can use named backreferences via the last argument. Also fixes browser bugs compared to the native String.prototype.replace and can be used reliably cross-browser.
For the full details of XRegExp's replacement text syntax, see Syntax: Replacement text.
Returns: {String} New string with one or all matches replaced.
Example

    // Regex search, using named backreferences in replacement string
    const name = XRegExp('(?<first>\\w+) (?<last>\\w+)');
    XRegExp.replace('John Smith', name, '$<last>, $<first>');
    // -> 'Smith, John'

    // Regex search, using named backreferences in replacement function
    XRegExp.replace('John Smith', name, (...args) => {
      const groups = args[args.length - 1];
      return `${groups.last}, ${groups.first}`;
    });
    // -> 'Smith, John'

    // String search, with replace-all
    XRegExp.replace('RegExp builds RegExps', 'RegExp', 'XRegExp', 'all');
    // -> 'XRegExp builds XRegExps'
XRegExp.replaceEach(str, replacements)
Performs batch processing of string replacements. Used like XRegExp.replace, but accepts an array of replacement details. Later replacements operate on the output of earlier replacements. Replacement details are accepted as an array with a regex or string to search for, the replacement string or function, and an optional scope of 'one' or 'all'. Uses the XRegExp replacement text syntax, which supports named backreference properties via $<name> or ${name}.
Returns: {String} New string with all replacements.
Example

    str = XRegExp.replaceEach(str, [
      [XRegExp('(?<name>a)'), 'z$<name>'],
      [/b/gi, 'y'],
      [/c/g, 'x', 'one'], // scope 'one' overrides /g
      [/d/, 'w', 'all'],  // scope 'all' overrides lack of /g
      ['e', 'v', 'all'],  // scope 'all' allows replace-all for strings
      [/f/g, (match) => match.toUpperCase()]
    ]);
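Because each replacement operates on the output of the previous one, the order of the array matters. A minimal native sketch of that sequencing is shown here with a concrete input; `replaceSequentially` and `toSearchRegex` are hypothetical helpers. Native replacement strings support $<name> but not the ${name} form.

```javascript
// Hypothetical helper: turns a search string into a regex that matches it literally.
function toSearchRegex(search) {
  if (search instanceof RegExp) return search;
  return new RegExp(search.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'));
}

// Hypothetical helper: applies [search, replacement, scope] entries in order.
function replaceSequentially(str, replacements) {
  let text = str;
  for (const [search, replacement, scope] of replacements) {
    const regex = toSearchRegex(search);
    // An explicit scope overrides the presence or absence of flag g.
    const all = scope ? scope === 'all' : regex.global;
    const base = regex.flags.replace('g', '');
    const scoped = new RegExp(regex.source, all ? base + 'g' : base);
    text = text.replace(scoped, replacement);
  }
  return text;
}

const output = replaceSequentially('aabbccddeeff', [
  [/(?<name>a)/, 'z$<name>'],            // first 'a' becomes 'za'
  [/b/gi, 'y'],                          // every 'b'
  [/c/g, 'x', 'one'],                    // first 'c' only
  [/d/, 'w', 'all'],                     // every 'd'
  ['e', 'v', 'all'],                     // every 'e'
  [/f/g, (match) => match.toUpperCase()] // every 'f'
]);
console.log(output); // zaayyxcwwvvFF
```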
XRegExp.split(str, separator, [limit])
Splits a string into an array of strings using a regex or string separator. Matches of the separator are not included in the result array. However, if separator is a regex that contains capturing groups, backreferences are spliced into the result each time separator is matched. Fixes browser bugs compared to the native String.prototype.split and can be used reliably cross-browser.
Returns: {Array} Array of substrings.
Example

    // Basic use
    XRegExp.split('a b c', ' ');
    // -> ['a', 'b', 'c']

    // With limit
    XRegExp.split('a b c', ' ', 2);
    // -> ['a', 'b']

    // Backreferences in result array
    XRegExp.split('..word1..', /([a-z]+)(\d+)/i);
    // -> ['..', 'word', '1', '..']
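For comparison, a spec-compliant native String.prototype.split gives the same results for these inputs. The sketch below assumes a modern JavaScript engine and shows how capturing groups are spliced into the result, including a group that did not participate in the match.

```javascript
// Captured text is inserted between the surrounding substrings.
const withCaptures = '..word1..'.split(/([a-z]+)(\d+)/i);
console.log(withCaptures); // ['..', 'word', '1', '..']

// The limit truncates the result array.
const limited = 'a b c'.split(' ', 2);
console.log(limited); // ['a', 'b']

// A group that did not participate contributes undefined.
const optional = 'a1b'.split(/(\d)|(x)/);
console.log(optional); // ['a', '1', undefined, 'b']
```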
XRegExp.tag([flags])`pattern`
Requires the XRegExp.build addon, which is bundled in xregexp-all.js.
Provides tagged template literals that create regexes with XRegExp syntax and flags. The provided pattern is handled as a raw string, so backslashes don't need to be escaped.
Interpolation of strings and regexes shares the features of XRegExp.build. Interpolated patterns are treated as atomic units when quantified, interpolated strings have their special characters escaped, a leading ^ and trailing unescaped $ are stripped from interpolated regexes if both are present, and any backreferences within an interpolated regex are rewritten to work within the overall pattern.
Returns: {RegExp} Extended regular expression object.
Example

    XRegExp.tag()`\b\w+\b`.test('word'); // -> true

    const hours = /1[0-2]|0?[1-9]/;
    const minutes = /(?<minutes>[0-5][0-9])/;
    const time = XRegExp.tag('x')`\b ${hours} : ${minutes} \b`;
    time.test('10:59'); // -> true
    XRegExp.exec('10:59', time).groups.minutes; // -> '59'

    const backref1 = /(a)\1/;
    const backref2 = /(b)\1/;
    XRegExp.tag()`${backref1}${backref2}`.test('aabb'); // -> true
XRegExp.test(str, regex, [pos], [sticky])
Executes a regex search in a specified string. Returns true or false. Optional pos and sticky arguments specify the search start position, and whether the match must start at the specified position only. The lastIndex property of the provided regex is not used, but is updated for compatibility. Also fixes browser bugs compared to the native RegExp.prototype.test and can be used reliably cross-browser.
Returns: {Boolean} Whether the regex matched the provided value.
Example

    // Basic use
    XRegExp.test('abc', /c/); // -> true

    // With pos and sticky
    XRegExp.test('abc', /c/, 0, 'sticky'); // -> false
    XRegExp.test('abc', /c/, 2, 'sticky'); // -> true
XRegExp.uninstall(options)
Uninstalls optional features according to the specified options. Used to undo the actions of XRegExp.install.
Returns: {undefined} Does not return a value.
Example

    // With an options object
    XRegExp.uninstall({
      // Disables support for astral code points in Unicode addons (unless enabled per regex)
      astral: true,

      // Don't add named capture groups to the `groups` property of matches
      namespacing: true
    });

    // With an options string
    XRegExp.uninstall('astral namespacing');
XRegExp.union(patterns, [flags], [options])
Returns an XRegExp object that is the union of the given patterns. Patterns can be provided as regex objects or strings. Metacharacters are escaped in patterns provided as strings. Backreferences in provided regex objects are automatically renumbered to work correctly within the larger combined pattern. Native flags used by provided regexes are ignored in favor of the flags argument.
Returns: {RegExp} Union of the provided regexes and strings.
Example

    XRegExp.union(['a+b*c', /(dogs)\1/, /(cats)\1/], 'i');
    // -> /a\+b\*c|(dogs)\1|(cats)\2/i

    XRegExp.union([/man/, /bear/, /pig/], 'i', {conjunction: 'none'});
    // -> /manbearpig/i
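The core of a union (escape the strings, keep regex sources as they are, join with |) can be sketched natively. `unionOf` and `escapeLiteral` are hypothetical helpers. This sketch does not renumber backreferences, so it is only safe for patterns without them.

```javascript
// Hypothetical helper: escapes regex syntax characters in a literal string.
const escapeLiteral = (text) => text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

// Hypothetical helper: joins patterns as alternatives; no backreference renumbering.
function unionOf(patterns, flags = '') {
  const parts = patterns.map((pattern) =>
    pattern instanceof RegExp ? pattern.source : escapeLiteral(pattern));
  // Flags of the provided regexes are dropped in favor of `flags`.
  return new RegExp(parts.join('|'), flags);
}

const animals = unionOf(['a+b*c', /dogs/g, /cats/], 'i');
console.log(animals.source);        // a\+b\*c|dogs|cats
console.log(animals.flags);         // i
console.log(animals.test('A+B*C')); // true
console.log(animals.test('DOGS'));  // true
console.log(animals.test('abc'));   // false
```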
XRegExp.version
The XRegExp version number as a string containing three dot-separated parts. For example, '2.0.0-beta-3'.
<regexp>.xregexp.source
The original pattern provided to the XRegExp constructor. Note that this differs from the <regexp>.source property which holds the transpiled source in native RegExp syntax and therefore can't be used to reconstruct the regex (e.g. <regexp>.source holds no knowledge of capture names). This property is available only for regexes originally constructed by XRegExp. It is null for native regexes copied using the XRegExp constructor or XRegExp.globalize.
<regexp>.xregexp.flags
The original flags provided to the XRegExp constructor. Differs from the ES6 <regexp>.flags property in that it includes XRegExp's non-native flags and is accessible even in pre-ES6 browsers. This property is available only for regexes originally constructed by XRegExp. It is null for native regexes copied using the XRegExp constructor or XRegExp.globalize. When regexes originally constructed by XRegExp are copied using XRegExp.globalize, the value of this property is augmented with 'g' if not already present. Flags are listed in alphabetical order.