REGEXTRACT
Syntax
REGEXTRACT(<string to be examined>; <regular expression>)
The regular expression can't reference a workbook column.
Description
Extracts tokens that match a regular expression. When using the REGEXTRACT function on a column from the same worksheet, the records from the source column are removed to match the function results.
Regular expressions are written differently in the Formula Builder and the Formula Bar.
In the Formula Builder, write regular expressions in normal syntax, with a single backslash (\) in front of each escaped element (for example, \w).
In the Formula Bar, double every backslash in the regular expression (for example, \\w). The extra backslash is needed because Datameer X uses the backslash as an escape character.
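The doubling rule can be illustrated outside Datameer X. The following is a minimal Python sketch, assuming the Formula Bar simply collapses each doubled backslash into a single one before the pattern reaches the regex engine. The variable names and the unescape step are hypothetical and only model the behavior described above; they are not Datameer X's implementation.

```python
# Pattern as it would be typed in the Formula Builder (normal regex syntax).
builder_pattern = r"\w*o\w*"

# The same pattern as it would be typed in the Formula Bar:
# every backslash is doubled.
formula_bar_text = r"\\w*o\\w*"

# Assumed unescape step: each pair of backslashes becomes one backslash.
unescaped = formula_bar_text.replace("\\\\", "\\")

# After unescaping, both entry methods yield the same regular expression.
print(unescaped == builder_pattern)
```

If the comparison prints `True`, the two spellings describe the same pattern, which is why the examples in this section show doubled backslashes.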
Examples
Column1 | Regular expression | REGEXTRACT returns |
---|---|---|
hello world is out | \\w*o\\w* | hello, world, out |
hello world is out | \\w*l\\w* | hello, world |
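
To check patterns like these before using them in a workbook, the same matching can be approximated in Python. This is a rough sketch: `regextract` is a hypothetical helper built on `re.findall`, not the Datameer X function, and regex dialects can differ in edge cases. The patterns use single backslashes because Python raw strings need no extra escaping; the second pattern looks for words containing the lowercase letter l.

```python
import re

def regextract(text, pattern):
    """Hypothetical stand-in for REGEXTRACT: return all non-overlapping matches."""
    return re.findall(pattern, text)

text = "hello world is out"

# Words that contain the letter "o".
words_with_o = regextract(text, r"\w*o\w*")

# Words that contain the lowercase letter "l".
words_with_l = regextract(text, r"\w*l\w*")

print(words_with_o)
print(words_with_l)
```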
Twitter example: Extracting #hashtags and @mentions
Column1 | Regular expression | REGEXTRACT returns |
---|---|---|
Hey, @Datameer X I love how you are so #awesome | \\@\\w+ | @Datameer |
Hey, @Datameer X I love how you are so #awesome | \\#\\w+ | #awesome |
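
The mention and hashtag patterns can be tried the same way. The sketch below uses Python's `re.findall` as an approximation of REGEXTRACT, which is an assumption for illustration only. In Python's regex dialect `@` and `#` have no special meaning, so they are left unescaped here.

```python
import re

tweet = "Hey, @Datameer X I love how you are so #awesome"

# "@" followed by one or more word characters; matching stops at the space.
mentions = re.findall(r"@\w+", tweet)

# "#" followed by one or more word characters.
hashtags = re.findall(r"#\w+", tweet)

print(mentions)
print(hashtags)
```

Because `\w` does not match a space, the mention ends at `@Datameer` and the following `X` is not included.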
This function differs from the TOKENIZE function, which treats matches of the regular expression as separators and returns the tokens between them.
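
The contrast can be sketched in Python, assuming `re.findall` as a rough analogue of REGEXTRACT and `re.split` as a rough analogue of TOKENIZE. Neither is the Datameer X implementation, and details such as the handling of empty tokens may differ.

```python
import re

text = "hello world is out"
pattern = r"\s+"  # one or more whitespace characters

# REGEXTRACT-style: keep the parts of the string that match the pattern.
extracted = re.findall(pattern, text)

# TOKENIZE-style: treat the matches as separators and keep what lies between them.
tokens = re.split(pattern, text)

print(extracted)
print(tokens)
```

With the same pattern, the first call returns the whitespace runs themselves, while the second returns the words they separate.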