TOKENIZE
Syntax
TOKENIZE(<string>; <regular expression for separator>[; <number>])
Description
Splits text into tokens at each separator that matches the regular expression. TOKENIZE can also return a single token if you specify its index as the optional third argument. The index is zero-based: as the examples show, index 1 returns the second token. If no token exists at the specified index (because the index is out of range), TOKENIZE returns null. Because this function is meant for text analytics, it doesn't return a token for empty strings.
If you want to extract tokens that match a regular expression, use the REGEXTRACT function instead.
Examples
String | Regular expression | Number | TOKENIZE returns |
---|---|---|---|
hello world | " " | null | "hello" "world" |
hello world\t2 | \\W+ | null | "hello" "world" "2" |
a.c,d | [.,] | null | "a" "c" "d" |
12-11-2006 | - | 1 | 11 |
12-11-2006 | - | 3 | null |
"a.c,,d" | "[.,]" | null | "a" "c" ",d" |
"a,,c,d" | "," | null | "a" ",c" "d" |
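
The basic behavior described above can be approximated outside the product with a short Python sketch. This is an illustration only, not the function's actual implementation: the helper name `tokenize` and its parameters are hypothetical, Python's `None` stands in for null, and Python's `re` syntax is assumed to be close enough to the product's regular expression dialect for these simple patterns. The sketch drops every empty token, so it does not attempt to reproduce the handling of consecutive separators shown in the last two example rows.

```python
import re


def tokenize(text, separator_regex, index=None):
    """Hypothetical approximation of TOKENIZE (not the product's code)."""
    # Split at every match of the separator and discard empty strings.
    tokens = [token for token in re.split(separator_regex, text) if token]
    if index is None:
        # No index given: return all tokens as a list.
        return tokens
    # Zero-based index; an out-of-range index yields None (standing in for null).
    if 0 <= index < len(tokens):
        return tokens[index]
    return None


print(tokenize("hello world", " "))
print(tokenize("hello world\t2", r"\W+"))
print(tokenize("12-11-2006", "-", 1))
print(tokenize("12-11-2006", "-", 3))
```

The pattern is passed as a raw string (`r"\W+"`) so that Python does not interpret the backslash itself, which is the same concern the doubled backslash in the `\\W+` example addresses.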