TOKENS()
The TOKENS() function is the only function that you can use freely in a query without a SEARCH statement. A wrapping ANALYZER() call in a search expression neither affects the analyzer argument nor allows you to omit it.
Syntax
TOKENS(input, analyzer) → tokenArray
Split the input string into an array of tokens with the help of the specified analyzer. You can use the resulting array in FILTER or SEARCH statements with the IN operator.
| Key | Type | Description |
|---|---|---|
| input | string | Text to tokenize. |
| analyzer | string | Name of an analyzer. |
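Because you can call TOKENS() anywhere in a query, it also works in a plain FILTER outside of any SEARCH statement. A minimal sketch, assuming a hypothetical collection coll whose documents have a tag attribute holding a single word:

FOR doc IN coll
  FILTER doc.tag IN TOKENS("quick brown fox", "text_en")
  RETURN doc

This keeps only documents whose tag exactly equals one of the tokens produced from the input string.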
Example 1
Example query showcasing the "text_de" analyzer, which features tokenization with stemming, case conversion, and accent removal for German text. Note in the result how the input is lowercased, the accents are removed, and each word is stemmed (Lörem becomes lor):
RETURN TOKENS("Lörem ipsüm, DOLOR SIT Ämet.", "text_de")
[
[
"lor",
"ipsum",
"dolor",
"sit",
"amet"
]
]
Example 2
This example searches for documents where the text attribute contains certain tokens in any order:
FOR doc IN viewName
SEARCH ANALYZER(doc.text IN TOKENS("dolor amet lorem", "text_en"), "text_en")
RETURN doc
Alternatively, if you want to search for tokens in a particular order, use PHRASE() instead.
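A minimal sketch of such an ordered search with PHRASE(), reusing the same hypothetical viewName (PHRASE() matches the tokens only when they appear consecutively and in the given order):

FOR doc IN viewName
  SEARCH PHRASE(doc.text, "lorem ipsum", "text_en")
  RETURN doc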
Example 3
Even inside a wrapping ANALYZER() call, you must always specify the analyzer name in the TOKENS() function; it is not inherited from the surrounding call:
FOR doc IN viewName
SEARCH ANALYZER(doc.text IN TOKENS("foo", "text_en"), "text_en")
RETURN doc
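In practice, the string to tokenize often comes from the user. A sketch that passes the search string via a bind parameter (@searchText is a hypothetical parameter name):

FOR doc IN viewName
  SEARCH ANALYZER(doc.text IN TOKENS(@searchText, "text_en"), "text_en")
  RETURN doc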