Module tokenizer

Structs

Analyzer
Analyzer analyzes a text into a list of tokens.
ChineseTokenizer
ChineseTokenizer tokenizes a Chinese text.
EnglishTokenizer
EnglishTokenizer tokenizes an English text.
JIEBA 🔒

Constants

VALID_ASCII_TOKEN 🔒
A-Z, a-z, 0-9, and ‘_’ are true
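
The description of VALID_ASCII_TOKEN suggests a per-character lookup keyed by ASCII code. The constant is private and its actual type is not shown here, so the following is only a minimal sketch under the assumption that it is a 128-entry boolean table. The names `VALID_ASCII_TOKEN_SKETCH`, `build_table`, and `is_valid_ascii_token_char` are hypothetical and not part of this module.

```rust
/// Hypothetical 128-entry table (an assumption, not the module's definition):
/// `true` for A-Z, a-z, 0-9 and '_', `false` for every other ASCII code.
const VALID_ASCII_TOKEN_SKETCH: [bool; 128] = build_table();

/// Builds the table at compile time.
const fn build_table() -> [bool; 128] {
    let mut table = [false; 128];
    let mut i = 0;
    while i < 128 {
        let b = i as u8;
        table[i] = b.is_ascii_alphanumeric() || b == b'_';
        i += 1;
    }
    table
}

/// Returns whether `c` counts as a token character under this sketch.
/// Non-ASCII characters fall outside the table and are rejected.
fn is_valid_ascii_token_char(c: char) -> bool {
    (c as u32) < 128 && VALID_ASCII_TOKEN_SKETCH[c as usize]
}
```

A table lookup like this replaces several range comparisons with a single index operation per character, which is one plausible reason to keep such a constant around.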

Traitsยง

Tokenizer
Tokenizer tokenizes a text into a list of tokens.
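
To illustrate the idea of a trait that turns a text into a list of tokens, here is a minimal sketch. The method signature of the real Tokenizer trait is not shown on this page, so `TokenizerSketch`, its `tokenize` method, and `SimpleEnglishTokenizer` are hypothetical stand-ins. The splitting rule (keep runs of A-Z, a-z, 0-9 and '_') is likewise an assumption borrowed from the VALID_ASCII_TOKEN description, not the behavior of EnglishTokenizer.

```rust
/// Hypothetical shape of a tokenizer trait; the real signature may differ.
trait TokenizerSketch {
    /// Splits `text` into a list of tokens.
    fn tokenize(&self, text: &str) -> Vec<String>;
}

/// Illustrative implementor, not the module's EnglishTokenizer.
struct SimpleEnglishTokenizer;

impl TokenizerSketch for SimpleEnglishTokenizer {
    fn tokenize(&self, text: &str) -> Vec<String> {
        // Split on every character that is not an ASCII letter, digit, or '_',
        // then drop the empty pieces produced by adjacent separators.
        text.split(|c: char| !(c.is_ascii_alphanumeric() || c == '_'))
            .filter(|s| !s.is_empty())
            .map(|s| s.to_string())
            .collect()
    }
}
```

With this sketch, `SimpleEnglishTokenizer.tokenize("Hello, big_world 42!")` yields the three tokens `Hello`, `big_world`, and `42`.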