The WhiteSpaceAnalyzer recognizes tokens as maximal strings of non-whitespace characters. If implemented in Ruby, the WhiteSpaceAnalyzer would look like this:
  class WhiteSpaceAnalyzer
    def initialize(lower = true)
      @lower = lower
    end

    def token_stream(field, str)
      return WhiteSpaceTokenizer.new(str, @lower)
    end
  end
As you can see, it makes use of the WhiteSpaceTokenizer.
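To make the tokenization rule concrete, here is a short usage sketch, not a definitive example. It assumes the Ferret gem's Analysis API, namely that token_stream returns a stream whose next method yields tokens (or nil at the end) and that each token exposes its text; the field name and sample string are made up for illustration.

  require 'ferret'
  include Ferret::Analysis

  # leave case as is for this example by passing lower explicitly
  analyzer = WhiteSpaceAnalyzer.new(false)
  stream = analyzer.token_stream(:content, "Dave's résumé, at http://www.example.com/ 1234")

  while token = stream.next
    puts token.text
  end
  # Splitting only on whitespace keeps punctuation attached to the tokens,
  # so the stream yields: "Dave's", "résumé,", "at",
  # "http://www.example.com/" and "1234".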
Create a new WhiteSpaceAnalyzer which downcases tokens by default but can optionally leave case as is. Lowercasing will be done based on the current locale.
lower: set to false if you don't want the field's tokens to be downcased
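As a hypothetical illustration of the lower flag, passed explicitly rather than relying on the default (the tokens helper method and the field name are only for illustration and not part of the API):

  require 'ferret'
  include Ferret::Analysis

  # Collect the token texts an analyzer produces for a string.
  def tokens(analyzer, text)
    stream = analyzer.token_stream(:field, text)
    result = []
    while token = stream.next
      result << token.text
    end
    result
  end

  tokens(WhiteSpaceAnalyzer.new(true),  "One TWO Three")
  # => ["one", "two", "three"]   (downcased using the current locale)
  tokens(WhiteSpaceAnalyzer.new(false), "One TWO Three")
  # => ["One", "TWO", "Three"]   (case left as is)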
  static VALUE
  frb_white_space_analyzer_init(int argc, VALUE *argv, VALUE self)
  {
      Analyzer *a;
      /* parse the optional lower argument */
      GET_LOWER(false);
  #ifndef POSH_OS_WIN32
      /* pick up the locale from the environment once so that
       * multi-byte lowercasing follows the current locale */
      if (!frb_locale) frb_locale = setlocale(LC_CTYPE, "");
  #endif
      a = mb_whitespace_analyzer_new(lower);
      /* wrap the C analyzer struct in the Ruby object (freed by
       * frb_analyzer_free) and remember which Ruby object wraps it */
      Frt_Wrap_Struct(self, NULL, &frb_analyzer_free, a);
      object_add(a, self);
      return self;
  }