An AsciiLetterAnalyzer creates a TokenStream that splits the input up into maximal strings of ASCII letters. If implemented in Ruby it would look like this:
  class AsciiLetterAnalyzer
    def initialize(lower = true)
      @lower = lower
    end

    def token_stream(field, str)
      if @lower
        return AsciiLowerCaseFilter.new(AsciiLetterTokenizer.new(str))
      else
        return AsciiLetterTokenizer.new(str)
      end
    end
  end
As you can see, it makes use of the AsciiLetterTokenizer and AsciiLowerCaseFilter. Note that this analyzer won't recognize non-ASCII characters, so you should use the LetterAnalyzer if you want to analyze multi-byte data such as UTF-8.
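To make the behaviour described above concrete, here is a minimal pure-Ruby sketch that approximates the token text such an analyzer would produce. The method name `ascii_letter_tokens` is hypothetical and not part of Ferret; it returns plain strings rather than a TokenStream, and it assumes that any character outside A-Z and a-z (including multi-byte characters) simply ends the current token.

```ruby
# Hypothetical stand-in for illustration only; this is not Ferret's API.
# Collects maximal runs of ASCII letters and optionally downcases them.
def ascii_letter_tokens(str, lower = true)
  # Anything that is not A-Z or a-z acts as a separator in this sketch.
  tokens = str.scan(/[A-Za-z]+/)
  lower ? tokens.map(&:downcase) : tokens
end

p ascii_letter_tokens("Hello, World-2 go")    # default: tokens are downcased
p ascii_letter_tokens("Hello, World", false)  # case is left as is
p ascii_letter_tokens("naïve café")           # non-ASCII letters split tokens
```

The last call shows why the LetterAnalyzer is the better choice for UTF-8 input: in this sketch the accented characters break words apart instead of being kept inside them.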
Create a new AsciiLetterAnalyzer which downcases tokens by default but can optionally leave case as is. Lowercasing will only be done to ASCII characters.
lower: set to false if you don't want the field's tokens to be downcased
  static VALUE
  frb_a_letter_analyzer_init(int argc, VALUE *argv, VALUE self)
  {
      Analyzer *a;
      GET_LOWER(true);                /* optional `lower` argument, default true */
      a = letter_analyzer_new(lower);
      Frt_Wrap_Struct(self, NULL, &frb_analyzer_free, a);
      object_add(a, self);
      return self;
  }