java.lang.Object
  org.apache.solr.analysis.BaseTokenFilterFactory
    org.apache.solr.analysis.HyphenationCompoundWordTokenFilterFactory
public class HyphenationCompoundWordTokenFilterFactory
Factory for HyphenationCompoundWordTokenFilter.
This factory accepts the following parameters:
hyphenator (mandatory): path to the FOP XML hyphenation pattern file. See http://offo.sourceforge.net/hyphenation/.
encoding (optional): encoding of the XML hyphenation file. Defaults to UTF-8.
dictionary (optional): dictionary of words. Defaults to no dictionary.
minWordSize (optional): minimum length a word must have to be decomposed. Defaults to 5.
minSubwordSize (optional): minimum length of subwords. Defaults to 2.
maxSubwordSize (optional): maximum length of subwords. Defaults to 15.
onlyLongestMatch (optional): if true, adds only the longest matching subword to the stream. Defaults to false.
<fieldType name="text_hyphncomp" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.HyphenationCompoundWordTokenFilterFactory"
            hyphenator="hyphenator.xml"
            encoding="UTF-8"
            dictionary="dictionary.txt"
            minWordSize="5"
            minSubwordSize="2"
            maxSubwordSize="15"
            onlyLongestMatch="false"/>
  </analyzer>
</fieldType>
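The size and matching parameters above are easier to picture with a small example. The following is a toy sketch only, not Lucene's implementation: the class `SubwordSketch`, the method `decompose`, and the hand-supplied `breakPoints` array are all hypothetical. The real filter derives its break points from the FOP hyphenation patterns, and this sketch assumes that "longest match" means the longest accepted subword per start position.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Toy illustration of the size and matching parameters; not Lucene's code. */
final class SubwordSketch {

    /**
     * Collects candidate subwords between hand-supplied break points.
     * breakPoints must be ascending offsets into the word, e.g. {0, 4, 11}.
     * A null dictionary accepts every candidate within the size limits.
     */
    static List<String> decompose(String word, int[] breakPoints, Set<String> dictionary,
                                  int minWordSize, int minSubwordSize, int maxSubwordSize,
                                  boolean onlyLongestMatch) {
        List<String> subwords = new ArrayList<>();
        if (word.length() < minWordSize) {
            return subwords; // word is too short to be decomposed
        }
        String lower = word.toLowerCase(Locale.ROOT);
        for (int i = 0; i < breakPoints.length; i++) {
            String longest = null;
            for (int j = i + 1; j < breakPoints.length; j++) {
                int length = breakPoints[j] - breakPoints[i];
                if (length > maxSubwordSize) {
                    break; // later candidates from this start are even longer
                }
                if (length < minSubwordSize) {
                    continue;
                }
                String part = lower.substring(breakPoints[i], breakPoints[j]);
                if (dictionary != null && !dictionary.contains(part)) {
                    continue; // a dictionary is configured and rejects this candidate
                }
                if (!onlyLongestMatch) {
                    subwords.add(part);
                } else if (longest == null || part.length() > longest.length()) {
                    longest = part;
                }
            }
            if (longest != null) {
                subwords.add(longest); // only set when onlyLongestMatch is true
            }
        }
        return subwords;
    }
}
```

With the word "Rindfleisch", break points {0, 4, 11}, no dictionary, and the default sizes, this sketch yields rind, rindfleisch, and fleisch. Supplying a dictionary containing only "rind" and "fleisch" drops the full-length candidate.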
See Also: HyphenationCompoundWordTokenFilter
Field Summary | |
---|---|
protected Map<String,String> | args: The init args |
protected org.apache.lucene.util.Version | luceneMatchVersion: The luceneVersion arg |
Fields inherited from class org.apache.solr.analysis.BaseTokenFilterFactory |
---|
log |
Constructor Summary | |
---|---|
HyphenationCompoundWordTokenFilterFactory() | |
Method Summary | |
---|---|
protected void | assureMatchVersion(): Can be called from TokenizerFactory.create(java.io.Reader) or TokenFilterFactory.create(org.apache.lucene.analysis.TokenStream) to inform the user that this factory requires a luceneMatchVersion. |
org.apache.lucene.analysis.compound.HyphenationCompoundWordTokenFilter | create(org.apache.lucene.analysis.TokenStream input): Transform the specified input TokenStream. |
Map<String,String> | getArgs() |
protected boolean | getBoolean(String name, boolean defaultVal) |
protected boolean | getBoolean(String name, boolean defaultVal, boolean useDefault) |
protected int | getInt(String name) |
protected int | getInt(String name, int defaultVal) |
protected int | getInt(String name, int defaultVal, boolean useDefault) |
protected org.apache.lucene.analysis.CharArraySet | getWordSet(ResourceLoader loader, String wordFiles, boolean ignoreCase) |
void | inform(ResourceLoader loader) |
void | init(Map<String,String> args): init will be called just once, immediately after creation. |
protected void | warnDeprecated(String message) |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Methods inherited from interface org.apache.solr.analysis.TokenFilterFactory |
---|
getArgs |
Field Detail |
---|
protected Map<String,String> args
protected org.apache.lucene.util.Version luceneMatchVersion
Constructor Detail |
---|
public HyphenationCompoundWordTokenFilterFactory()
Method Detail |
---|
public void init(Map<String,String> args)
Description copied from interface TokenFilterFactory: init will be called just once, immediately after creation. The args are user-level initialization parameters that may be specified when declaring the factory in schema.xml.
Specified by: init in interface TokenFilterFactory
public void inform(ResourceLoader loader)
Specified by: inform in interface ResourceLoaderAware
public org.apache.lucene.analysis.compound.HyphenationCompoundWordTokenFilter create(org.apache.lucene.analysis.TokenStream input)
Description copied from interface TokenFilterFactory: Transform the specified input TokenStream.
Specified by: create in interface TokenFilterFactory
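The three methods above are normally invoked by Solr, in that order, when it loads the schema. The sketch below illustrates that call sequence with hypothetical stand-in types (`SketchResourceLoader`, `LifecycleSketch`) so that it runs without Solr on the classpath. The real factory works with `ResourceLoader` and `TokenStream`, and the ordering of `inform` after `init` is stated here as an assumption about typical Solr behavior.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a resource loader; not the Solr interface. */
interface SketchResourceLoader {
    List<String> getLines(String resource);
}

/** Hypothetical factory that records the order in which it is driven. */
final class LifecycleSketch {
    final List<String> calls = new ArrayList<>();
    private Map<String, String> args;
    private List<String> dictionary;

    /** Called once, right after construction, with the schema attributes. */
    void init(Map<String, String> args) {
        this.args = new HashMap<>(args);
        calls.add("init");
    }

    /** Called once resources can be loaded; reads the optional dictionary. */
    void inform(SketchResourceLoader loader) {
        String dictionaryFile = args.get("dictionary");
        dictionary = dictionaryFile == null ? null : loader.getLines(dictionaryFile);
        calls.add("inform");
    }

    /** Called for every stream to wrap; a String stands in for a TokenStream. */
    String create(String input) {
        if (args == null) {
            throw new IllegalStateException("init must run before create");
        }
        calls.add("create");
        return dictionary == null
                ? input + " [no dictionary]"
                : input + " [dictionary: " + dictionary.size() + " words]";
    }
}
```

Loading files in `inform` rather than in `init` keeps argument parsing separate from resource access, which is the split the two method signatures suggest.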
public Map<String,String> getArgs()
protected final void assureMatchVersion()
This method can be called from the TokenizerFactory.create(java.io.Reader) or TokenFilterFactory.create(org.apache.lucene.analysis.TokenStream) methods to inform the user that this factory requires a luceneMatchVersion.
protected final void warnDeprecated(String message)
protected int getInt(String name)
protected int getInt(String name, int defaultVal)
protected int getInt(String name, int defaultVal, boolean useDefault)
protected boolean getBoolean(String name, boolean defaultVal)
protected boolean getBoolean(String name, boolean defaultVal, boolean useDefault)
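The page gives no descriptions for the getInt and getBoolean overloads. The sketch below shows one plausible reading of their signatures: the shorter overloads delegate to the three-argument form, and `useDefault` decides whether a missing argument falls back to `defaultVal` or is reported as a configuration error. The class `ArgHelpersSketch` and the exception type are placeholders chosen for illustration and should not be read as the library's actual behavior.

```java
import java.util.HashMap;
import java.util.Map;

/** Assumed semantics of the argument helpers; names and exception type are illustrative. */
final class ArgHelpersSketch {
    private final Map<String, String> args;

    ArgHelpersSketch(Map<String, String> args) {
        this.args = new HashMap<>(args);
    }

    /** Required integer: a missing value is an error. */
    int getInt(String name) {
        return getInt(name, -1, false);
    }

    /** Optional integer: a missing value yields defaultVal. */
    int getInt(String name, int defaultVal) {
        return getInt(name, defaultVal, true);
    }

    int getInt(String name, int defaultVal, boolean useDefault) {
        String value = args.get(name);
        if (value == null) {
            if (useDefault) {
                return defaultVal;
            }
            throw new IllegalArgumentException("Configuration error: missing parameter '" + name + "'");
        }
        return Integer.parseInt(value);
    }

    /** Optional boolean: a missing value yields defaultVal. */
    boolean getBoolean(String name, boolean defaultVal) {
        return getBoolean(name, defaultVal, true);
    }

    boolean getBoolean(String name, boolean defaultVal, boolean useDefault) {
        String value = args.get(name);
        if (value == null) {
            if (useDefault) {
                return defaultVal;
            }
            throw new IllegalArgumentException("Configuration error: missing parameter '" + name + "'");
        }
        return Boolean.parseBoolean(value);
    }
}
```

Under this reading, a factory would resolve `minWordSize` with `getInt("minWordSize", 5)` and `onlyLongestMatch` with `getBoolean("onlyLongestMatch", false)`, matching the defaults listed in the class description.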
protected org.apache.lucene.analysis.CharArraySet getWordSet(ResourceLoader loader, String wordFiles, boolean ignoreCase) throws IOException
Throws: IOException