java.lang.Object
  org.apache.lucene.util.AttributeSource
    org.apache.lucene.analysis.TokenStream
      org.apache.lucene.analysis.TokenFilter
        org.apache.solr.analysis.CommonGramsFilter
public final class CommonGramsFilter extends TokenFilter
Construct bigrams for frequently occurring terms while indexing. Single terms
are still indexed too, with bigrams overlaid. This is achieved through the
use of PositionIncrementAttribute.setPositionIncrement(int). Bigrams have a
type of GRAM_TYPE.
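The overlay described above can be sketched in plain Java with no Lucene dependency. This is an illustration under stated assumptions, not the filter's actual implementation: the Tok class, the commonGrams helper, the '_' separator, the "gram" value for GRAM_TYPE, and the "word" type for single terms are all hypothetical names and values chosen for the sketch. It also assumes that a bigram is emitted whenever either word of an adjacent pair is in the common-word set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class CommonGramsSketch {
    // Assumed values; this page only names GRAM_TYPE without giving its value.
    static final String GRAM_TYPE = "gram";
    static final char SEPARATOR = '_';

    /** Hypothetical token: term text, position increment, and type. */
    static final class Tok {
        final String term;
        final int posInc;
        final String type;

        Tok(String term, int posInc, String type) {
            this.term = term;
            this.posInc = posInc;
            this.type = type;
        }

        @Override
        public String toString() {
            return term + "/" + posInc + "/" + type;
        }
    }

    /** Emits every single term, plus an overlaid bigram where a common word is involved. */
    static List<Tok> commonGrams(List<String> words, Set<String> common) {
        List<Tok> out = new ArrayList<>();
        for (int i = 0; i < words.size(); i++) {
            String cur = words.get(i);
            // The single term always advances the position by one.
            out.add(new Tok(cur, 1, "word"));
            if (i + 1 < words.size()) {
                String next = words.get(i + 1);
                if (common.contains(cur) || common.contains(next)) {
                    // Position increment 0 places the bigram at the same position as cur.
                    out.add(new Tok(cur + SEPARATOR + next, 0, GRAM_TYPE));
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> common = new HashSet<>(Arrays.asList("the"));
        List<Tok> out = commonGrams(Arrays.asList("the", "quick", "brown", "fox"), common);
        // prints [the/1/word, the_quick/0/gram, quick/1/word, brown/1/word, fox/1/word]
        System.out.println(out);
    }
}
```

Because the bigram shares a position with its first word, phrase queries over single terms still line up, while queries that include a common word can match the rarer bigram instead.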
Nested Class Summary |
---|
Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource |
---|
AttributeSource.AttributeFactory, AttributeSource.State |
Field Summary |
---|
Fields inherited from class org.apache.lucene.analysis.TokenFilter |
---|
input |
Constructor Summary |
---|
CommonGramsFilter(TokenStream input, Set<?> commonWords): Deprecated. Use CommonGramsFilter(Version, TokenStream, Set) instead. |
CommonGramsFilter(TokenStream input, Set<?> commonWords, boolean ignoreCase): Deprecated. Use CommonGramsFilter(Version, TokenStream, Set) instead. |
CommonGramsFilter(TokenStream input, String[] commonWords): Deprecated. Use CommonGramsFilter(Version, TokenStream, Set) instead. |
CommonGramsFilter(TokenStream input, String[] commonWords, boolean ignoreCase): Deprecated. Use CommonGramsFilter(Version, TokenStream, Set, boolean) instead. |
CommonGramsFilter(Version matchVersion, TokenStream input, Set<?> commonWords): Construct a token stream filtering the given input using a Set of common words to create bigrams. |
CommonGramsFilter(Version matchVersion, TokenStream input, Set<?> commonWords, boolean ignoreCase): Deprecated. Use CommonGramsFilter(Version, TokenStream, Set) instead. |
Method Summary | |
---|---|
boolean | incrementToken(): Inserts bigrams for common words into a token stream. |
static CharArraySet | makeCommonSet(String[] commonWords): Deprecated. Construct a CharArraySet directly instead. |
static CharArraySet | makeCommonSet(String[] commonWords, boolean ignoreCase): Deprecated. Construct a CharArraySet directly instead. |
void | reset() |
Methods inherited from class org.apache.lucene.analysis.TokenFilter |
---|
close, end |
Methods inherited from class org.apache.lucene.util.AttributeSource |
---|
addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState, toString |
Methods inherited from class java.lang.Object |
---|
clone, finalize, getClass, notify, notifyAll, wait, wait, wait |
Constructor Detail |
---|
@Deprecated public CommonGramsFilter(TokenStream input, Set<?> commonWords)
Deprecated. Use CommonGramsFilter(Version, TokenStream, Set) instead.

@Deprecated public CommonGramsFilter(TokenStream input, Set<?> commonWords, boolean ignoreCase)
Deprecated. Use CommonGramsFilter(Version, TokenStream, Set) instead.

public CommonGramsFilter(Version matchVersion, TokenStream input, Set<?> commonWords)
Construct a token stream filtering the given input using a Set of common words to create bigrams.
Parameters:
input - TokenStream input in filter chain.
commonWords - The set of common words.

@Deprecated public CommonGramsFilter(Version matchVersion, TokenStream input, Set<?> commonWords, boolean ignoreCase)
Deprecated. Use CommonGramsFilter(Version, TokenStream, Set) instead.
If commonWords is an instance of CharArraySet (true if makeCommonSet() was
used to construct the set), it is used directly and ignoreCase is ignored,
since the CharArraySet itself controls case sensitivity.
If commonWords is not an instance of CharArraySet, a new CharArraySet is
constructed and ignoreCase specifies the case sensitivity of that set.
Parameters:
input - TokenStream input in filter chain.
commonWords - The set of common words.
ignoreCase - Ignore case when constructing bigrams for common words.

@Deprecated public CommonGramsFilter(TokenStream input, String[] commonWords)
Deprecated. Use CommonGramsFilter(Version, TokenStream, Set) instead.
Parameters:
input - TokenStream in filter chain.
commonWords - Words to be used in constructing bigrams.

@Deprecated public CommonGramsFilter(TokenStream input, String[] commonWords, boolean ignoreCase)
Deprecated. Use CommonGramsFilter(Version, TokenStream, Set, boolean) instead.
Parameters:
input - TokenStream in filter chain.
commonWords - Words to be used in constructing bigrams.
ignoreCase - Ignore case when constructing bigrams for common words.

Method Detail |
---|
@Deprecated public static CharArraySet makeCommonSet(String[] commonWords)
Equivalent to makeCommonSet(String[], boolean), passing false to ignoreCase.
Parameters:
commonWords - Array of common words which will be converted into the CharArraySet.

@Deprecated public static CharArraySet makeCommonSet(String[] commonWords, boolean ignoreCase)
Parameters:
commonWords - Array of common words which will be converted into the CharArraySet.
ignoreCase - If true, all words are lower cased first.

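The case handling documented for makeCommonSet and for the deprecated ignoreCase constructors can be sketched in plain Java. This is an illustration only: CaseAwareSet is a hypothetical stand-in for CharArraySet, and resolve is a hypothetical helper that mirrors the documented rule (an existing case-aware set is used as-is and ignoreCase is ignored; any other Set is copied using ignoreCase). None of these names come from Lucene or Solr.

```java
import java.util.AbstractSet;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Locale;
import java.util.Set;

final class CommonWordSetSketch {
    /** Hypothetical stand-in for CharArraySet: a set that owns its case sensitivity. */
    static final class CaseAwareSet extends AbstractSet<String> {
        private final Set<String> words = new HashSet<>();
        private final boolean ignoreCase;

        CaseAwareSet(Collection<?> source, boolean ignoreCase) {
            this.ignoreCase = ignoreCase;
            for (Object w : source) {
                // With ignoreCase, every word is lower cased before it is stored.
                words.add(normalize(String.valueOf(w)));
            }
        }

        private String normalize(String s) {
            return ignoreCase ? s.toLowerCase(Locale.ROOT) : s;
        }

        @Override
        public boolean contains(Object o) {
            return o != null && words.contains(normalize(o.toString()));
        }

        @Override
        public Iterator<String> iterator() {
            return words.iterator();
        }

        @Override
        public int size() {
            return words.size();
        }
    }

    /** Mirrors makeCommonSet(String[]): passes false to ignoreCase. */
    static CaseAwareSet makeCommonSet(String[] commonWords) {
        return makeCommonSet(commonWords, false);
    }

    /** Mirrors makeCommonSet(String[], boolean). */
    static CaseAwareSet makeCommonSet(String[] commonWords, boolean ignoreCase) {
        return new CaseAwareSet(Arrays.asList(commonWords), ignoreCase);
    }

    /** Mirrors the constructor rule: reuse a case-aware set, otherwise copy with ignoreCase. */
    static CaseAwareSet resolve(Set<?> commonWords, boolean ignoreCase) {
        if (commonWords instanceof CaseAwareSet) {
            return (CaseAwareSet) commonWords; // ignoreCase is ignored here
        }
        return new CaseAwareSet(commonWords, ignoreCase);
    }

    public static void main(String[] args) {
        CaseAwareSet set = makeCommonSet(new String[] {"The", "Of"}, true);
        System.out.println(set.contains("THE"));        // true
        System.out.println(resolve(set, false) == set); // true
    }
}
```

One consequence of this rule is that passing ignoreCase=false alongside a set that was built with ignoreCase=true still yields case-insensitive matching, because the set, not the flag, decides.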
public boolean incrementToken() throws IOException
Specified by:
incrementToken in class TokenStream
Throws:
IOException

public void reset() throws IOException
Overrides:
reset in class TokenFilter
Throws:
IOException