org.apache.commons.codec.language.bm
Class Rule

java.lang.Object
  extended by org.apache.commons.codec.language.bm.Rule

public class Rule
extends Object

A phoneme rule.

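Rules have a pattern, left context, right context, output phoneme, set of languages for which they apply and a logical flag indicating if all languages must be in play. A rule matches if the pattern matches at the current position, the string up until the beginning of the pattern matches the left context, and the string from the end of the pattern matches the right context. In addition, either logical is ALL and all languages are in scope, or logical is any other value and at least one language is in scope.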

Rules are typically generated by parsing rules resources. In normal use, there will be no need for the user to explicitly construct their own.

Rules are immutable and thread-safe.

Rules resources

Rules are typically loaded from resource files. These are UTF-8 encoded text files. They are systematically named following the pattern:

org/apache/commons/codec/language/bm/${NameType#getName}_${RuleType#getName}_${language}.txt
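
As a rough illustration of the naming pattern above, the following sketch assembles a resource name from its three components. It is not the library's private createResourceName method; the class name RuleResourceNames and the example values "gen", "approx" and "any" are assumptions made for illustration.

```java
// Sketch only: mirrors the documented naming pattern, not the library's own code.
final class RuleResourceNames {
    // Package path taken from the naming pattern in the documentation.
    static final String BASE = "org/apache/commons/codec/language/bm/";

    /** Builds "<base><nameType>_<ruleType>_<language>.txt" from the three short names. */
    static String resourceName(String nameTypeName, String ruleTypeName, String language) {
        return String.format("%s%s_%s_%s.txt", BASE, nameTypeName, ruleTypeName, language);
    }

    public static void main(String[] args) {
        // "gen", "approx" and "any" are assumed example values for the three components.
        System.out.println(resourceName("gen", "approx", "any"));
    }
}
```

A resource with such a name would then typically be opened from the classpath and read as UTF-8 text.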

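The format of these resources is the following: each rule is a row of four whitespace-separated, double-quoted strings, interpreted as pattern, left context, right context and phoneme. Any occurrence of '//' causes the rest of that line to be discarded as an end-of-line comment. A line starting with '/*' starts a multi-line comment, which skips all content until a line ending in '*/' is found. Blank lines are skipped.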
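
A minimal sketch of reading such a resource follows. It assumes rows of four whitespace-separated, double-quoted strings (pattern, left context, right context, phoneme), '//' end-of-line comments, and multi-line comments that start on a line beginning with '/*' and stop on a line ending in '*/'. The class RuleLineParser and its methods are hypothetical, and the sketch ignores include directives and language-qualified phonemes.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: a simplified reader for the rule-row format, not the library's parseRules.
final class RuleLineParser {
    /** Removes one leading and one trailing double quote, if present. */
    static String stripQuotes(String s) {
        if (s.startsWith("\"")) {
            s = s.substring(1);
        }
        if (s.endsWith("\"")) {
            s = s.substring(0, s.length() - 1);
        }
        return s;
    }

    /** Returns the four columns of a rule row, or null for a blank or comment-only line. */
    static String[] parseLine(String rawLine) {
        String line = rawLine;
        int comment = line.indexOf("//");
        if (comment >= 0) {
            line = line.substring(0, comment); // drop the end-of-line comment
        }
        line = line.trim();
        if (line.isEmpty()) {
            return null;
        }
        String[] parts = line.split("\\s+");
        if (parts.length != 4) {
            throw new IllegalArgumentException("Expected 4 columns but found " + parts.length);
        }
        for (int i = 0; i < parts.length; i++) {
            parts[i] = stripQuotes(parts[i]);
        }
        return parts;
    }

    /** Parses all lines, skipping blank lines and multi-line comments. */
    static List<String[]> parseAll(List<String> lines) {
        List<String[]> rows = new ArrayList<>();
        boolean inComment = false;
        for (String raw : lines) {
            if (inComment) {
                // Comment mode ends only on a line that ends in "*/".
                if (raw.endsWith("*/")) {
                    inComment = false;
                }
                continue;
            }
            if (raw.startsWith("/*")) {
                inComment = true; // the closing "*/" is looked for on later lines
                continue;
            }
            String[] row = parseLine(raw);
            if (row != null) {
                rows.add(row);
            }
        }
        return rows;
    }
}
```

For example, the line `"sch" "" "[ei]" "S" // note` would yield the columns sch, an empty left context, [ei] and S.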

Since:
1.6
Version:
$Id: Rule.java 1435550 2013-01-19 14:09:52Z tn $

Nested Class Summary
static class Rule.Phoneme
           
static interface Rule.PhonemeExpr
           
static class Rule.PhonemeList
           
static interface Rule.RPattern
          A minimal wrapper around the functionality of Pattern that we use, to allow for alternate implementations.
 
Field Summary
static String ALL
           
static Rule.RPattern ALL_STRINGS_RMATCHER
           
private static String DOUBLE_QUOTE
           
private static String HASH_INCLUDE
           
private  Rule.RPattern lContext
           
private  String pattern
           
private  Rule.PhonemeExpr phoneme
           
private  Rule.RPattern rContext
           
private static Map<NameType,Map<RuleType,Map<String,List<Rule>>>> RULES
           
 
Constructor Summary
Rule(String pattern, String lContext, String rContext, Rule.PhonemeExpr phoneme)
          Creates a new rule.
 
Method Summary
private static boolean contains(CharSequence chars, char input)
           
private static String createResourceName(NameType nameType, RuleType rt, String lang)
           
private static Scanner createScanner(NameType nameType, RuleType rt, String lang)
           
private static Scanner createScanner(String lang)
           
private static boolean endsWith(CharSequence input, CharSequence suffix)
           
static List<Rule> getInstance(NameType nameType, RuleType rt, Languages.LanguageSet langs)
          Gets rules for a combination of name type, rule type and languages.
static List<Rule> getInstance(NameType nameType, RuleType rt, String lang)
          Gets rules for a combination of name type, rule type and a single language.
 Rule.RPattern getLContext()
          Gets the left context.
 String getPattern()
          Gets the pattern.
 Rule.PhonemeExpr getPhoneme()
          Gets the phoneme.
 Rule.RPattern getRContext()
          Gets the right context.
private static Rule.Phoneme parsePhoneme(String ph)
           
private static Rule.PhonemeExpr parsePhonemeExpr(String ph)
           
private static List<Rule> parseRules(Scanner scanner, String location)
           
private static Rule.RPattern pattern(String regex)
          Attempts to compile the regex into direct string ops, falling back to Pattern and Matcher in the worst case.
 boolean patternAndContextMatches(CharSequence input, int i)
          Decides if the pattern and context match the input starting at a position.
private static boolean startsWith(CharSequence input, CharSequence prefix)
           
private static String stripQuotes(String str)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

ALL_STRINGS_RMATCHER

public static final Rule.RPattern ALL_STRINGS_RMATCHER

ALL

public static final String ALL
See Also:
Constant Field Values

DOUBLE_QUOTE

private static final String DOUBLE_QUOTE
See Also:
Constant Field Values

HASH_INCLUDE

private static final String HASH_INCLUDE
See Also:
Constant Field Values

RULES

private static final Map<NameType,Map<RuleType,Map<String,List<Rule>>>> RULES

lContext

private final Rule.RPattern lContext

pattern

private final String pattern

phoneme

private final Rule.PhonemeExpr phoneme

rContext

private final Rule.RPattern rContext

Constructor Detail

Rule

public Rule(String pattern,
            String lContext,
            String rContext,
            Rule.PhonemeExpr phoneme)
Creates a new rule.

Parameters:
pattern - the pattern
lContext - the left context
rContext - the right context
phoneme - the resulting phoneme

Method Detail

contains

private static boolean contains(CharSequence chars,
                                char input)

createResourceName

private static String createResourceName(NameType nameType,
                                         RuleType rt,
                                         String lang)

createScanner

private static Scanner createScanner(NameType nameType,
                                     RuleType rt,
                                     String lang)

createScanner

private static Scanner createScanner(String lang)

endsWith

private static boolean endsWith(CharSequence input,
                                CharSequence suffix)

getInstance

public static List<Rule> getInstance(NameType nameType,
                                     RuleType rt,
                                     Languages.LanguageSet langs)
Gets rules for a combination of name type, rule type and languages.

Parameters:
nameType - the NameType to consider
rt - the RuleType to consider
langs - the set of languages to consider
Returns:
a list of Rules that apply

getInstance

public static List<Rule> getInstance(NameType nameType,
                                     RuleType rt,
                                     String lang)
Gets rules for a combination of name type, rule type and a single language.

Parameters:
nameType - the NameType to consider
rt - the RuleType to consider
lang - the language to consider
Returns:
a list of rules for a combination of name type, rule type and a single language
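
The lookup by name type, rule type and language can be pictured with a nested map of the same shape as the RULES field. The sketch below is an assumption-laden illustration, not the library's code: NameKind and RuleKind are hypothetical stand-ins for NameType and RuleType, rules are represented as plain strings, and throwing IllegalArgumentException for a missing entry is an assumed behaviour.

```java
import java.util.Collections;
import java.util.EnumMap;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch only: a three-level lookup table shaped like the RULES field.
final class RuleTableSketch {
    // Hypothetical stand-ins for NameType and RuleType.
    enum NameKind { ASHKENAZI, GENERIC, SEPHARDIC }
    enum RuleKind { APPROX, EXACT, RULES }

    // name type -> rule type -> language -> rules (plain strings in this sketch)
    private static final Map<NameKind, Map<RuleKind, Map<String, List<String>>>> TABLE =
            new EnumMap<>(NameKind.class);

    /** Stores an unmodifiable view of the rules under the given three keys. */
    static void register(NameKind nameKind, RuleKind ruleKind, String lang, List<String> rules) {
        TABLE.computeIfAbsent(nameKind, k -> new EnumMap<>(RuleKind.class))
             .computeIfAbsent(ruleKind, k -> new HashMap<>())
             .put(lang, Collections.unmodifiableList(rules));
    }

    /** Looks up the rules for one combination; a missing entry is reported as an error (assumed). */
    static List<String> getInstance(NameKind nameKind, RuleKind ruleKind, String lang) {
        Map<RuleKind, Map<String, List<String>>> byRule = TABLE.get(nameKind);
        Map<String, List<String>> byLang = byRule == null ? null : byRule.get(ruleKind);
        List<String> rules = byLang == null ? null : byLang.get(lang);
        if (rules == null) {
            throw new IllegalArgumentException(
                    "No rules found for " + nameKind + ", " + ruleKind + ", " + lang);
        }
        return rules;
    }
}
```

With the library on the classpath, the corresponding real call would be along the lines of Rule.getInstance(NameType.GENERIC, RuleType.APPROX, "any").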

parsePhoneme

private static Rule.Phoneme parsePhoneme(String ph)

parsePhonemeExpr

private static Rule.PhonemeExpr parsePhonemeExpr(String ph)

parseRules

private static List<Rule> parseRules(Scanner scanner,
                                     String location)

pattern

private static Rule.RPattern pattern(String regex)
Attempts to compile the regex into direct string ops, falling back to Pattern and Matcher in the worst case.

Parameters:
regex - the regular expression to compile
Returns:
an RPattern that will match this regex
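
To illustrate the idea of compiling a regex into direct string operations, here is a reduced sketch. It only recognises regexes whose body is a plain literal, optionally anchored with ^ and/or $, and falls back to java.util.regex.Pattern for everything else. The RMatcher interface and the compile method are hypothetical names, and the real method may recognise more cases than this sketch does.

```java
import java.util.regex.Pattern;

// Sketch only: literal-only fast paths with a regex fallback.
final class RPatternSketch {
    /** Hypothetical stand-in for Rule.RPattern: a single match test. */
    interface RMatcher {
        boolean isMatch(CharSequence input);
    }

    static RMatcher compile(String regex) {
        boolean anchoredStart = regex.startsWith("^");
        boolean anchoredEnd = regex.endsWith("$");
        String content = regex.substring(
                anchoredStart ? 1 : 0,
                anchoredEnd ? regex.length() - 1 : regex.length());

        // The body counts as literal if it holds no regex metacharacters.
        boolean literal = content.chars().noneMatch(c -> "\\[](){}.*+?|^$".indexOf(c) >= 0);

        if (literal) {
            if (anchoredStart && anchoredEnd) {
                return input -> content.contentEquals(input);         // ^abc$
            }
            if (anchoredStart) {
                return input -> input.toString().startsWith(content); // ^abc
            }
            if (anchoredEnd) {
                return input -> input.toString().endsWith(content);   // abc$
            }
            return input -> input.toString().contains(content);       // abc
        }

        // Worst case: keep the regex and search with a Matcher.
        Pattern pattern = Pattern.compile(regex);
        return input -> pattern.matcher(input).find();
    }
}
```

The fast paths are meant to agree with the regex fallback for inputs without line terminators, while avoiding the cost of a Matcher for the common anchored-literal contexts.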

startsWith

private static boolean startsWith(CharSequence input,
                                  CharSequence prefix)

stripQuotes

private static String stripQuotes(String str)

getLContext

public Rule.RPattern getLContext()
Gets the left context. This is a regular expression that must match to the left of the pattern.

Returns:
the left context as an RPattern

getPattern

public String getPattern()
Gets the pattern. This is a string-literal that must exactly match.

Returns:
the pattern

getPhoneme

public Rule.PhonemeExpr getPhoneme()
Gets the phoneme. If the rule matches, this is the phoneme associated with the pattern match.

Returns:
the phoneme

getRContext

public Rule.RPattern getRContext()
Gets the right context. This is a regular expression that must match to the right of the pattern.

Returns:
the right context as an RPattern

patternAndContextMatches

public boolean patternAndContextMatches(CharSequence input,
                                        int i)
Decides if the pattern and context match the input starting at a position. It is a match if the lContext matches input up to i, pattern matches at i and rContext matches from the end of the match of pattern to the end of input.

Parameters:
input - the input character sequence
i - the int position within the input
Returns:
true if the pattern and left/right context match, false otherwise
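
The matching procedure described above can be sketched as follows. This is an illustration under assumptions, not the class's actual source: ContextRuleSketch is a hypothetical name, the left context is assumed to be anchored at its end with $ and the right context at its start with ^, and both are checked with java.util.regex.Pattern rather than Rule.RPattern.

```java
import java.util.regex.Pattern;

// Sketch only: literal pattern plus regex contexts on either side.
final class ContextRuleSketch {
    private final String pattern;
    private final Pattern lContext; // must match the text ending just before position i
    private final Pattern rContext; // must match the text starting right after the pattern

    ContextRuleSketch(String pattern, String lContext, String rContext) {
        this.pattern = pattern;
        this.lContext = Pattern.compile(lContext + "$"); // assumed anchoring
        this.rContext = Pattern.compile("^" + rContext); // assumed anchoring
    }

    boolean patternAndContextMatches(CharSequence input, int i) {
        if (i < 0) {
            throw new IndexOutOfBoundsException("Cannot match at a negative index");
        }
        int end = i + pattern.length();
        if (end > input.length()) {
            return false; // not enough input left for the pattern
        }
        // The pattern is a string literal that must match exactly at position i.
        if (!pattern.contentEquals(input.subSequence(i, end))) {
            return false;
        }
        // The right context is tested against everything after the pattern.
        if (!rContext.matcher(input.subSequence(end, input.length())).find()) {
            return false;
        }
        // The left context is tested against everything before position i.
        return lContext.matcher(input.subSequence(0, i)).find();
    }
}
```

For instance, a rule with pattern "sch", an empty left context and right context "[ei]" would match "schiller" at position 0 but not "schule", since the character following "sch" must be e or i.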


commons-codec version 1.8 - Copyright © 2002-2013 - Apache Software Foundation