Package translate :: Package lang :: Module team

Module team

Module to guess the ISO language code from the 'Language-Team' entry in the header of a Gettext PO file.

Functions
 
_regex_guesser(prefilter, regex, string, postfilter=None)
Use regular expressions to extract the language team
 
_nofilter(text)
Return the supplied text unchanged
 
_lower(text)
Convert the supplied text to lowercase
 
_snippet_guesser(snippets_dict, string, filter_=_nofilter)
Guess the language based on a snippet of text in the language team string.
 
guess_language(team_string)
Guesses the language of a PO file based on the Language-Team entry
Variables
  LANG_TEAM_REGEX = (('@li.org', '([a-z_A-Z]{2,})@li.org', ['LL'...
Data for regular expression based extraction.
  LANG_TEAM_CONTACT_SNIPPETS = {'af': ('i18n@af.org.za', 'Petri ...
Language codes with snippets of contact information that can be used to uniquely identify the language
  LANG_TEAM_LANGUAGE_SNIPPETS = {'af': ('Afrikaans'), 'am': ('Am...
Language codes with snippets of language names, including English names, native spellings and variants, that can be used to uniquely identify the language

Imports: re, accepts, returns, IsOneOf, String


Function Details

_regex_guesser(prefilter, regex, string, postfilter=None)

Use regular expressions to extract the language team

Parameters:
  • prefilter - simple filter to apply before attempting the regex
  • regex - regular expression with one group that will contain the language code
  • string - the language team string that should be examined
  • postfilter - filter used to reject potential matches after the regex has retrieved them
Returns:
ISO language code for the found language
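
The parameter descriptions above suggest a three-step flow: a cheap substring check, a regex search, then rejection of unwanted matches. The following is a minimal Python 3 sketch of that flow, not the module's actual code. The name regex_guess_sketch is hypothetical, and treating prefilter as a plain substring and postfilter as a list of exact values to reject are assumptions based on the LANG_TEAM_REGEX data shown on this page.

```python
import re


def regex_guess_sketch(prefilter, regex, string, postfilter=None):
    """Hypothetical stand-in for _regex_guesser; not the library's code."""
    # Assumption: the prefilter is a plain substring test that avoids
    # running the regex on strings that cannot match.
    if prefilter not in string:
        return None
    match = re.search(regex, string)
    if match is None:
        return None
    code = match.group(1)
    # Assumption: the postfilter lists exact values to reject, for example
    # the 'LL' placeholder left in an unedited Gettext header template.
    if postfilter and code in postfilter:
        return None
    return code


pattern = r"([a-z_A-Z]{2,})@li.org"
rejects = ["LL", "XX", "TEAM"]
print(regex_guess_sketch("@li.org", pattern, "Afrikaans <af@li.org>", rejects))  # af
print(regex_guess_sketch("@li.org", pattern, "LANGUAGE <LL@li.org>", rejects))   # None
```

The second call returns None because the captured group is one of the rejected placeholder values.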

_snippet_guesser(snippets_dict, string, filter_=_nofilter)

Guess the language based on a snippet of text in the language team string.

Parameters:
  • snippets_dict - A dict of snippets that can be used to identify a language in the format {'lang': ('snippet1', 'snippet2'), 'lang2'...}
  • string - The language string to be analysed
  • filter_ - a function to be applied to the string and snippets before examination
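
As an illustration of the parameters above, here is a minimal Python 3 sketch of snippet-based matching. It is not the module's implementation: snippet_guess_sketch, nofilter and lower are hypothetical names, and returning the first language whose snippet occurs as a substring is an assumption. The sketch also accepts a bare string where a tuple is expected, because single-snippet entries are printed on this page without a trailing comma.

```python
def nofilter(text):
    """Return the text unchanged (mirrors the documented _nofilter)."""
    return text


def lower(text):
    """Lowercase the text (mirrors the documented _lower)."""
    return text.lower()


def snippet_guess_sketch(snippets_dict, string, filter_=nofilter):
    """Hypothetical stand-in for _snippet_guesser; not the library's code."""
    # The filter is applied to both the team string and each snippet,
    # as the filter_ parameter description states.
    string = filter_(string)
    for lang, snippets in snippets_dict.items():
        # Assumption: tolerate a bare string as a one-element collection.
        if isinstance(snippets, str):
            snippets = (snippets,)
        for snippet in snippets:
            if filter_(snippet) in string:
                return lang
    return None


contacts = {
    "af": ("i18n@af.org.za", "Petri Jooste"),
    "ar": ("arabeyes.org", "Arabeyes"),
}
print(snippet_guess_sketch(contacts, "Arabic <doc@arabeyes.org>"))           # ar
print(snippet_guess_sketch({"af": "Afrikaans"}, "AFRIKAANS team", lower))    # af
```

Passing a lowercasing filter makes the comparison case-insensitive, which is likely to matter more for language names than for e-mail addresses.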

guess_language(team_string)

Guesses the language of a PO file based on the Language-Team entry

Decorators:
  • @accepts(unicode)
  • @returns(IsOneOf(String, type(None)))

Variables Details

LANG_TEAM_REGEX

Data for regular expression based extraction. The fields are: prefilter information, regex with a single group that contains the language code, and postfilter.

Value:
(('@li.org', '([a-z_A-Z]{2,})@li.org', ['LL', 'XX', 'TEAM']),
 ('translation-team',
  'translation-team-([a-z_A-Z]+)@lists.sourceforge.net',
  None),
 ('fedora-trans', 'fedora-trans-([a-z_A-Z]+)@redhat.com', ['list']),
 ('ubuntu-l10n', 'ubuntu-l10n-([a-z_A-Z]+)@lists.ubuntu.com', None),
 ('translate-discuss',
  'translate-discuss-([a-z_A-Z]+)@lists.sourceforge.net',
...
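
To show how a table of (prefilter, regex, postfilter) entries might be consumed, the Python 3 sketch below walks the four complete entries visible in the listing above and returns the first accepted match. The names TABLE_EXCERPT and first_regex_match are hypothetical, the order of evaluation is an assumption, and the elided remainder of the table is deliberately not reconstructed.

```python
import re

# Only the four complete entries shown in the listing; the rest of the
# table is truncated in the source and is left out here.
TABLE_EXCERPT = (
    ("@li.org", r"([a-z_A-Z]{2,})@li.org", ["LL", "XX", "TEAM"]),
    ("translation-team",
     r"translation-team-([a-z_A-Z]+)@lists.sourceforge.net",
     None),
    ("fedora-trans", r"fedora-trans-([a-z_A-Z]+)@redhat.com", ["list"]),
    ("ubuntu-l10n", r"ubuntu-l10n-([a-z_A-Z]+)@lists.ubuntu.com", None),
)


def first_regex_match(team_string, table=TABLE_EXCERPT):
    """Return the first captured code that survives its postfilter."""
    for prefilter, regex, postfilter in table:
        # Skip entries whose prefilter substring is absent.
        if prefilter not in team_string:
            continue
        match = re.search(regex, team_string)
        if match and not (postfilter and match.group(1) in postfilter):
            return match.group(1)
    return None


print(first_regex_match("Zulu <ubuntu-l10n-zu@lists.ubuntu.com>"))  # zu
print(first_regex_match("Fedora <fedora-trans-list@redhat.com>"))   # None
```

The second address matches the fedora-trans pattern, but the captured value 'list' is in that entry's postfilter, so it is rejected.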

LANG_TEAM_CONTACT_SNIPPETS

Language codes with snippets of contact information that can be used to uniquely identify the language

Value:
{'af': ('i18n@af.org.za', 'Petri Jooste'),
 'am': ('@geez.org'),
 'ar': ('arabeyes.org', 'Arabeyes'),
 'as': ('assam@mm.assam-glug.org'),
 'ast': ('@softastur.org',
         'launchpad.net/~ubuntu-l10n-ast',
         'softast-xeneral@lists.sourceforge.net',
         'Softastur'),
...

LANG_TEAM_LANGUAGE_SNIPPETS

Language codes with snippets of language names, including English names, native spellings and variants, that can be used to uniquely identify the language

Value:
{'af': ('Afrikaans'),
 'am': ('Amharic'),
 'ang': ('Old English'),
 'ar': ('Arabic'),
 'as': ('Assamese'),
 'ast': ('Asturian'),
 'az': ('Azerbaijani', u'Azərbaycan'),
 'be': ('Belarusian', 'Belorussian'),
...