Package translate :: Package lang :: Module km :: Class km

Class km


This class represents Khmer.

Instance Methods

Inherited from common.Common: __deepcopy__, __repr__, alter_length, length_difference

Inherited from object: __delattr__, __getattribute__, __hash__, __init__, __reduce__, __reduce_ex__, __setattr__, __str__

Class Methods

Inherited from common.Common: capsstart, character_iter, characters, punctranslate, sentence_iter, sentences, word_iter, words

Static Methods

Inherited from common.Common: __new__

Class Variables
  khmerpunc = u'។៕៖៘'
These marks are only used for Khmer.
  punctuation = u'.,;:!?-@#$%^*_()[]{}/\'`"<>‘’‛“”„‟′″‴‵‶‷‹›«»…±...
Many types of punctuation are included here, since this is only meant to determine whether something is punctuation.
  sentenceend = u'!?…។៕៘'
These marks can indicate a sentence end.
  sentencere = re.compile(r'(?sx).*?[!\?\u2026\u17d4\u17d5\u17d8...
  puncdict = {u'!': u' !', u'.': u' ។', u':': u' ៖', u'?': u' ?'}
A dictionary of punctuation transformation rules that can be used by punctranslate().
  ignoretests = ['startcaps', 'simplecaps']
List of pofilter tests for this language that must be ignored.

Inherited from common.Common: CJKpunc, checker, code, commonpunc, ethiopicpunc, fullname, indicpunc, invertedpunc, listseperator, miscpunc, nplurals, pluralequation, quotes, rtlpunc, validaccel, validdoublewords

Inherited from common.Common (private): _languages
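
The puncdict rules listed above map a source punctuation mark to its Khmer-style replacement. The following is a minimal sketch of how such a dictionary could be applied to a string. It is not the implementation of punctranslate(), which may handle more cases; the helper name apply_puncdict is hypothetical, and the dictionary is copied from this page rather than imported from translate.lang.km.

```python
# Rules copied from the puncdict class variable documented above.
PUNCDICT = {"!": " !", ".": " ។", ":": " ៖", "?": " ?"}


def apply_puncdict(text, puncdict=PUNCDICT):
    """Replace each character that has a rule in puncdict (hypothetical helper).

    str.maketrans accepts single-character keys mapped to replacement
    strings of any length, so the whole substitution runs in one pass
    and a replacement is never itself re-translated.
    """
    return text.translate(str.maketrans(puncdict))


print(apply_puncdict("Save the file?"))
print(apply_puncdict("Done."))
```

A single-pass translation is used here so that the order of the dictionary entries does not matter.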

Properties

Inherited from object: __class__

Class Variable Details

punctuation

Many types of punctuation are included here, since this is only meant to determine whether something is punctuation. This should also catch punctuation from some languages that are not represented by their own modules. Most languages won't need to override this.

Value:
u'.,;:!?-@#$%^*_()[]{}/\'`"<>‘’‛“”„‟′″‴‵‶‷‹›«»…±°¹²³·©®×£¥€។៕៖៘'
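
Since the value is only meant to answer whether a character is punctuation, a membership test is enough. The sketch below copies the two strings from this page into plain Python 3 constants; the constant names and the helper is_punctuation are hypothetical and not part of the class.

```python
# Values copied from the punctuation and khmerpunc class variables above.
PUNCTUATION = '.,;:!?-@#$%^*_()[]{}/\'`"<>‘’‛“”„‟′″‴‵‶‷‹›«»…±°¹²³·©®×£¥€។៕៖៘'
KHMERPUNC = "។៕៖៘"


def is_punctuation(char):
    """Return True if char appears in the punctuation string (hypothetical helper)."""
    return char in PUNCTUATION


# Every Khmer-specific mark is also part of the general punctuation string.
print(all(is_punctuation(mark) for mark in KHMERPUNC))
print(is_punctuation("ក"))  # a Khmer consonant, not punctuation
```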

sentencere

Value:
re.compile(r'(?sx).*?[!\?\u2026\u17d4\u17d5\u17d8]\s+(?=[^a-z\d])')
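
The pattern matches lazily up to one of the sentence-ending marks, requires whitespace after it, and then looks ahead for a character that is not a lowercase ASCII letter or a digit. The sketch below shows one way such a pattern could drive sentence splitting in Python 3, where the re module understands \uXXXX escapes. The function split_sentences is hypothetical and only assumed to resemble what the inherited sentence_iter does; the pattern is copied from the value shown above.

```python
import re

# Pattern copied from the sentencere value documented above.
SENTENCERE = re.compile(r'(?sx).*?[!\?\u2026\u17d4\u17d5\u17d8]\s+(?=[^a-z\d])')


def split_sentences(text):
    """Split text into sentences using SENTENCERE (hypothetical helper).

    Each match ends just after the whitespace that follows a
    sentence-ending mark; whatever is left after the last match is
    returned as the final sentence.
    """
    sentences = []
    last_end = 0
    for match in SENTENCERE.finditer(text):
        last_end = match.end()
        sentence = match.group().strip()
        if sentence:
            sentences.append(sentence)
    remainder = text[last_end:].strip()
    if remainder:
        sentences.append(remainder)
    return sentences


first, second, third = "សួស្តី។", "អ្នកសុខសប្បាយទេ?", "បាទ!"
print(split_sentences(" ".join([first, second, third])))
```

Because of the lookahead, a mark followed by a lowercase ASCII letter does not end a sentence, so a string such as "Stop! now" is returned whole.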