Source Code for Module translate.storage.properties

#!/usr/bin/env python
# -*- coding: utf-8 -*-
#
# Copyright 2004-2006 Zuza Software Foundation
#
# This file is part of translate.
#
# translate is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# translate is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with translate; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA

"""Classes that hold units of .properties, and similar, files that are used in
   translating Java, Mozilla, MacOS and other software.

   The L{propfile} class is a monolingual class with L{propunit} providing unit
   level access.

   The .properties store has become a general key value pair class with
   L{Dialect} providing the ability to change the behaviour of the parsing
   and handling of the various dialects.

   Currently we support::
     * Java .properties
     * Mozilla .properties
     * Adobe Flex files
     * MacOS X .strings files
     * Skype .lang files


   Dialects
   ========
   The following provides references and descriptions of the various dialects supported::

   Java
   ----
   Java .properties are supported completely except for the ability to drop
   pairs that are not translated.

   The following U{.properties file
   description<http://java.sun.com/j2se/1.4.2/docs/api/java/util/Properties.html#load(java.io.InputStream)>}
   and U{example <http://www.exampledepot.com/egs/java.util/Props.html>} give
   some good references to the .properties specification.

   Properties files may also hold Java
   U{MessageFormat<http://java.sun.com/j2se/1.4.2/docs/api/java/text/MessageFormat.html>}
   messages.  No special handling is provided in this storage class for
   MessageFormat, but this may be implemented in future.

   All delimiter types, comments, line continuations and space handling in
   delimiters are supported.

   Mozilla
   -------
   Mozilla files use '=' as a delimiter, are UTF-8 encoded and thus don't need \\u
   escaping.  Any \\U values will be converted to correct Unicode characters.

   Strings
   -------
   Mac OS X strings files are implemented using
   U{these<http://developer.apple.com/mac/library/documentation/MacOSX/Conceptual/BPInternational/Articles/StringsFiles.html>}
   U{two<http://developer.apple.com/mac/library/documentation/Cocoa/Conceptual/LoadingResources/Strings/Strings.html>}
   articles as references.

   Flex
   ----
   Adobe Flex files seem to be normal .properties files but in UTF-8 just like
   Mozilla files. This
   U{page<http://livedocs.adobe.com/flex/3/html/help.html?content=l10n_3.html>}
   provides the information used to implement the dialect.

   Skype
   -----
   Skype .lang files seem to be UTF-16 encoded .properties files.

   Implementation
   ==============

   A simple summary of what is permissible follows.

   Comments supported::
     # a comment
     ! a comment
     // a comment (only at the beginning of a line)
     /* a comment (not across multiple lines) */

   Name and Value pairs::
     # Delimiters
     key = value
     key : value
     key value

     # Space in key and around value
     \ key\ = \ value

     # Note that the b and c are escaped for epydoc rendering
     b = a string with escape sequences \\t \\n \\r \\\\ \\" \\' \\ (space) \u0123
     c = a string with a continuation line \\
         continuation line

     # Special cases
     # key with no value
     key
     # value no key (extractable in prop2po but not mergeable in po2prop)
     =value

     # .strings specific
     "key" = "value";
"""

from translate.storage import base
from translate.misc import quote
from translate.misc.typecheck import accepts, returns, IsOneOf
from translate.lang import data
import re
import warnings

# the rstripeols convert dos <-> unix nicely as well
# output will be appropriate for the platform

eol = "\n"


@accepts(unicode, [unicode])
@returns(IsOneOf(type(None), unicode), int)
def _find_delimiter(line, delimiters):
    """Find the type and position of the delimiter in a property line.

    Property files can be delimited by "=", ":" or whitespace (space for now).
    We find the position of each delimiter, then find the one that appears
    first.

    @param line: A properties line
    @type line: str
    @param delimiters: valid delimiters
    @type delimiters: list
    @return: delimiter character and offset within L{line}
    @rtype: Tuple (delimiter char, Offset Integer)
    """
    delimiter_dict = {}
    for delimiter in delimiters:
        delimiter_dict[delimiter] = -1
    delimiters = delimiter_dict
    # Find the position of each delimiter type
    for delimiter, pos in delimiters.iteritems():
        prewhitespace = len(line) - len(line.lstrip())
        pos = line.find(delimiter, prewhitespace)
        while pos != -1:
            if delimiters[delimiter] == -1 and line[pos-1] != u"\\":
                delimiters[delimiter] = pos
                break
            pos = line.find(delimiter, pos + 1)
    # Find the first delimiter
    mindelimiter = None
    minpos = -1
    for delimiter, pos in delimiters.iteritems():
        if pos == -1 or delimiter == u" ":
            continue
        if minpos == -1 or pos < minpos:
            minpos = pos
            mindelimiter = delimiter
    if mindelimiter is None and delimiters.get(u" ", -1) != -1:
        # Use space delimiter if we found nothing else
        return (u" ", delimiters[" "])
    if mindelimiter is not None and u" " in delimiters and delimiters[u" "] < delimiters[mindelimiter]:
        # If space delimiter occurs earlier than ":" or "=" then it is the
        # delimiter only if there are non-whitespace characters between it and
        # the other detected delimiter.
        if len(line[delimiters[u" "]:delimiters[mindelimiter]].strip()) > 0:
            return (u" ", delimiters[u" "])
    return (mindelimiter, minpos)

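The delimiter search can be tried outside the toolkit with the following minimal sketch. It is written for Python 3, does not import anything from translate, and the name find_first_delimiter is hypothetical. It follows the same idea (skip backslash-escaped delimiters, prefer "=" or ":" over a space unless key text sits between them) but is not guaranteed to match _find_delimiter in every corner case.

```python
def find_first_delimiter(line, delimiters=("=", ":", " ")):
    """Return (delimiter, position), or (None, -1) when no delimiter is found.

    Illustrative re-implementation only; find_first_delimiter is not part of
    the translate toolkit.
    """
    start = len(line) - len(line.lstrip())
    found = {}
    for delimiter in delimiters:
        pos = line.find(delimiter, start)
        # Skip any occurrence that is escaped by a preceding backslash.
        while pos > 0 and line[pos - 1] == "\\":
            pos = line.find(delimiter, pos + 1)
        found[delimiter] = pos

    space = found.get(" ", -1)
    # Earliest delimiter other than the space, if there is one.
    candidates = [(pos, d) for d, pos in found.items() if d != " " and pos != -1]
    if not candidates:
        return (" ", space) if space != -1 else (None, -1)

    pos, delimiter = min(candidates)
    # The space wins only when non-whitespace text lies between it and the
    # other delimiter, e.g. "a b=c" splits into key "a" and value "b=c".
    if space != -1 and space < pos and line[space:pos].strip():
        return (" ", space)
    return (delimiter, pos)


print(find_first_delimiter("key = value"))  # ('=', 4)
print(find_first_delimiter("a b=c"))        # (' ', 1)
```

In "key = value" the space at offset 3 is ignored because only whitespace separates it from the "=", which is why the "=" is reported as the delimiter.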

def find_delimeter(line):
    """Misspelled name that is kept around in case someone relies on it.

    Deprecated."""
    warnings.warn("deprecated use Dialect.find_delimiter instead", DeprecationWarning)
    return _find_delimiter(line, DialectJava.delimiters)


@accepts(unicode)
@returns(bool)
def is_line_continuation(line):
    """Determine whether L{line} has a line continuation marker.

    A line in a .properties file can be terminated with a backslash (\\)
    indicating that the 'value' continues on the next line.  Continuation is
    only valid if there is an odd number of backslashes (an even number
    would result in a set of N/2 slashes, not an escape)

    @param line: A properties line
    @type line: str
    @return: Does L{line} end with a line continuation
    @rtype: Boolean
    """
    pos = -1
    count = 0
    if len(line) == 0:
        return False
    # Count the slashes from the end of the line. Ensure we don't
    # go into infinite loop.
    while len(line) >= -pos and line[pos:][0] == "\\":
        pos -= 1
        count += 1
    return (count % 2) == 1  # Odd is a line continuation, even is not

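The odd/even backslash rule can be checked in isolation. The sketch below is plain Python 3 with a hypothetical helper name, ends_with_continuation; it counts trailing backslashes by stripping them instead of walking an index, which is assumed to give the same answer as is_line_continuation.

```python
def ends_with_continuation(line):
    """True when the line ends in an odd number of backslashes.

    Hypothetical stand-alone helper, not part of the translate toolkit.
    """
    trailing = len(line) - len(line.rstrip("\\"))
    return trailing % 2 == 1


print(ends_with_continuation("value \\"))    # True: one backslash
print(ends_with_continuation("value \\\\"))  # False: an escaped backslash
```

An even count means every backslash is paired into an escaped literal backslash, so nothing is left over to escape the end of the line.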

@accepts(unicode)
@returns(unicode)
def _key_strip(key):
    """Clean up whitespace found around a key

    @param key: A properties key
    @type key: str
    @return: Key without any unneeded whitespace
    @rtype: str
    """
    newkey = key.rstrip()
    # If the line now ends in \ we put back the whitespace that was escaped
    if newkey[-1:] == "\\":
        newkey += key[len(newkey):len(newkey)+1]
    return newkey.lstrip()

dialects = {}
default_dialect = "java"


def register_dialect(dialect):
    dialects[dialect.name] = dialect


def get_dialect(dialect=default_dialect):
    return dialects.get(dialect)


class Dialect(object):
    """Settings for the various behaviours in key=value files."""
    name = None
    default_encoding = 'iso-8859-1'
    delimiters = None
    pair_terminator = u""
    key_wrap_char = u""
    value_wrap_char = u""

    def encode(cls, string):
        """Encode the string"""
        return quote.javapropertiesencode(string or u"")
    encode = classmethod(encode)

    def find_delimiter(cls, line):
        """Find the delimiter"""
        return _find_delimiter(line, cls.delimiters)
    find_delimiter = classmethod(find_delimiter)

    def key_strip(cls, key):
        """Strip unneeded characters from the key"""
        return _key_strip(key)
    key_strip = classmethod(key_strip)

    def value_strip(cls, value):
        """Strip unneeded characters from the value"""
        return value.lstrip()
    value_strip = classmethod(value_strip)


class DialectJava(Dialect):
    name = "java"
    default_encoding = "iso-8859-1"
    delimiters = [u"=", u":", u" "]
register_dialect(DialectJava)


class DialectFlex(DialectJava):
    name = "flex"
    default_encoding = "utf-8"
register_dialect(DialectFlex)


class DialectMozilla(Dialect):
    name = "mozilla"
    default_encoding = "utf-8"
    delimiters = [u"="]

    def encode(cls, string):
        return quote.mozillapropertiesencode(string or u"")
    encode = classmethod(encode)
register_dialect(DialectMozilla)


class DialectSkype(Dialect):
    name = "skype"
    default_encoding = "utf-16"
    delimiters = [u"="]

    def encode(cls, string):
        return quote.mozillapropertiesencode(string or u"")
    encode = classmethod(encode)
register_dialect(DialectSkype)
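The registry pattern used by the dialect classes can be illustrated with a small stand-alone Python 3 sketch. All names here (registry, register, lookup, Base, Java, Flex) are hypothetical stand-ins for the module's own; the point is only that a subclass overrides what differs and inherits the rest, and that an unknown name yields None rather than an exception.

```python
registry = {}


def register(dialect):
    """Store a dialect class under its name (stand-in for register_dialect)."""
    registry[dialect.name] = dialect


def lookup(name="java"):
    """Return the registered class, or None for an unknown name."""
    return registry.get(name)


class Base(object):
    name = None
    default_encoding = "iso-8859-1"
    delimiters = None


class Java(Base):
    name = "java"
    delimiters = ["=", ":", " "]


class Flex(Java):
    # Only the name and encoding change; delimiters come from Java.
    name = "flex"
    default_encoding = "utf-8"


for cls in (Java, Flex):
    register(cls)

print(lookup("flex").delimiters)        # ['=', ':', ' ']
print(lookup("flex").default_encoding)  # utf-8
print(lookup("unknown"))                # None
```

Because the lookup returns None for an unregistered name, a caller that passes a misspelled dialect name fails later, at the first attribute access, not at lookup time.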


class DialectStrings(Dialect):
    name = "strings"
    default_encoding = "utf-16"
    delimiters = [u"="]
    pair_terminator = u";"
    key_wrap_char = u'"'
    value_wrap_char = u'"'

    def key_strip(cls, key):
        """Strip unneeded characters from the key"""
        newkey = key.rstrip().rstrip('"')
        # If the line now ends in \ we put back the char that was escaped
        if newkey[-1:] == "\\":
            newkey += key[len(newkey):len(newkey)+1]
        return newkey.lstrip().lstrip('"')
    key_strip = classmethod(key_strip)

    def value_strip(cls, value):
        """Strip unneeded characters from the value"""
        newvalue = value.rstrip().rstrip(';').rstrip('"')
        # If the line now ends in \ we put back the char that was escaped
        if newvalue[-1:] == "\\":
            newvalue += value[len(newvalue):len(newvalue)+1]
        return newvalue.lstrip().lstrip('"')
    value_strip = classmethod(value_strip)

    def encode(cls, string):
        return string.replace('"', '\\"').replace("\n", r"\n").replace("\t", r"\t")
    encode = classmethod(encode)
register_dialect(DialectStrings)


class propunit(base.TranslationUnit):
    """an element of a properties file i.e. a name and value, and any comments
    associated"""

    def __init__(self, source="", personality="java"):
        """construct a blank propunit"""
        self.personality = get_dialect(personality)
        super(propunit, self).__init__(source)
        self.name = u""
        self.value = u""
        self.translation = u""
        self.delimiter = u"="
        self.comments = []
        self.source = source

    def setsource(self, source):
        self._rich_source = None
        source = data.forceunicode(source)
        self.value = self.personality.encode(source or u"")

    def getsource(self):
        value = quote.propertiesdecode(self.value)
        return value

    source = property(getsource, setsource)

    def settarget(self, target):
        self._rich_target = None
        target = data.forceunicode(target)
        self.translation = self.personality.encode(target or u"")

    def gettarget(self):
        translation = quote.propertiesdecode(self.translation)
        translation = re.sub(u"\\\\ ", u" ", translation)
        return translation

    target = property(gettarget, settarget)

    def __str__(self):
        """convert to a string. double check that unicode is handled somehow
        here"""
        source = self.getoutput()
        assert isinstance(source, unicode)
        return source.encode(self.personality.default_encoding)

    def getoutput(self):
        """convert the element back into formatted lines for a .properties
        file"""
        notes = self.getnotes()
        if notes:
            notes += u"\n"
        if self.isblank():
            return notes + u"\n"
        else:
            self.value = self.personality.encode(self.source)
            self.translation = self.personality.encode(self.target)
            value = self.translation or self.value
            return u"%(notes)s%(key)s%(del)s%(value)s\n" % {"notes": notes,
                                                            "key": self.name,
                                                            "del": self.delimiter,
                                                            "value": value}

    def getlocations(self):
        return [self.name]

    def addnote(self, text, origin=None, position="append"):
        if origin in ['programmer', 'developer', 'source code', None]:
            text = data.forceunicode(text)
            self.comments.append(text)
        else:
            return super(propunit, self).addnote(text, origin=origin,
                                                 position=position)

    def getnotes(self, origin=None):
        if origin in ['programmer', 'developer', 'source code', None]:
            return u'\n'.join(self.comments)
        else:
            return super(propunit, self).getnotes(origin)

    def removenotes(self):
        self.comments = []

    def isblank(self):
        """returns whether this is a blank element, containing only
        comments."""
        return not (self.name or self.value)

    def istranslatable(self):
        return bool(self.name)

    def getid(self):
        return self.name

    def setid(self, value):
        self.name = value


class propfile(base.TranslationStore):
    """this class represents a .properties file, made up of propunits"""
    UnitClass = propunit

    def __init__(self, inputfile=None, personality="java", encoding=None):
        """construct a propfile, optionally reading in from inputfile"""
        super(propfile, self).__init__(unitclass=self.UnitClass)
        self.personality = get_dialect(personality)
        self.encoding = encoding
        self.filename = getattr(inputfile, 'name', '')
        if inputfile is not None:
            propsrc = inputfile.read()
            inputfile.close()
            self.parse(propsrc)

    def parse(self, propsrc):
        """read the source of a properties file in and include them as units"""
        newunit = propunit("", self.personality.name)
        inmultilinevalue = False
        if self.encoding is not None:
            propsrc = unicode(propsrc, self.encoding)
        else:
            propsrc = unicode(propsrc, self.personality.default_encoding)
        for line in propsrc.split(u"\n"):
            # handle multiline value if we're in one
            line = quote.rstripeol(line)
            if inmultilinevalue:
                newunit.value += line.lstrip()
                # see if there's more
                inmultilinevalue = is_line_continuation(newunit.value)
                # if we're still waiting for more...
                if inmultilinevalue:
                    # strip the backslash
                    newunit.value = newunit.value[:-1]
                if not inmultilinevalue:
                    # we're finished, add it to the list...
                    self.addunit(newunit)
                    newunit = propunit("", self.personality.name)
            # otherwise, this could be a comment
            # FIXME handle /* */ in a more reliable way
            # FIXME handle // inline comments
            # NOTE: the last test uses [-2:] so that it matches a line that
            # ends with "*/"; the earlier [:-2] slice could not do so.
            elif line.strip()[:1] in (u'#', u'!') or line.strip()[:2] in (u"/*", u"//") or line.strip()[-2:] == "*/":
                # add a comment
                newunit.comments.append(line)
            elif not line.strip():
                # this is a blank line...
                if str(newunit).strip():
                    self.addunit(newunit)
                    newunit = propunit("", self.personality.name)
            else:
                newunit.delimiter, delimiter_pos = self.personality.find_delimiter(line)
                if delimiter_pos == -1:
                    newunit.name = self.personality.key_strip(line)
                    newunit.value = u""
                    self.addunit(newunit)
                    newunit = propunit("", self.personality.name)
                else:
                    newunit.name = self.personality.key_strip(line[:delimiter_pos])
                    if is_line_continuation(line[delimiter_pos+1:].lstrip()):
                        inmultilinevalue = True
                        newunit.value = line[delimiter_pos+1:].lstrip()[:-1]
                    else:
                        newunit.value = self.personality.value_strip(line[delimiter_pos+1:])
                        self.addunit(newunit)
                        newunit = propunit("", self.personality.name)
        # see if there is a leftover one...
        if inmultilinevalue or len(newunit.comments) > 0:
            self.addunit(newunit)

    def __str__(self):
        """convert the units back to lines"""
        lines = []
        for unit in self.units:
            lines.append(str(unit))
        return "".join(lines)

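The way parse() folds continuation lines into a single value can be sketched on its own. The Python 3 generator below uses a hypothetical name, join_continuations, and deliberately ignores comments, blank-line handling and dialects; it only shows the loop that strips the trailing backslash and appends the left-stripped next line.

```python
def join_continuations(text):
    """Yield logical lines, merging lines that end in an odd number of backslashes.

    Simplified illustration of the multi-line branch in propfile.parse();
    join_continuations is not part of the translate toolkit.
    """
    pending = None
    for line in text.split("\n"):
        line = line.rstrip("\r")
        # Leading whitespace of a continuation line is dropped before joining.
        current = line if pending is None else pending + line.lstrip()
        trailing = len(current) - len(current.rstrip("\\"))
        if trailing % 2 == 1:
            # Still waiting for more: drop the continuation backslash.
            pending = current[:-1]
        else:
            pending = None
            yield current
    if pending is not None:
        # The input ended while a continuation was still open.
        yield pending


sample = "a = one \\\n    two\nb = three"
print(list(join_continuations(sample)))  # ['a = one two', 'b = three']
```

As in parse(), the continuation test is applied to the accumulated text, so a value may span any number of physical lines.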

class stringsfile(propfile):
    Name = _("OS X Strings")
    Extensions = ['strings']
    def __init__(self, *args, **kwargs):
        kwargs['personality'] = "strings"
        super(stringsfile, self).__init__(*args, **kwargs)