We could use your help. If you are interested in contributing, please join us on IRC in #pootle and on the translate-devel mailing list.
Here are some ideas of how you can contribute:
Below we give you more detail on these:
Before we release new versions of the Toolkit we need people to check that they still work correctly. If you are a frequent user, you might want to start using the release candidate on your current work and report any errors before we release it.
Compile and install the software to see if we have any platform issues
./setup.py install
Check for any files that are missing, tools that were not installed, etc.
Run the unit tests to see if there are any issues. You'll need the py.test unit test software, which you can download and install, or you can install this Fedora RPM.
To run the tests, run
py.test
from the translate src directory, or
py.test storage/test_dtd.py
to run a specific set of tests.
Report any failures.
Note: If your Pootle tests fail with an error along the lines of:
URLError: <urlopen error (-2, 'Name or service not known')>
Check your http_proxy environment variable. Usually it helps just to unset it:
unset http_proxy
Finally, simply work with the software. Check all your current usage patterns and report any problems.
Now you need to try and validate the bug. Your aim is to confirm that the bug is either fixed, is invalid or still exists.
If it's fixed, please close the bug and give details of how and when it was fixed, or which version you used to validate it as corrected.
If you find that the bug reporter has made incorrect assumptions, or that their suggestion cannot work, then mark the bug as invalid and give reasons why.
The last case, an existing bug, is the most interesting. Check through the bug and do the following:
Don't ignore this area if you feel like you're not a hotshot coder!
You will need some Python skills, but this is a great way to learn.
Here are some ideas to get you going:
You will definitely need to be on the Development and probably on the Subversion checkin lists.
Now is the time to familiarise yourself with the developers guide.
This is the easy one. Log in to the wiki and start!
The key areas that need to be looked at are:
After that and always:
This section is for any kind of feature request or wish list items. If you have any idea for something (anything) to implement, you may list it here.
I often get abbreviations in the text that I know might be written in full form elsewhere in the text, but I can't guess what it might be. So how about a tool that will search for possible full forms of abbreviations. Input VFS, and search for stuff like Virtual File Server, virtual file server, etc. Also add stopword list so that short words inside full forms can be ignored.
pogrep -I --accelerator="~" --search=source -e "\bv.+\bm.+\bl" source.po vml.po
This is a hack that will work for you now. It searches in the source (msgid or source TU), ignores case, and looks for a sequence of words that start with V, then M, then L. It wouldn't find XML - eXtensible Markup Language.
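The wished-for tool could be sketched in Python roughly as follows. This is only an illustration, not an existing Toolkit feature: the function names and the default stopword list are hypothetical. It builds a regular expression from the letters of the abbreviation and allows stopwords between the significant words. Like the pogrep hack, it would not find forms such as eXtensible Markup Language, where the letter is not word-initial.

```python
import re

# Hypothetical default stopword list: short words that may appear
# inside a full form without contributing a letter to the abbreviation.
STOPWORDS = ("of", "the", "and", "for", "in", "to", "a")


def abbreviation_pattern(abbrev, stopwords=STOPWORDS):
    """Build a regex matching word sequences whose initials spell abbrev."""
    # An optional run of stopwords, each followed by whitespace.
    skip = r"(?:(?:%s)\s+)*" % "|".join(re.escape(w) for w in stopwords)
    # One word per letter of the abbreviation.
    words = [re.escape(ch) + r"\w*" for ch in abbrev]
    body = (r"\s+" + skip).join(words)
    return re.compile(r"\b" + body + r"\b", re.IGNORECASE)


def find_full_forms(abbrev, text, stopwords=STOPWORDS):
    """Return every candidate full form of abbrev found in text."""
    pattern = abbreviation_pattern(abbrev, stopwords)
    return [match.group(0) for match in pattern.finditer(text)]
```

For example, find_full_forms("VFS", text) would report both "Virtual File Server" and "virtual file server" if they occur in text.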
An aligner would be nice. At the moment, po2tmx can create a TMX file, but the TUs in the TMX often contain more than one sentence. A TM is more useful to translators if it contains sentences, not paragraphs. So, how about a tool that takes such a TMX and attempts to convert each TU into sentences (source and target)? The output can be CSV, so that a translator can open it in a graphical CSV editor and correct misalignments. The output can even be a plaintext tab-delimited file so that one can open it as a table in OpenOffice and use shortcuts to correct the alignment.
The idea would be to keep the first sentence of each TU aligned. This means that if a previous TU had a different number of sentences in the source and in the target, there would be empty cells above the current TU (either in the source or in the target).
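A minimal Python sketch of this alignment idea, assuming the TUs have already been read from the TMX as (source, target) pairs. The function names are hypothetical, and the sentence splitter is deliberately naive (it breaks after ".", "!" or "?" followed by whitespace), so abbreviations and similar cases would still need manual correction.

```python
import csv
import io
import re
from itertools import zip_longest

# Naive sentence boundary: whitespace that follows ., ! or ?
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def split_sentences(text):
    """Split a paragraph into sentences using the naive boundary rule."""
    return [s for s in _SENTENCE_END.split(text.strip()) if s]


def align_units(units):
    """Turn (source, target) paragraph pairs into sentence rows.

    Each TU starts on a fresh row, so its first sentences stay aligned.
    If one side has fewer sentences, it is padded with empty cells.
    """
    rows = []
    for source, target in units:
        rows.extend(zip_longest(split_sentences(source),
                                split_sentences(target),
                                fillvalue=""))
    return rows


def to_tab_delimited(rows):
    """Serialise the rows as tab-delimited text for a spreadsheet."""
    out = io.StringIO()
    csv.writer(out, delimiter="\t", lineterminator="\n").writerows(rows)
    return out.getvalue()
```

The translator would then fix any misaligned rows by hand in the spreadsheet or CSV editor.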
If I try to run pomerge on a file whose name ends in .po.txt, it refuses. Accepting such files might be useful for Windows users, because many Windows programs automatically append .txt, and then the poor user first has to go and remove it again before pomerge is satisfied.
I wish the Toolkit could export to a table in a word processing document and reimport from a table in a word processing document. This would make it possible for almost anyone to help translate in a Toolkit-based project. The best table format is probably an OpenDocument table, so that MS Word users can use it without screwing it up too much. The table can have three columns (or more, but additional columns are ignored -- possibly to be used by the proofreader for notes to the translator, etc).
If ODT is too difficult, how about exporting to a three-column table in HTML? An HTML file can be opened in WYSIWYG mode in MS Word and OpenOffice.org, and although the saved HTML file will have horrible machine-generated code that would give anyone in alt.html.critique a heart attack, it will still be a valid table which can simply be converted back to PO.
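The HTML variant could look something like the Python sketch below. It is not an existing Toolkit converter. The function names are hypothetical, and the choice of columns (location, source, target) is an assumption, since the wish does not say what the three columns hold. Reading back uses only the first three cells of each row, so extra columns added by a proofreader are ignored.

```python
from html import escape
from html.parser import HTMLParser


def export_table(units):
    """Render (location, source, target) string triples as an HTML table."""
    rows = "".join(
        "<tr>" + "".join("<td>%s</td>" % escape(cell) for cell in unit) + "</tr>\n"
        for unit in units
    )
    return "<html><body><table>\n%s</table></body></html>\n" % rows


class _TableReader(HTMLParser):
    """Collect the text of every table cell, row by row."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.rows = []
        self._cell = None  # list of text chunks while inside a <td>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag == "td" and self.rows:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self.rows[-1].append("".join(self._cell))
            self._cell = None


def import_table(html_text):
    """Read the triples back, ignoring any columns after the third."""
    reader = _TableReader()
    reader.feed(html_text)
    reader.close()
    return [tuple(row[:3]) for row in reader.rows if len(row) >= 3]
```

Markup that a word processor adds inside a cell is dropped on import and only the cell text is kept. Whether that is acceptable depends on how much formatting the strings themselves carry.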
po2csv isn't really feasible because different programs have different CSV definitions. Excel and Calc interpret a CSV file in two different ways. So exporting the PO to CSV only works for tools that can correctly interpret the Toolkit's chosen dialect of CSV.
The underlying toolkit CSV module supports various flavours of CSV. It currently uses the Excel flavour, so it is possible to output for different spreadsheets if needed. It might be better to understand what exactly fails. I know we had to hack things to prevent the loss of leading single quotes, which most spreadsheets interpret as meaning 'treat this as text' --- Dwayne Bailey 2007/10/18 03:13
Well, Excel doesn't convert cleanly to a word processing format and back. Excel wasn't designed as a text editing tool anyway. The “normal” program to edit text in, surely, is a word processor. -- Samuel
It would be nice if one could do a pofilter check that takes a list of words from an input file and checks to see if they occur in the target text. This list of words would be a blacklist of terms that should not be used in the translation, no matter what. Useful for when a client decides to change his prescribed terminology and you want to do a bulk pogrep on your files while keeping the blacklist of words all in one place.
The blacklist should respect word boundaries. So if “klik” is on the blacklist, then “toeganklik” should not trigger it.
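A rough Python sketch of such a blacklist test, independent of the real pofilter code. The function names are hypothetical, and the blacklist file is assumed to hold one term per line. The lookarounds enforce the word boundaries, so "klik" does not fire inside "toeganklik".

```python
import re


def load_blacklist(lines):
    """Read one forbidden term per line, skipping blank lines."""
    return [line.strip() for line in lines if line.strip()]


def blacklist_hits(target, blacklist):
    """Return the blacklisted terms that occur as whole words in target."""
    hits = []
    for term in blacklist:
        # (?<!\w) and (?!\w) require a non-word character (or the string
        # edge) on both sides of the term.
        pattern = r"(?<!\w)%s(?!\w)" % re.escape(term)
        if re.search(pattern, target, re.IGNORECASE):
            hits.append(term)
    return hits
```

A unit would fail the check whenever blacklist_hits returns a non-empty list for its target text.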
What I mean is that a pofilter check should take a bilingual list of words as an input file, and check whether a term in the source text was translated using the right word in the target text. In other words, if the bilingual list contains:
computer = rekenaar
then pofilter will check which source texts contain 'computer' and then check whether all of their corresponding target texts contain 'rekenaar'. Those that don't, fail the check.
The bilingual list check should not respect word boundaries (or: should do a fuzzy check), so that “rekenaar” would also match “rekenaars” and “berekenaar”.
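The bilingual check might be sketched in Python as below. Again, this is not the actual pofilter implementation. The function names are hypothetical, and the list is assumed to use the "source = target" format shown above. Matching is a plain case-insensitive substring test on both sides, which is one simple way to get the loose behaviour described: "rekenaars" and "berekenaar" both satisfy "rekenaar".

```python
def load_glossary(lines):
    """Parse 'source = target' lines into a lower-cased dictionary."""
    glossary = {}
    for line in lines:
        if "=" in line:
            source, target = line.split("=", 1)
            glossary[source.strip().lower()] = target.strip().lower()
    return glossary


def terminology_failures(source, target, glossary):
    """Return (source term, expected term) pairs missing from target."""
    source_l, target_l = source.lower(), target.lower()
    return [
        (src_term, tgt_term)
        for src_term, tgt_term in glossary.items()
        # Substring tests ignore word boundaries on purpose.
        if src_term in source_l and tgt_term not in target_l
    ]
```

A unit fails the check when terminology_failures returns at least one pair.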
Okay, this is a Pootle wish list item… perhaps it belongs elsewhere. Currently, you can do a search for a word in Pootle, and Pootle jumps to the next instance of that word. The advantage of this is context. The disadvantage is not seeing all the instances on one page. I suggest the following:
Let there be a tickbox option next to the search box in Pootle whereby a search result opens in a new browser window. The result can be a normal plaintext PO file that would be the normal result of a pogrep action (but a single page, not multiple pages), or… it can be a simple HTML file in which the search term is highlighted in each string.
The current search system assumes that the user might want to edit the strings that form the search result. The purpose of the proposed system would be to do quick searches on term usage, but not enable users to edit the strings there and then.
(Actually… well… it's a pity that the files in Pootle are nested in a directory tree, otherwise this search method could open multiple pages that advanced users can download, edit, and upload if they wanted.)
When translating an expression in Pootle, I often check how it has been translated in other languages (by opening the same link for other languages, changing only the language suffix, e.g. 'es', 'fr', 'pt_BR').
Building on this idea, I imagine that a user could define “alternate languages” in their preferences. In the translation interface, small links would then appear, like:
… pointing to the translation of that string in each of those languages. This could open a popup, follow a simple link, or do something else; I'm not sure what the best layout would be.
I think that anything that helps translation, as this feature would, is good for increasing the quality and speed of the work.