A report sponsored by the Translation Automation Users Society (TAUS)
Asia Online was commissioned by TAUS to show the effect of combining data from multiple
companies for the purpose of building statistical machine translation (SMT) engines. Three
TAUS members from the same industry volunteered to provide translation memories
and other linguistic assets to enable Asia Online to conduct the tests described
in this report.
Asia Online performed extensive analysis and created a total of
29 separate SMT engines by combining the data from the three companies in various
configurations, and compared the output quality of all the systems that were built
using the BLEU and F-Measure metrics. A quality comparison was also made with the
output produced by Google and Systran to give an overall perspective on how these
systems compare.
Great care was taken to keep the key input into the measurement systems consistent and clean,
so that the final BLEU scores are meaningful and indicative of the results that data
consolidation is likely to produce in real-life use.
The full report can be downloaded by filling
in the registration form below.