This project is being developed alongside Dr. Kevin Scannell.
The current status of localization leaves much to be desired, as any localizer who has ever clashed with a website designer will attest. Presently, localization is an oddly local task; that is, translations are handled per-website. This leads to three issues:
These issues are significant roadblocks to the ongoing effort to make the Internet more accessible.
We are aware of only one current solution, Dakwak, which aims to localize websites into any of over 60 languages. Unfortunately, it neither eliminates the need for site-developer intervention nor provides a more natural interface through which to submit translations.
Although we may use machine translation services such as Apertium to fill in the gaps, machine translation quality is not yet consistently high enough to produce fully readable text, which is why the focus here is on collaborative human translation.
Ultimately, in addition to making the web a more accessible place, the goal is to use the human-provided translations to improve today's machine translation systems. The design of the interface itself will depend in part on which system we intend to train. Apertium, for example, uses rule-based translation, so it may be wise to provide an optional third step that lets the user identify the meaning of words not yet in the Apertium dictionary.
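As a rough illustration of the kind of data that optional third step might capture, a submission record could look something like the sketch below. The names, fields, and the English–Irish example are all hypothetical choices of mine, not part of any existing Apertium API:

```python
from dataclasses import dataclass, field

@dataclass
class WordMeaning:
    """A user-supplied gloss for a word missing from the Apertium dictionary."""
    source_word: str
    gloss: str

@dataclass
class TranslationSubmission:
    """One crowd-sourced translation of a source-language segment."""
    source_text: str
    target_text: str
    lang_pair: str  # e.g. "en-ga" for English -> Irish (hypothetical)
    # Optional third step: glosses for out-of-dictionary words,
    # which could later feed back into the Apertium dictionaries.
    word_meanings: list[WordMeaning] = field(default_factory=list)

# A translator submits a translation and glosses one word
# that the dictionary does not yet know.
sub = TranslationSubmission(
    source_text="The website is fast.",
    target_text="Tá an suíomh gréasáin tapa.",
    lang_pair="en-ga",
    word_meanings=[WordMeaning("website", "suíomh gréasáin")],
)
print(len(sub.word_meanings))  # → 1
```

Keeping the word-level glosses separate from the sentence-level translation means the optional step can be skipped entirely without complicating the common case.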
This post is just intended to outline the general idea as it stands. I have more specific implementation details in mind and underway, so I will follow up later with more information. Any input is certainly welcome.