The Worldwide Lexicon is a collaborative, open source translation platform that combines inputs from professional, volunteer and machine translators. It is a best effort system that translates pages as they are served, using the best available resources at the time, as described in my recent essay, The End of the Language Barrier.
This article explains how you can incorporate the WWL API into your website, web app or favorite scripting language to make it translatable. We're also looking for developers to contribute libraries that make it easy to access WWL from popular scripting languages, CMSs, etc. If you're interested, email me at bsmcconnell /at/ gmail.
How WWL Works
WWL is a translation memory, essentially a database of texts and their translations into various languages. WWL stores submissions from human translators along with metadata such as date/time, username, IP address, type of user, and so on. When you query it for a translation, it returns one or many translations based on the filtering criteria you include with the query. It also proxies through to several machine translation services, so if no human translations are available, it will obtain a machine translation as a placeholder. It generally looks for professional or trusted translations first, then user translations, and then machine translations, although this behavior is configurable.
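To make the default ordering concrete, here is a minimal sketch of the "professional first, then user, then machine" preference. It is not WWL's actual code: the Translation class, its field names and the source labels are assumptions made up for illustration, and the tie-break on score is my own choice.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical source labels; the real WWL record fields may differ.
PRIORITY = {"professional": 0, "user": 1, "machine": 2}


@dataclass
class Translation:
    text: str
    source: str         # "professional", "user" or "machine" (assumed labels)
    score: float = 0.0  # assumed average quality score for the translation


def pick_best(candidates: List[Translation]) -> Optional[Translation]:
    """Return the preferred translation, or None if there are no candidates."""
    if not candidates:
        return None
    # A lower priority number wins; ties go to the higher score.
    # Unknown source labels sort after all known ones.
    return min(
        candidates,
        key=lambda t: (PRIORITY.get(t.source, len(PRIORITY)), -t.score),
    )
```

With this ordering, a user translation is chosen over a machine translation even when the machine translation carries a higher score, which mirrors the placeholder role of machine translation described above.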
WWL is accessed via a few simple REST API calls to fetch translations, submit a translation, fetch or submit comments, and to score translations. With 3 or 4 API calls, you can implement a fairly complete translation service that is highly configurable, and can be accessed dynamically as needed.
Hello World : Translating Texts As A Page Is Served
We'll start with the standard Hello World example. Here we want to insert translations as a page is served. In this example, the web server has decided that translation is needed (e.g. by checking a cookie set by the user, browser preference, etc).
To request a translation from WWL, you can call one of two web APIs: /q (located at www.worldwidelexicon.org/q ) which returns a structured recordset of matching translations, and /t (located at www.worldwidelexicon.org/t ) which returns the best available translation (a sort of "I Feel Lucky" API). We'll use /t for this example.
This web service, located at www.worldwidelexicon.org/t , makes requesting translations super simple. You call it as simply as this:
Note to translation services:
Your API for requesting translations should be something simple, in the form of:
http://www.yourservice.com/t/[source language]/[target language]/[text]?[optional parms]
So, to insert a translation into a page as it is served, you simply call a function to obtain the best available translation via this interface and insert it in place of the original text. This is a simple example. To make this run efficiently, you'll want to cache translations locally (in memcached, for example), so frequently translated texts are served from memory rather than requiring a call out to an external web service. As you can see, though, the basic process is very simple.
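Here is a rough sketch of that process in Python. It rests on several assumptions: that the WWL /t service follows the same /t/[source language]/[target language]/[text] pattern recommended above, that it is reachable over plain http, and that the response body is simply the translated text. The names build_t_url, fetch_over_http and TranslationCache are made up for this example, and a plain dictionary stands in for memcached.

```python
from typing import Callable, Dict, Optional, Tuple
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Assumed base URL: scheme and path layout are not confirmed by the API docs.
BASE_URL = "http://www.worldwidelexicon.org/t"


def build_t_url(sl: str, tl: str, text: str,
                params: Optional[Dict[str, str]] = None) -> str:
    """Build a /t/[source language]/[target language]/[text] URL."""
    parts = [quote(part, safe="") for part in (sl, tl, text)]
    url = BASE_URL + "/" + "/".join(parts)
    if params:
        url += "?" + urlencode(params)
    return url


def fetch_over_http(url: str) -> str:
    """Fetch the URL and return the body, assumed to be the translated text."""
    with urlopen(url, timeout=5) as resp:
        return resp.read().decode("utf-8")


class TranslationCache:
    """In-memory stand-in for memcached: each text is fetched at most once."""

    def __init__(self, fetch: Callable[[str], str] = fetch_over_http) -> None:
        self._fetch = fetch
        self._store: Dict[Tuple[str, str, str], str] = {}

    def translate(self, sl: str, tl: str, text: str) -> str:
        key = (sl, tl, text)
        if key not in self._store:
            # Cache miss: call out to the web service and remember the result.
            self._store[key] = self._fetch(build_t_url(sl, tl, text))
        return self._store[key]
```

In a page template you would create one TranslationCache per process and call cache.translate("en", "es", "Hello World") wherever the original text would be printed. The fetch function is injectable so you can swap in your own HTTP client, or a stub when testing.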
Example 2 : Enabling Users To Submit and Edit Translations
In the first example, the /t web service provides a simple editing tool. You can also make the translations editable with a simple callback to the /submit API (located at www.worldwidelexicon.org/submit ), so you can implement a custom editing interface using whatever design and interaction rules you want.
You can control who can edit translations simply by deciding whether or not to include this popup form on a per-page or per-text basis, for example, to allow users to edit some parts of a page but not others.
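A custom editing form ultimately just needs to post the user's translation back to /submit. The sketch below shows what building that request might look like. The field names (sl, tl, st, tt, username) are placeholders I've invented for illustration, as is the choice of a form-encoded POST over http; check the API documentation for the real parameter list before using this.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint URL (the scheme is a guess).
SUBMIT_URL = "http://www.worldwidelexicon.org/submit"


def build_submit_request(sl: str, tl: str, source_text: str,
                         translated_text: str,
                         username: Optional[str] = None) -> Request:
    """Build a form-encoded POST for /submit. Field names are hypothetical."""
    fields = {
        "sl": sl,               # source language code (assumed name)
        "tl": tl,               # target language code (assumed name)
        "st": source_text,      # original text (assumed name)
        "tt": translated_text,  # submitted translation (assumed name)
    }
    if username:
        fields["username"] = username  # assumed name
    body = urlencode(fields).encode("utf-8")
    return Request(SUBMIT_URL, data=body, method="POST")
```

The returned Request object can be passed to urllib.request.urlopen when you are ready to send it. Keeping the request-building step separate makes it easy to decide, per page or per text, whether a submission is allowed before anything goes over the wire.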
Example 3 : Scoring Translations
In addition to allowing users to submit and edit translations, you can also allow them to score translations, so the system can learn which contributors are good. This is done by calling the /scores/vote API, which you call by opening the URL:
It replies with a simple 'ok' or error message. The GUID is a record locator for the translation being scored (this is returned in calls to the /q API, which returns a structured list of translations). The system will look up the translation, who created it, and calculate their score. Summary scores are included in the translation recordsets returned by the /q interface.
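As a rough illustration, a vote call could be wrapped like this. The parameter names guid and score are assumptions, as is the 0 to 5 range (borrowed from the minimum_score filter) and the http scheme. The only part taken directly from the description above is that the service answers with 'ok' or an error message.

```python
from urllib.parse import urlencode

# Assumed endpoint URL (the scheme is a guess).
VOTE_URL = "http://www.worldwidelexicon.org/scores/vote"


def build_vote_url(guid: str, score: int) -> str:
    """Build the voting URL. Parameter names and score range are hypothetical."""
    if not 0 <= score <= 5:
        raise ValueError("score must be between 0 and 5")
    return VOTE_URL + "?" + urlencode({"guid": guid, "score": score})


def vote_succeeded(reply: str) -> bool:
    """The service replies with 'ok' on success, or an error message."""
    return reply.strip().lower() == "ok"
```

You would open the URL from build_vote_url with your HTTP client of choice and pass the response body to vote_succeeded to decide whether to show the user a confirmation or an error.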
Example 4 : Comments
WWL also includes a service for fetching and submitting comments about translations. This is a useful tool for enabling discussion and footnotes about translations. With this you can encourage users to discuss disagreements over a translation in the footnotes. This is also interesting as a way to encourage discussion about language, since the same idea may be expressed in different ways in different regions.
To fetch comments, you simply call /comments/get to request comments about translations, which can be keyed to a parent URL (to request all comments about translations for a particular page), to a specific source text, or a specific translation for a source text. It returns the comments in the output format of your choice (XML, JSON, RSS, etc) in date order.
To submit a comment, you simply call /comments/submit with a short list of parameters to identify which translation or source text the comment is about, the text of the comment, etc.
Filtering Translations and Controlling Access
WWL is based on an ad hoc process where it accepts submissions from anyone, and then filters them when a client submits a request for translations. For example, when submitting a request for translations, you can include the following filters:
- allow_machine = y/n : allow machine translations (default = y)
- allow_anonymous = y/n : include anonymous translations in results (default = y)
- minimum_score = 0..5 : require minimum weighted quality score for translations
- require_professional : require professional translation (default = n; if n, the system will prioritize professional translations but fall back to user or machine translations)
- users = comma separated list of whitelisted users to limit results to
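The filters above are just query parameters, so a small helper can assemble them. This is a sketch under a few assumptions: that require_professional takes y/n like the other flags, and that the resulting string is appended after a "?" to the request URL. The function name build_filter_query is made up for this example.

```python
from typing import Iterable, Optional
from urllib.parse import urlencode


def build_filter_query(allow_machine: bool = True,
                       allow_anonymous: bool = True,
                       minimum_score: Optional[int] = None,
                       require_professional: bool = False,
                       users: Optional[Iterable[str]] = None) -> str:
    """Encode the filters listed above as a URL query string."""
    params = {
        "allow_machine": "y" if allow_machine else "n",
        "allow_anonymous": "y" if allow_anonymous else "n",
        # Assumed to take y/n like the other flags.
        "require_professional": "y" if require_professional else "n",
    }
    if minimum_score is not None:
        if not 0 <= minimum_score <= 5:
            raise ValueError("minimum_score must be between 0 and 5")
        params["minimum_score"] = str(minimum_score)
    if users:
        # Whitelisted users go in as one comma separated value.
        params["users"] = ",".join(users)
    return urlencode(params)
```

For example, build_filter_query(allow_machine=False, minimum_score=3) produces a query string that excludes machine translations and anything scoring below 3, while leaving the other filters at their defaults.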
In addition, you can apply your own rules to translations when using the /q interface, because it includes a lot of metadata along with the translations (IP address, location, average score for translator, etc) which may be useful in making an allow/hide decision. In any case, we push decision-making authority about how to display translations to the edge rather than impose centralized rules on everyone, because what works for one site may not be a good fit for another.
Call For Developers
WWL is an open source platform, with code available for both our cloud-based translation memory and interface tools such as our Firefox Translator. Now that the API is complete, we're reaching out to developers to contribute extensions and libraries that make embedding WWL in other environments easy. We're looking for both low-level libraries, such as a convenience library for Ruby on Rails, and application-level plugins, such as an extension for your favorite content management system.
Addons and Application Level Tools
The next step up from this is to embed WWL more deeply within popular web app and content management system environments, such as WordPress. For example, if you're proficient in a particular CMS, it would be great if you could contribute a plugin that makes adding WWL to that environment an easy install. This may be a simple or difficult task depending on the environment you're working in.
Help Build The Multilingual Web
WWL is an ambitious project. Our goal is to make collaborative translation an embedded service that is built into most publishing and web app development environments. If we do this, translation will become a common feature throughout the web, and eventually it will become an ambient service that millions of people use, often without realizing it. It's fun stuff to work with, and could bring a lot of good to a lot of people. So if you'd like to help us make that happen, drop me a line at bsmcconnell /at/ gmail to learn more.