Building A Translatable Website (Call For Developers)

By Brian McConnell
October 26, 2009

The Worldwide Lexicon (WWL) is a collaborative, open source translation platform that combines input from professional, volunteer and machine translators. It is a best-effort system that translates pages as they are served, using the best resources available at the time, as described in my recent essay, The End of the Language Barrier.

This article explains how you can incorporate the WWL API into your website, web app or favorite scripting language to make it translatable. We're also looking for developers to contribute libraries that make it easy to access WWL from popular scripting languages, content management systems (CMSs), and so on. If you're interested, email me at bsmcconnell /at/ gmail.

How WWL Works

WWL is a translation memory, essentially a database of texts and their translations into various languages. WWL stores submissions from human translators along with metadata such as date/time, username, IP address, type of user, and so on. When you query it for a translation, it returns one or more translations based on the filtering criteria you include with the query. It also proxies through to several machine translation services, so if no human translations are available, it obtains a machine translation as a placeholder. It generally looks for professional or trusted translations first, then user translations, and then machine translations, although this behavior is configurable.

WWL is accessed via a few simple REST API calls to fetch translations, submit a translation, fetch or submit comments, and score translations. With three or four API calls, you can implement a fairly complete translation service that is highly configurable and can be accessed dynamically as needed.

Hello World : Translating Texts As A Page Is Served

We'll start with the standard Hello World example. Here we want to insert translations as a page is served. In this example, the web server has decided that translation is needed (e.g. by checking a cookie set by the user, browser preference, etc).

To request a translation from WWL, you can call one of two web APIs: /q (located at www.worldwidelexicon.org/q ), which returns a structured recordset of matching translations, and /t (located at www.worldwidelexicon.org/t ), which returns the best available translation (a sort of "I'm Feeling Lucky" API). We'll use /t for this example.

The /t web service makes requesting translations very simple. You call it like this:

www.worldwidelexicon.org/t/en/es/Hello

It will respond with the most recent or best available human translation, and fall back to a machine translation if none is available. You can include additional parameters, which are documented at www.worldwidelexicon.org/t . As an added bonus, the default behavior of this web service is to reply with the translated text, followed by some demo HTML/Javascript that triggers a popup editor in which the user can edit the translation (we will be updating this UI so that it triggers on a mouseover and uses a more compact dialog box). This, incidentally, is what web translation APIs should look like, but don't. Most translation services don't provide an API, and if they do, it's overly complicated for something as simple as submitting a source text and getting a translation in return.
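The call above can be sketched in Python as follows. This is a minimal sketch, not an official client: the function names, the http:// scheme, and the assumption that the response body is UTF-8 are all mine. Optional parameters are passed through as keyword arguments, so their names should come from the documentation at www.worldwidelexicon.org/t .

```python
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Assumption: the service is reachable over plain http at this host.
WWL_BASE = "http://www.worldwidelexicon.org"


def translation_url(source_lang, target_lang, text, **params):
    """Build a URL of the form /t/<source>/<target>/<text>?<optional params>."""
    # Percent-encode each path segment so spaces and slashes in the text are safe.
    path = "/".join(quote(part, safe="") for part in (source_lang, target_lang, text))
    url = f"{WWL_BASE}/t/{path}"
    if params:
        url += "?" + urlencode(params)
    return url


def fetch_translation(source_lang, target_lang, text, opener=urlopen, **params):
    """Call /t and return the response body as text (UTF-8 is an assumption)."""
    with opener(translation_url(source_lang, target_lang, text, **params)) as response:
        return response.read().decode("utf-8")
```

The opener argument exists only so the network call can be swapped out, for example in tests; in normal use you would call fetch_translation("en", "es", "Hello") and insert the result into the page.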

Note to translation services:

Your API for requesting translations should be something simple, in the form of:

http://www.yourservice.com/t/[source language]/[target language]/[text]?[optional parms]

So, to insert a translation into a page as it is served, you simply call a function to obtain the best available translation via this interface and insert it in place of the original text. This is a simple example. To make this run efficiently, you'll want to cache translations locally (in memcached, for example), so that frequently translated texts are served from memory rather than requiring a call out to an external web service. As you can see, though, the basic process is very simple.
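The caching idea might look something like the sketch below. An in-process dictionary stands in for memcached here, and the class name, the one-hour expiry, and the fetch callable (any function that takes source language, target language and text, and returns a translation) are assumptions made for illustration.

```python
import time


class TranslationCache:
    """Small in-process cache keyed by (source, target, text); a stand-in for memcached."""

    def __init__(self, fetch, ttl_seconds=3600, clock=time.monotonic):
        self._fetch = fetch        # callable that actually asks the translation service
        self._ttl = ttl_seconds    # how long a cached translation stays fresh
        self._clock = clock        # injectable so expiry can be tested without sleeping
        self._entries = {}         # key -> (expiry_time, translation)

    def translate(self, source_lang, target_lang, text):
        key = (source_lang, target_lang, text)
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]          # served from memory, no external call
        translation = self._fetch(source_lang, target_lang, text)
        self._entries[key] = (now + self._ttl, translation)
        return translation
```

A short expiry keeps newly submitted human translations from being hidden behind a stale machine translation for long; the right value depends on how often your texts are edited.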

NOTE: the default behavior of the /t service is to include a snippet of HTML/Javascript that triggers a popup editor that writes back to the WWL translation server. You can disable user edits by setting the parameter edit=n, or by setting output=text (to force a plain text response with no HTML). We are also working on improving the popup editor to make it look like a generic dialog box that will fit well in most sites.

Example 2 : Enabling Users To Submit and Edit Translations

In the first example, the /t web service provides a simple editing tool. You can also make the translations editable with a simple callback to the /submit API (located at www.worldwidelexicon.org/submit ), so you can implement a custom editing interface using whatever design and interaction rules you want.

For example, you can support this by creating a popup HTML/Javascript form that appears when the user mouses over the translated text. The popup form displays the original text adjacent to an editable text box with the translated text, if a translation is available. This form either submits the translation to www.worldwidelexicon.org/submit (if it can write directly to the WWL server), or to a proxy script on your server that, in turn, echoes the edit to the /submit API. You can go to www.worldwidelexicon.org/submit to see documentation for what parameters it expects. It's a simple form submission.

You can control who can edit translations simply by deciding whether or not to include this popup form on a per-page or per-text basis, for example, to allow users to edit some parts of a page but not others.

This can be done in a number of ways, either by embedding some Javascript to trigger a popup form on mousing over a translation (if you prefer a "web 2.0" approach), or by linking the text or an edit icon to a popup form that in turn writes back to WWL. However you decide to implement the editor, all it needs to do is submit a form to the /submit web service.
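For the proxy-script route, the server-side half could be sketched like this. It only wraps whatever form fields your editor collected into a request aimed at /submit. The function name, the http:// scheme and the use of POST with a URL-encoded body are assumptions; the actual field names are whatever www.worldwidelexicon.org/submit documents, so none are hard-coded here.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: the service is reachable over plain http at this host.
SUBMIT_URL = "http://www.worldwidelexicon.org/submit"


def build_submit_request(fields):
    """Wrap the editor's form fields in a form-encoded POST request to /submit.

    `fields` is a dict of parameter names and values; take the names from the
    /submit documentation rather than from this sketch.
    """
    body = urlencode(fields).encode("utf-8")
    return Request(
        SUBMIT_URL,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
```

Your proxy script would then pass the returned object to urllib.request.urlopen and relay the response to the browser.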

Example 3 : Scoring Translations

In addition to allowing users to submit and edit translations, you can also allow them to score translations, so the system can learn which contributors are good. This is done through the /scores/vote API, which you call by opening the URL:

www.worldwidelexicon.org/scores/vote?guid=uniqueid&votetype=up|down|block&username=wwlusername&pw=wwlpw

It replies with a simple 'ok' or an error message. The GUID is a record locator for the translation being scored (it is returned in calls to the /q API, which returns a structured list of translations). The system looks up the translation and who created it, then calculates that contributor's score. Summary scores are included in the translation recordsets returned by the /q interface.
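Building the vote URL is mostly a matter of encoding the four parameters shown above. A minimal sketch, with a hypothetical function name and an assumed http:// scheme:

```python
from urllib.parse import urlencode

# Assumption: the service is reachable over plain http at this host.
VOTE_URL = "http://www.worldwidelexicon.org/scores/vote"
VOTE_TYPES = ("up", "down", "block")


def vote_url(guid, votetype, username, pw):
    """Return the /scores/vote URL for one vote on the translation identified by guid."""
    if votetype not in VOTE_TYPES:
        raise ValueError(f"votetype must be one of {VOTE_TYPES}, got {votetype!r}")
    query = urlencode(
        {"guid": guid, "votetype": votetype, "username": username, "pw": pw}
    )
    return f"{VOTE_URL}?{query}"
```

Opening the resulting URL (for example with urllib.request.urlopen) should produce the 'ok' or error reply described above.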

Example 4 : Comments

WWL also includes a service for fetching and submitting comments about translations. This is a useful tool for enabling discussion and footnotes about translations. With it, you can encourage users to discuss disagreements over a translation in the footnotes. It is also interesting as a way to encourage discussion about language, since the same idea may be expressed in different ways in different regions.

To fetch comments, you simply call /comments/get to request comments about translations, which can be keyed to a parent URL (to request all comments about translations for a particular page), to a specific source text, or a specific translation for a source text. It returns the comments in the output format of your choice (XML, JSON, RSS, etc) in date order.

To submit a comment, you simply call /comments/submit with a short list of parameters to identify which translation or source text the comment is about, the text of the comment, etc.
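As a rough sketch, a helper for the fetch side only needs to attach your query parameters to /comments/get. This article does not list the parameter names, so the sketch accepts them as keyword arguments instead of guessing; the function name and http:// scheme are assumptions.

```python
from urllib.parse import urlencode

# Assumption: the service is reachable over plain http at this host.
WWL_BASE = "http://www.worldwidelexicon.org"


def comments_get_url(**params):
    """Build a /comments/get URL from caller-supplied query parameters.

    The parameter names (for the parent URL, source text, output format and
    so on) must come from the WWL documentation; none are defined here.
    """
    url = f"{WWL_BASE}/comments/get"
    if params:
        url += "?" + urlencode(params)
    return url
```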

Filtering Translations and Controlling Access

WWL is based on an ad hoc process where it accepts submissions from anyone, and then filters them when a client submits a request for translations. For example, when submitting a request for translations, you can include the following filters:

  • allow_machine = y/n : allow machine translations (default = y)
  • allow_anonymous = y/n : include anonymous translations in results (default = y)
  • minimum_score = 0..5 : require minimum weighted quality score for translations
  • require_professional : require professional translation (default = n; if n, the system will prioritize professional translations but fall back to user or machine translations)
  • users = comma separated list of whitelisted users to limit results to
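The filters listed above translate naturally into a small helper that produces query parameters. In this sketch the function name is hypothetical, and it assumes that require_professional takes y/n like the other flags (the list only states its default). How the source text and languages are passed to /q is not covered here, so the helper returns only the filter portion.

```python
from urllib.parse import urlencode


def filter_params(allow_machine=True, allow_anonymous=True, minimum_score=None,
                  require_professional=False, users=None):
    """Return the filter parameters as a dict, ready to be URL-encoded."""
    def yn(flag):
        return "y" if flag else "n"

    params = {
        "allow_machine": yn(allow_machine),
        "allow_anonymous": yn(allow_anonymous),
        # Assumption: y/n values, inferred from "default = n".
        "require_professional": yn(require_professional),
    }
    if minimum_score is not None:
        if not 0 <= minimum_score <= 5:
            raise ValueError("minimum_score must be between 0 and 5")
        params["minimum_score"] = str(minimum_score)
    if users:
        params["users"] = ",".join(users)  # comma separated whitelist
    return params


# Example: human translations only, scored 3 or better, from two trusted users.
query = urlencode(filter_params(allow_machine=False, minimum_score=3,
                                users=["alice", "bob"]))
```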

In addition, you can apply your own rules to translations when using the /q interface, because it includes a lot of metadata along with the translations (IP address, location, average score for translator, etc.) which may be useful in making an allow/hide decision. In any case, we push decision-making authority about how to display translations to the edge rather than impose centralized rules on everyone, because what works for one site may not be a good fit for another.
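An edge-side rule of this kind could be sketched as below. It assumes the /q recordset has already been parsed into a list of dictionaries; the keys "avg_score" and "ip" are hypothetical stand-ins for whatever field names /q actually returns, and the thresholds are purely illustrative.

```python
def visible_translations(records, min_avg_score=0.0, blocked_ips=frozenset()):
    """Keep only the records that pass this site's own allow/hide rules.

    `records` is a list of dicts parsed from a /q response. The keys used
    here ("avg_score", "ip") are hypothetical; substitute the real field names.
    """
    return [
        record for record in records
        if record.get("avg_score", 0.0) >= min_avg_score
        and record.get("ip") not in blocked_ips
    ]
```

Because the rule lives in your own code, two sites can query the same translation memory and still show different results.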

Call For Developers

WWL is an open source platform, with code available for both our cloud-based translation memory and our interface tools, such as our Firefox Translator. Now that the API is complete, we're reaching out to developers to contribute extensions and libraries that make embedding WWL in other environments easy. We're looking for both low-level libraries, such as a convenience library for Ruby on Rails, and application-level plugins, such as an extension for your favorite content management system.

Libraries

We'd like to offer a fairly broad selection of libraries that cover the programming languages used for web development. We primarily work in Python and Javascript, and have those covered, but are looking for developers who work in other environments, such as PHP, Perl, .Net, and so on. What we would like to provide users with is a simple single-file library that provides convenience functions that implement the major features in WWL, which are to fetch and submit translations (with local caching), fetch and submit comments, and to score translations, plus a handful of user related tasks (login, logout, create account). If you're proficient in an environment, you should be able to build something like this quickly, scratch your own itch, and help others out by contributing to the project.

Addons and Application Level Tools

The next step up from this is to embed WWL more deeply within popular web app and content management system environments, such as WordPress. For example, if you're proficient in a particular CMS, it would be great if you could contribute a plugin that makes adding WWL to that environment an easy install. This may be a simple or difficult task depending on the environment you're working in.

We are developing Javascript overlay tools that make it easy to apply translations as an overlay to pretty much any website, but this approach has a number of issues, so in the long run we'd like to have better integrated tools that generate translations on the web server (which is especially useful for search engine optimization).

Help Build The Multilingual Web

WWL is an ambitious project. Our goal is to make collaborative translation an embedded service that is built into most publishing and web app development environments. If we do this, translation will become a common feature throughout the web, and eventually it will become an ambient service that millions of people use, often without realizing it. It's fun stuff to work with, and could bring a lot of good to a lot of people. So if you'd like to help us make that happen, drop me a line at bsmcconnell /at/ gmail to learn more.

