Monday, March 30, 2009

TwitterFish: Bridging the Language Gap

A common problem for knowledge management programs -- especially those that span multiple countries or continents -- is bridging the gap between languages. People obviously feel more comfortable communicating in their native language and in many cases cannot communicate well -- if at all -- in other languages.

For formal documents such as white papers or reports, there is no easy solution to this problem. Automated translation services exist but the results are often rudimentary, at times amusing, and at worst they can actually be misleading or just plain wrong. There is very little choice but to do manual translations for important documents, insist that everyone communicate using one common language, and/or live with a Babel-like ignorance of the knowledge and expertise of other countries.

Because of their limited usefulness for published documents, automated translation services have been shunned by most KM programs. But are they really so bad? Or are there cases where automated translation is not only "good enough" but provides a vital missing link for multilingual teams?

I was recently working with an organization that operates in four different locations around the world, in four separate languages. Clearly, the language barrier is a significant obstacle for them. It turns out, however, that within each geographic region team members communicate frequently among themselves through IM and mobile texting.

The good news is that communication is happening. The bad news, from a knowledge management perspective, is that the language barrier has become a permanent wall separating groups of employees and the insights they hold.

The usual KM solution to this problem is to try to get each group to capture their learnings in white papers, reports, and other written documents. The problem of translating those documents is then addressed as a separate task. The language problem is exchanged for a translation problem and significant extra work for everyone. This is in addition to the many bright ideas and offhand stories that are lost in the move from conversation to written documents (i.e., implicit vs. explicit knowledge).

But if you take a step back, translating everything (or even a select portion identified as "important") before determining whether it is actually going to be useful is inefficient and almost guaranteed to be prohibitively expensive. What is really needed is a way to get a rough sense of whether something is of interest before making the effort to establish connections across the language boundary.

Which is exactly what automated translation is good at. Trying to follow a procedural document written in a foreign language -- or translated badly -- without other assistance can be both difficult and dangerous (depending on how risky mistakes are). But knowing that such knowledge exists, even if you can't read it all, can save hours or days trying to recreate the learnings that have already been captured.

What would we give for a way to "listen in" to conversations -- no matter what the language -- to see if there was either a discussion we could contribute to or knowledge we could use?

Well, we have ways to listen in through social computing. Forums, blogs, and microblogging move the one-to-one conversation to a broader social platform. Microblogging services in particular, such as Twitter, provide almost all of the immediacy and interaction of IM but to a much larger audience. All that is missing is the ability to read the different languages.

Which is where TwitterFish comes in. TwitterFish is a prototype to demonstrate the effectiveness of automated translation services for identifying potential points of useful information.

Twitter already provides a translation feature for its search interface, but not for the public timeline or the stream of your friends' updates. TwitterFish lets you select a language and translate all updates into that language on the fly. If you find something interesting, you can also click on a specific individual to see just their status updates.
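For the technically inclined, the flow described above can be sketched in a few lines of Python. This is only an illustration of the idea, not TwitterFish's actual code: the `Update` record, the `translate_timeline` function, and the `toy_translate` stand-in are all hypothetical names, and the sketch assumes each update already carries a language code (a real service would have to detect it). Any automated translation service could be plugged in wherever `toy_translate` appears.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class Update:
    author: str
    text: str
    lang: str  # assumed: language code of the original text, e.g. "es"


# Hypothetical shape of a translation service:
# (text, source_lang, target_lang) -> translated text
Translator = Callable[[str, str, str], str]


def translate_timeline(
    updates: Iterable[Update],
    target_lang: str,
    translate: Translator,
    author: Optional[str] = None,
) -> List[Update]:
    """Return the updates in target_lang, optionally limited to one author."""
    result = []
    for update in updates:
        # "Click on an individual": keep only that person's updates.
        if author is not None and update.author != author:
            continue
        if update.lang == target_lang:
            # Already in the viewer's language; pass it through untouched.
            result.append(update)
        else:
            translated = translate(update.text, update.lang, target_lang)
            result.append(Update(update.author, translated, target_lang))
    return result


def toy_translate(text: str, source: str, target: str) -> str:
    # Stand-in for a real translation service: it only tags the text.
    return f"[{source}->{target}] {text}"


timeline = [
    Update("ana", "hola a todos", "es"),
    Update("bob", "hello all", "en"),
]
for update in translate_timeline(timeline, "en", toy_translate):
    print(f"{update.author}: {update.text}")
```

Keeping the author attached to every translated update is the important design choice here: it preserves the link between the words and the person who wrote them.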

The translations are still rough. You cannot rely on them alone. But the point is they give you a window into what people are discussing in other languages that is not available in any other form. What's more, each message is associated with a person. So if you do find a piece of information you want to follow up on, you can start a conversation directly with those involved. Unlike translated documents, where the text is all you have, in social applications such as Twitter you have both the words and the people.

TwitterFish is just a prototype. Viewing the public timeline (the default) is interesting but not necessarily useful. However, it does demonstrate the potential of automated translation services for dynamic data. The techniques used to create TwitterFish would be far more effective for groups bounded by a common interest. For example:

  • Apply TwitterFish to Yammer, the business version of Twitter, where only messages from within a single company are visible.
  • Create a Twitter account that "friends" a specific, global community of users, such as a professional organization. The account's stream can then be recast -- or displayed on the organization's web site -- translated into the viewer's language of choice.
  • Apply the same technique to other dynamic community content, such as forum posts, blog comments, etc.

As a final note, TwitterFish is a fairly simple application. It would not be possible without the generous availability of a number of foundation services. Specifically:


Anonymous said...

I recently came across your blog and have been reading along. I thought I would leave my first comment. I don't know what to say except that I have enjoyed reading. Nice blog. I will keep visiting this blog very often.


Bill Chapman said...

All very interesting. You don't mention Esperanto and its use as a common language. Take a look at