The Seattle Times Company


Microsoft Pri0

Welcome to Microsoft Pri0: That's Microspeak for top priority, and that's the news and observations you'll find here from Seattle Times reporter Sharon Chan.

July 30, 2008 10:58 AM

Microsoft Translator quietly making progress in machine translation

Posted by Benjamin J. Romano

I didn't have space in today's print story to share an update on Microsoft's translation efforts, which have been in the market -- quietly -- for nearly a year. The Microsoft Translator group is unusual in that it's still part of the Microsoft Research team, even though it's a live product. That's also why it was on display at the Microsoft Research Faculty Summit yesterday.

Microsoft is going up against the established online translator -- Babel Fish, which Yahoo owns -- and Google Translate.

Lane Rau, marketing manager for Microsoft Translator, demonstrated a side-by-side Web page translator that could be particularly useful to people who are familiar with, but not fully comfortable in, another language. The "bilingual viewer" shows an original Web page on one half of the screen and a translated version on the other half. Holding the mouse cursor over a block of text highlights it on both pages so you can compare the translated text with the original.

I've played around with Babel Fish and Google Translate a bit to see if they can do the same. If they can, it wasn't obvious to me.

Another view of Microsoft Translator shows the translated page in full and provides the original text in a small box when you hover the cursor over a text block.

It looks like a great tool for people learning a new language, and Rau said the service does get used that way.

But the purpose of the bilingual viewer is to make machine translation usable now, even as researchers continue to improve it.

"You can compare sentence by sentence," Rau said. "So, what that means is if there is an error, for example my name is Lane and that could be translated as street in another language ... you can go, oh, that was a name, so that's why that mistake happened. And you can actually understand it a lot better."

Microsoft is refining what's called statistical machine translation. It starts with a database of millions of parallel sentences, each paired with its translation in another language. When a new sentence is entered for translation, the system looks through the database to select the most likely meaning or grammatical structure.
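For readers who want to see the idea in miniature, here is a rough sketch of how word translations can be learned from parallel sentences alone. It uses IBM Model 1, a classic textbook technique; nothing here says this is the model Microsoft Translator uses. The four-sentence English-Spanish corpus and the function names are made up for illustration.

```python
from collections import defaultdict

# Hypothetical toy parallel corpus (English, Spanish).
# A real system would use millions of sentence pairs.
PARALLEL = [
    ("the house", "la casa"),
    ("the street", "la calle"),
    ("a house", "una casa"),
    ("a street", "una calle"),
]


def train(pairs, iterations=10):
    """Estimate word translation probabilities with IBM Model 1 (EM)."""
    pairs = [(src.split(), tgt.split()) for src, tgt in pairs]
    target_vocab = {w for _, tgt in pairs for w in tgt}
    # prob[(target_word, source_word)] starts out uniform.
    prob = defaultdict(lambda: 1.0 / len(target_vocab))
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        # E-step: spread each target word's count over the source words
        # in the same sentence, in proportion to the current probabilities.
        for src, tgt in pairs:
            for tw in tgt:
                norm = sum(prob[(tw, sw)] for sw in src)
                for sw in src:
                    frac = prob[(tw, sw)] / norm
                    count[(tw, sw)] += frac
                    total[sw] += frac
        # M-step: re-estimate the probabilities from the expected counts.
        for (tw, sw), c in count.items():
            prob[(tw, sw)] = c / total[sw]
    return prob


def translate_word(prob, source_word):
    """Pick the most probable target word for one source word."""
    candidates = {tw: p for (tw, sw), p in prob.items() if sw == source_word}
    return max(candidates, key=candidates.get)


model = train(PARALLEL)
print(translate_word(model, "street"))
```

No one typed in a rule saying "street" means "calle"; the pairing falls out of which words keep showing up together. With so little data the sketch handles only single words, which hints at why the real thing needs so many sentences.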

"It takes a ton of data -- millions of sentences in parallel," Rau said.

The company is relying on its huge archive of professionally translated software documentation and also exploring other sources such as World Health Organization translations, she said.

Statistical machine translation differs from rule-based translation, which relies on exhaustive sets of rules for each language, manually entered by humans. Babel Fish uses rule-based translation, powered by Systran, a French company.
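By way of contrast, a rule-based translator can be pictured as a hand-entered dictionary plus ordering rules. The sketch below is a toy, not how Systran works; the lexicon, the single adjective rule and the function name are all assumptions made up for illustration.

```python
# Hypothetical hand-entered English-to-Spanish lexicon.
# Every entry and every rule has to be typed in by a person.
LEXICON = {
    "my": "mi", "name": "nombre", "is": "es", "lane": "calle",
    "the": "la", "red": "roja", "house": "casa",
}
ADJECTIVES = {"red"}


def translate(sentence):
    """Translate word by word after applying one reordering rule."""
    words = sentence.lower().split()
    # Rule: English adjective-noun order becomes noun-adjective in Spanish.
    i = 0
    while i < len(words) - 1:
        if words[i] in ADJECTIVES:
            words[i], words[i + 1] = words[i + 1], words[i]
            i += 2
        else:
            i += 1
    # Words missing from the lexicon pass through unchanged.
    return " ".join(LEXICON.get(w, w) for w in words)


print(translate("the red house"))    # la casa roja
print(translate("My name is Lane"))  # mi nombre es calle
```

The second line reproduces the kind of mistake Rau described: the lexicon has no way to know that "Lane" is a name, so it comes out as the Spanish word for street. Fixing that means adding yet another rule by hand.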

Microsoft Translator also uses Systran for standard, non-technical translations in some languages, Rau said. "We're actually working on replacing that right now," she added.

Systran was previously used by Google Translate, but according to unofficial reports that surfaced last fall, Google has switched off Systran and is now using its own statistical machine translation.

Most users still find plenty to fault with machine translation, be it statistical or rule-based.

Both Microsoft and Google see major benefits in the statistical route.

"It takes a long time to develop a good enough database of rules," Rau said. "The nice thing about statistical translation is, once we get to that quality level -- we're working on that right now -- it actually can continue to improve and you can scale it across many more languages because you don't have to have the humans typing in manually different rules."

Right now, Microsoft Translator offers about a dozen language pairs.

The company hasn't done much marketing around its offering yet, focusing instead on improving the quality and integrating it into other products, such as Live Search. "We're working on integrating it into Office," Rau said.

