Brad Fitzpatrick (brad) wrote,

Thoughts on language translation

I was talking to toast about this a little before class today. Most machine translation sucks because it tries to do everything in one shot, without giving the end user a chance to help out in the middle.

See, translation isn't as easy as mapping a set of source words onto the equivalent set in the destination language. (Well, the US Government thought so during the Cold War and funded a bunch of efforts to auto-translate Russian military texts...)

There are two steps in translation: converting the input text to a rich interlingua, then converting that interlingua into the destination language.

What I want is to be able to either enter text directly in the interlingua (which will be painful and slow, but accurate) or, better, let the translator show me the interlingua it generated, let me correct any errors it made from lack of context and/or intelligence, and then do the interlingua-to-destination-language step.

That last step is really easy once you have the correct parse tree of the sentence. And that's the part I want available to me that Babelfish, say, doesn't provide. So I'm now interested in starting my own, unless there's already a good open project out there that I can help with.
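To make the idea concrete, here's a rough sketch of the workflow described above: analyze the text into an interlingua, let the user fix it, then generate. Everything in it is made up for illustration: the Frame class, the SENSES table, the tiny Spanish lexicon, and the analyze/alternatives/generate functions are all hypothetical, and the "analyzer" only understands one sentence pattern. A real interlingua would be far richer than this.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Frame:
    """Toy interlingua (hypothetical): one predicate with sense-tagged arguments."""
    predicate: str  # e.g. "see"
    tense: str      # e.g. "past"
    agent: str      # e.g. "speaker"
    patient: str    # a sense tag such as "bat/animal"


# Hypothetical sense inventory; the first entry is the analyzer's default guess.
SENSES = {
    "bat": ["bat/animal", "bat/club"],
    "bank": ["bank/money", "bank/river"],
}

# Hypothetical Spanish lexicon, keyed by sense rather than by source word.
SPANISH_NOUNS = {
    "bat/animal": "el murciélago",
    "bat/club": "el bate",
    "bank/money": "el banco",
    "bank/river": "la orilla",
}
SPANISH_VERBS = {("see", "past", "speaker"): "vi"}


def analyze(text):
    """Step 1 (toy): source text -> interlingua. Only handles 'I saw the <noun>'."""
    words = text.lower().rstrip(".").split()
    if len(words) != 4 or words[:3] != ["i", "saw", "the"]:
        raise ValueError("toy analyzer only understands 'I saw the <noun>'")
    noun = words[3]
    if noun not in SENSES:
        raise ValueError("no senses known for %r" % noun)
    # With no context to go on, just take the first listed sense.
    return Frame("see", "past", "speaker", SENSES[noun][0])


def alternatives(frame):
    """The senses a user could pick from when reviewing the analyzer's guess."""
    word = frame.patient.split("/")[0]
    return SENSES[word]


def generate(frame):
    """Step 2 (toy): interlingua -> Spanish text."""
    verb = SPANISH_VERBS[(frame.predicate, frame.tense, frame.agent)]
    return (verb + " " + SPANISH_NOUNS[frame.patient]).capitalize() + "."


guess = analyze("I saw the bat.")
print(alternatives(guess))   # what the user gets to choose between
fixed = replace(guess, patient="bat/club")  # the user's correction
print(generate(guess))       # translation of the uncorrected guess
print(generate(fixed))       # translation after the correction
```

The point of the sketch is just that the user's correction happens on the frame, between the two steps, so generation never has to guess which "bat" was meant.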

On the other hand, that first step is interesting too... with Dan Brian's work on tagging words with their correct WordNet senses, generating parse trees gets all the easier: the input tokens are actually meaningful now (words by themselves are useless).

Anyway, all very interesting. Wish I had 50 hours a day to play.

Update: this looks kinda cool, but nothing's there yet. I should join the mailing lists.