Wednesday, October 6, 2010

Google Translate spews profanities in Filipino

It may be funny, but it underscores a problem that's nothing to laugh about.

Making the rounds in social media today are screengrabs of errors produced using Google Translate that show bewilderingly profane Filipino translations for English-language scientific terms.

For example, Google Translate converts "titration" (a common chemistry lab procedure) into "pagsukat sa t***". Attempting to translate the noun's verb form, "titrate", yields the similarly profane-sounding but nonsensical "t***in".

Google Translate, an online language translation tool, also renders other scientific terms into less risqué but no less bewildering Filipino translations: "sublimation" becomes "pangingimbabaw"; "diffusion" is "pagsasabog"; "inorganic" is "tulagay".

Google Translate and similar online tools (such as Yahoo! Babel Fish) are not designed to produce perfectly fluent translations. Their output is often syntactically and grammatically incorrect, but it is generally expected to be a reasonable approximation of the meaning of the original text.

Google Translate, for its part, performs this task by comparing large volumes of online text and looking for similar patterns in a process called "statistical machine translation".



This means that an English word doesn't need a direct Filipino equivalent; Google Translate just needs a large enough sample of human-translated documents to pick up the idioms and turns of phrase particular to the language being translated.
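To make the idea concrete, here is a minimal sketch in Python of how translations can be guessed purely from patterns in human-translated text. It is not Google's actual system: the tiny parallel corpus, the function names `train` and `best_translation`, and the use of a simple co-occurrence score (the Dice coefficient) are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product

# Hypothetical toy corpus of (English, Filipino) sentence pairs.
# A real system would learn from millions of human-translated documents.
PARALLEL_CORPUS = [
    ("water", "tubig"),
    ("hot water", "mainit na tubig"),
    ("cold water", "malamig na tubig"),
    ("hot coffee", "mainit na kape"),
    ("cold coffee", "malamig na kape"),
]


def train(corpus):
    """Count how often each word appears, and how often word pairs co-occur."""
    src_counts, tgt_counts, pair_counts = Counter(), Counter(), Counter()
    for src, tgt in corpus:
        src_words = set(src.split())
        tgt_words = set(tgt.split())
        src_counts.update(src_words)
        tgt_counts.update(tgt_words)
        # Every source word is a candidate match for every target word
        # that appears in the same sentence pair.
        pair_counts.update(product(src_words, tgt_words))
    return src_counts, tgt_counts, pair_counts


def best_translation(word, model):
    """Return the target word that co-occurs most consistently with `word`.

    Returns None when the word never appears in the training corpus.
    """
    src_counts, tgt_counts, pair_counts = model
    if word not in src_counts:
        return None
    # Dice coefficient: rewards pairs that nearly always appear together.
    scores = {
        tgt: 2 * n / (src_counts[word] + tgt_counts[tgt])
        for (src, tgt), n in pair_counts.items()
        if src == word
    }
    return max(scores, key=scores.get)


model = train(PARALLEL_CORPUS)
print(best_translation("water", model))
print(best_translation("hot", model))
print(best_translation("titration", model))
```

In this sketch, "water" and "hot" are matched correctly because they show up often enough alongside their Filipino counterparts. "Titration" never appears in the sample, so there is nothing to go on. A production system would still try to produce something from whatever thin evidence it has, which is how oddities creep in.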

The fact that Google Translate seems to fail more often on scientific terms points to a dearth of Filipino-language science documents online.

And this is certainly no laughing matter, according to Dr. Isagani Tapang, an associate professor of physics at the University of the Philippines in Diliman and chair of AGHAM.

"This is directly indicative of scientific output in the Philippines. Maliit na nga ang output, mas konti pa ang sinasalin sa Filipino (The output itself is small, and the number of Filipino translations is even smaller)," he said.

Tapang stressed the need for Filipino-language scientific articles for educational purposes. What's needed isn't necessarily direct word-for-word translations of scientific terms, he says, but rather instructional and educational materials that discuss science in the vernacular.

"For example, we know what a 'transistor' is even if it's an English term. But when we talk about it in the classroom, the discourse is in Filipino," he noted.

"You still need to report scientific developments in the vernacular, otherwise it will remain in the original language. Hindi na sya mababasa, so paano sya lalaganap? (It won't get read, so how will it spread?)," Tapang said.

He also pointed out that the effort need not encompass large bodies of text just yet. "Even just Filipino-language abstracts of scientific papers will be a great help," he said.

Based on Tagalog Wikipedia statistics alone, as of Oct. 5, only 246 of the more than 20,000 articles on the site are about science.
