Title: Models for Synchronous Grammar Induction for Statistical Machine Translation
Presenters: Chris Dyer, LTI & Desai Chen, CSD undergraduate
Tuesday, September 14, from noon to 1:30pm in GHC 6501.
Abstract: The last decade of research in Statistical Machine Translation (SMT) has seen rapid progress. The most successful methods have been based on synchronous context-free grammars (SCFGs), which encode translational equivalences and license reordering between tokens in the source and target languages. Yet, while closely related language pairs can now be translated with a high degree of precision, the results for distant pairs are far from acceptable. In theory, however, the "right" SCFG is capable of handling most, if not all, structurally divergent language pairs. This talk will report on the results of the 2010 Language Engineering Workshop held at Johns Hopkins University, whose goal was to focus on the crucial practical aspects of acquiring such SCFGs from bilingual, but otherwise unannotated, text. We started with existing algorithms for inducing unlabeled SCFGs (e.g., the popular Hiero model) and then used unsupervised learning methods to refine the syntactic constituents used in the translation rules of the grammar.
The MT Lunch Seminar Series is an informal discussion group where researchers in the area of Machine Translation present their research and seek feedback from the MT groups at CMU. Talks are scheduled for the second Tuesday of the month at noon in GHC 4405, unless otherwise noted.