Speaker: Vamshi Ambati
Date: 15 July 2008
Abstract:
Syntax-based approaches to statistical MT require syntax-aware methods for acquiring their underlying translation models from parallel data. This acquisition process can be driven by syntactic trees for either the source or target language, or by trees on both sides. Work to date has demonstrated that using trees for both sides suffers from severe coverage problems. Approaches that project from trees on one side, on the other hand, have higher levels of recall, but suffer from lower precision, due to the lack of syntactically aware word alignments.
In this talk I will first discuss the extraction process and the lexical coverage of the translation models learned in both of these scenarios. We will look specifically at how the non-isomorphic nature of the parse trees for the two languages affects recall and coverage. I will then discuss a novel technique for restructuring target parse trees that generates highly isomorphic target trees while preserving the syntactic boundaries of constituents that were aligned in the original parse trees. I will conclude with an experimental evaluation on an English-French MT system.
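To make the restructuring idea concrete, here is a minimal sketch (in Python, with nltk-style trees) of the span projection that such an approach builds on: each source constituent's span is mapped through the word alignment onto the target side, and only alignment-consistent projections are kept as target-side brackets. All names and the consistency test below are illustrative assumptions; the method presented in the talk additionally preserves the constituent boundaries of the original target parse.

from nltk import Tree

def yield_spans(tree, start=0):
    """Yield (label, (start, end)) for every constituent, where
    [start, end) indexes the words the constituent covers."""
    pos = start
    for child in tree:
        if isinstance(child, str):
            pos += 1
        else:
            yield from yield_spans(child, pos)
            pos += len(child.leaves())
    yield tree.label(), (start, pos)

def project_brackets(src_tree, alignment):
    """Map each source span through the alignment (a set of
    (source index, target index) pairs); keep a projection only if
    the target words it covers align back inside the source span.
    The surviving brackets can then be assembled into a restructured
    target tree."""
    brackets = []
    for label, (i, j) in yield_spans(src_tree):
        tgt = {t for s, t in alignment if i <= s < j}
        if not tgt:
            continue
        lo, hi = min(tgt), max(tgt) + 1
        back = {s for s, t in alignment if lo <= t < hi}
        if all(i <= s < j for s in back):
            brackets.append((label, lo, hi))
    return brackets

# Toy example with crossing alignment links:
src = Tree.fromstring("(S (NP (NN trees)) (VP (MD can) (VB help)))")
alignment = {(0, 2), (1, 0), (2, 1)}     # trees->2, can->0, help->1
print(project_brackets(src, alignment))  # VP projects to target span (0, 2)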
Tuesday, November 13, 2007
Trees that can help
Speaker: Alok Parlikar
Title: (S (NP (NP Trees) (SBAR (WHNP that) (S (VP can)))) (VP help))
Summary:
For the past two months, I have been working with Alon Lavie and Stephan
Vogel on Chinese and English parse trees, investigating answers to the
following questions:
(a) Can constituency information and word-level alignments be used to
    align nodes in trees of parallel sentences? How precisely matched
    (in meaning) are the yields of these aligned nodes? (A sketch of
    this node-alignment idea follows the list.)
(b) Can the parse trees and word-level alignments be used to learn
    reordering rules? If we use these rules to reorder source sentences,
    can we do any better at translation? (A sketch of rule application
    appears at the end of this summary.)
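Here is a minimal sketch of the node-alignment idea in question (a), assuming nltk-style parse trees and a word alignment given as a set of (source index, target index) pairs. The helper names and the exact matching test are illustrative assumptions, not necessarily the method used in these experiments.

from nltk import Tree

def yield_spans(tree, start=0):
    """Yield (node, (start, end)) for every constituent, where
    [start, end) indexes the words the node covers."""
    pos = start
    for child in tree:
        if isinstance(child, str):
            pos += 1
        else:
            yield from yield_spans(child, pos)
            pos += len(child.leaves())
    yield tree, (start, pos)

def align_nodes(src_tree, tgt_tree, alignment):
    """Pair a source node with a target node when the target words
    aligned to the source node's yield exactly fill that target node's
    span. Where nested target nodes share a span (unary chains), the
    outermost one is kept."""
    tgt_by_span = {span: node for node, span in yield_spans(tgt_tree)}
    pairs = []
    for s_node, (i, j) in yield_spans(src_tree):
        tgt = {t for s, t in alignment if i <= s < j}
        if not tgt:
            continue
        span = (min(tgt), max(tgt) + 1)
        if span in tgt_by_span:
            pairs.append((s_node, tgt_by_span[span]))
    return pairs

# Toy example: "can" is unaligned, so MD aligns to no target node.
src = Tree.fromstring("(S (NP (NN trees)) (VP (MD can) (VB help)))")
tgt = Tree.fromstring("(S (NP (NN arbres)) (VP (VB aident)))")
alignment = {(0, 0), (2, 1)}
for s, t in align_nodes(src, tgt, alignment):
    print(s.label(), "<->", t.label())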
The current results show that:
(a) - Node alignments derived from hand-aligned data are very precise.
    - Using automatic word alignments to align nodes gives over 70%
      precision and over 40% recall.
(b) Using a 10-best reordering of the source sentences with a "dumb"
    reordering strategy has shown a 0.005 improvement in BLEU score.
I would like to talk about the approaches we have taken here, and to
discuss strategies for improving these results.
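To make the reordering in question (b) concrete, here is a minimal sketch of applying one reordering rule to a source parse, again with nltk trees. The rule format here (a parent label plus its child-label sequence, mapped to a permutation of the children) is an assumption for illustration, not necessarily the rule format used in this work.

from nltk import Tree

def reorder(tree, rules):
    """Recursively permute a node's children whenever a rule matches
    the parent label and the sequence of child labels."""
    if isinstance(tree, str):                    # leaf word: nothing to do
        return tree
    children = [reorder(c, rules) for c in tree]
    labels = tuple(c if isinstance(c, str) else c.label() for c in tree)
    perm = rules.get((tree.label(), labels))
    if perm is not None:
        children = [children[i] for i in perm]
    return Tree(tree.label(), children)

# Toy rule: put the VP before the NP under S.
rules = {("S", ("NP", "VP")): (1, 0)}
t = Tree.fromstring("(S (NP (NN trees)) (VP (MD can) (VB help)))")
print(" ".join(reorder(t, rules).leaves()))      # -> can help trees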