Machine Translation Archive

Subject Index

to methods, techniques, issues and topics

Publications since 2005


Alignment [see also Statistical analysis, Word alignment]

(2007) Dayne Freitag & Shahram Khadivi: A sequence alignment model based on the averaged perceptron. EMNLP-CoNLL-2007: Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, June 28-30, 2007, Prague, Czech Republic; pp. 238-247. [PDF, 180KB]

(2007) Mary Hearne, John Tinsley, Ventsislav Zhechev, & Andy Way: Capturing translational divergences with a statistical tree-to-tree aligner. TMI-2007: Proceedings of the 11th International Conference on Theoretical and Methodological Issues in Machine Translation, Skövde [Sweden], 7-9 September 2007; pp.85-94 [PDF, 394KB]; presentation [PDF, 462KB]

(2007) Sarvnaz Karimi, Falk Scholer, & Andrew Turpin: Collapsed consonant and vowel models: new approaches for English-Persian transliteration and back-transliteration.  ACL 2007: proceedings of the 45th Annual Meeting of the Association for Computational Linguistics, Prague, Czech Republic, June 2007; pp. 648-655 [PDF, 195KB]

(2007) Jae Dong Kim & Stephan Vogel: Iterative refinement of lexicon and phrasal alignment. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.281-288 [PDF, 121KB]

(2007) Tadashi Kumano, Hideki Tanaka, & Takenobu Tokunaga: Extracting phrasal alignments from comparable corpora by using joint probability SMT model. TMI-2007: Proceedings of the 11th International Conference on Theoretical and Methodological Issues in Machine Translation, Skövde [Sweden], 7-9 September 2007; pp.95-103 [PDF, 366KB]

(2007) Patrik Lambert, Rafael E.Banchs, & Josep M.Crego: Discriminative alignment training without annotated data for machine translation. NAACL-HLT-2007 Human Language Technology: the conference of the North American Chapter of the Association for Computational Linguistics, 22-27 April 2007, Rochester, NY; Companion volume, pp.85-88 [PDF, 114KB]

(2007) Ding Liu & Daniel Gildea: Source-language features and maximum correlation training for machine translation evaluation. NAACL-HLT-2007 Human Language Technology: the conference of the North American Chapter of the Association for Computational Linguistics, 22-27 April 2007, Rochester, NY; pp.41-48 [PDF, 201KB]

(2007) Lieve Macken: Analysis of translational correspondence in view of sub-sentential alignment.  METIS-II Workshop: New Approaches to Machine Translation, Centre for Computational Linguistics, Katholieke Universiteit Leuven, Belgium, 11 January 2007; 9pp. [PDF, 182KB]

(2007) Toshiaki Nakazawa, Yu Kun, & Sadao Kurohashi: Structural phrase alignment based on consistency criteria. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.337-344 [PDF, 202KB]

(2007) Nasredine Semmar & Christian Fluhr: Arabic to French sentence alignment: exploration of a cross-language information retrieval approach. ACL 2007: proceedings of the Workshop on Computational Approaches to Semitic Languages: Common Issues and Resources, Prague, Czech Republic, 28 June 2007; pp. 73-80 [PDF, 156KB]

(2007) John Tinsley, Ventsislav Zhechev, Mary Hearne, & Andy Way: Robust language pair-independent sub-tree alignment. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.467-474 [PDF, 1310KB]

(2007) Akira Ushioda: Phrase alignment for integration of SMT and RBMT resources. MT Summit XI Workshop on patent translation, 11 September 2007, Copenhagen, Denmark; pp.8-12. [PDF, 66KB]

(2007) Akira Ushioda: Phrase alignment based on bilingual parsing. TMI-2007: Proceedings of the 11th International Conference on Theoretical and Methodological Issues in Machine Translation, Skövde [Sweden], 7-9 September 2007; pp.241-250 [PDF, 312KB]

(2007) Min Zhang, Hongfei Jiang, Ai Ti Aw, Jun Sun, Sheng Li, & Chew Lim Tan: A tree-to-tree alignment-based model for statistical machine translation. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.535-542 [PDF, 243KB]

(2006) Alexandru Ceauşu, Dan Ştefănescu, & Dan Tufiş: Acquis Communautaire sentence alignment using support vector machines.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.2134-2137 [PDF, 381KB]

(2006) Brooke Cowan, Ivona Kučerová, & Michael Collins: A discriminative model for tree-to-tree translation.  EMNLP-2006: Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing, Sydney, Australia, July 2006; pp. 232-241. [PDF, 338KB]

(2006) Hal Daumé III & Daniel Marcu: Induction of word and phrase alignments for automatic document summarization. Computational Linguistics 31 (4), pp. 505-530. [PDF, 507KB]

(2006) John DeNero, Dan Gillick, James Zhang, & Dan Klein: Why generative phrase models underperform surface heuristics.  HLT-NAACL 2006: Proceedings of the Workshop on Statistical Machine Translation, New York, NY, USA, June 2006; pp. 31-38 [PDF, 393KB]

(2006) Yonggang Deng & William Byrne: MTTK: an alignment toolkit for statistical machine translation.  HLT-NAACL 2006: Proceedings of the Human Language Technology Conference of the North American Chapter of the ACL, New York, NY, USA, June 2006; pp. 265-268 [PDF, 135KB]

(2006) Mark Hopkins & Jonas Kuhn: A framework for incorporating alignment information in parsing. EACL-2006: 11th Conference of the European Chapter of the Association for Computational Linguistics, Cross-Language Knowledge Induction Workshop, Trento, Italy, April 3, 2006; pp.9-16 [PDF, 311 KB]

(2006) Radu Ion, Alexandru Ceauşu, & Dan Tufiş: Dependency-based phrase alignment.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.1290-1293 [PDF, 430KB]

(2006) Ding Liu & Daniel Gildea: Stochastic iterative alignment for machine translation evaluation.  Coling-ACL 2006: Proceedings of the Coling/ACL 2006 Main Conference Poster Sessions, Sydney, July 2006; pp.539-546. [PDF, 201KB]

(2006) Xiaoyi Ma: Champollion: a robust parallel text sentence aligner.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.489-492 [PDF, 313KB]

(2006) Arne Mauser, Evgeny Matusov, & Hermann Ney: Training a statistical machine translation system without GIZA++.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.714-720 [PDF, 325KB]

(2006) Karolina Owczarzak, Declan Groves, Josef Van Genabith, & Andy Way: Contextual bitext-derived paraphrases in automatic MT evaluation. HLT-NAACL 2006: Proceedings of the Workshop on Statistical Machine Translation, New York, NY, USA, June 2006; pp. 86-93 [PDF, 222KB]

(2006) Gábor Pohl: English-Hungarian NP alignment in MetaMorpho TM. EAMT-2006: 11th Annual Conference of the European Association for Machine Translation, June 19-20, 2006, Oslo, Norway. Proceedings; pp.69-74 [PDF, 81KB]

(2006) J.A.Sánchez & J.M.Benedí: Stochastic inversion transduction grammars for obtaining word phrases for phrase-based statistical machine translation. HLT-NAACL 2006: Proceedings of the Workshop on Statistical Machine Translation, New York, NY, USA, June 2006; pp. 130-133 [PDF, 91KB]

(2006) Bettina Schrader: ATLAS – a new text alignment architecture. Coling-ACL 2006: Proceedings of the Coling/ACL 2006 Main Conference Poster Sessions, Sydney, July 2006; pp.715-722. [PDF, 139KB]

(2006) Nasredine Semmar & Christian Fluhr: Using cross-language information retrieval for sentence alignment.  The Challenge of Arabic for NLP/MT. International conference at the British Computer Society, London, 23 October 2006; pp.95-104. [PDF, 254KB]

(2006) David A. Smith & Jason Eisner: Quasi-synchronous grammars: alignment by soft projection of syntactic dependencies.  HLT-NAACL 2006: Proceedings of the Workshop on Statistical Machine Translation, New York, NY, USA, June 2006; pp. 23-30 [PDF, 207KB]

(2006) Dan Ştefănescu & Dan Tufiş: Aligning multilingual thesauri. LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.475-478 [PDF, 461KB]

(2006) David Vilar, Maja Popovic, & Hermann Ney: AER: do we need to “improve” our alignments? International Workshop on Spoken Language Translation: Evaluation Campaign on Spoken Language Translation [IWSLT 2006], November 27-28, 2006, Kyoto, Japan; pp. 205-212 [PDF, 108KB]

(2006) Benjamin Wellington, Sonja Waxmonsky, & I.Dan Melamed: Empirical lower bounds on the complexity of translational equivalence. Coling-ACL 2006: Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, Sydney, 17-21 July 2006; pp.977-984. [PDF, 121KB]

(2006) Jia Xu, Richard Zens, & Hermann Ney: Partitioning parallel documents using binary segmentation. HLT-NAACL 2006: Proceedings of the Workshop on Statistical Machine Translation, New York, NY, USA, June 2006; pp. 78-85 [PDF, 433KB]

(2006) Hao Zhang & Daniel Gildea: Efficient search for inversion transduction grammar.  EMNLP-2006: Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing, Sydney, Australia, July 2006; pp. 224-231. [PDF, 172KB]

(2005) proceedings of ACL-2005: Workshop on Building and Using Parallel Texts – Data-driven machine translation and beyond, University of Michigan, Ann Arbor, 29-30 June 2005

(2005) Yonggang Deng & William Byrne: HMM word and phrase alignment for statistical machine translation.  HLT-EMNLP-2005: Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing, Vancouver, October 2005; pp. 169-176. [PDF, 189KB]

(2005) Emmanuel Giguet: Multi-grained alignment of parallel texts with endogenous resources. International workshop: Modern approaches in translation technologies, Borovets, Bulgaria, 24 September 2005; pp.12-17 [PDF, 194KB]

(2005) Philipp Koehn: Europarl: a parallel corpus for statistical machine translation. MT Summit X, Phuket, Thailand, September 13-15, 2005, Conference Proceedings: the tenth Machine Translation Summit; pp.79-86. [PDF, 123KB]

(2005) Takeshi Kutsumi, Takehiko Yoshimi, Katsunori Kotani, Ichiko Sata, & Hitoshi Isahara: Selection of entries for a bilingual dictionary from aligned translation equivalents using support vector machines. MT Summit X, Phuket, Thailand, September 13-15, 2005, Conference Proceedings: the tenth Machine Translation Summit; pp.11-16. [PDF, 253KB]

(2005) Stephan Vogel: PESA: phrase pair extraction as sentence splitting. MT Summit X, Phuket, Thailand, September 13-15, 2005, Conference Proceedings: the tenth Machine Translation Summit; pp.251-258. [PDF, 192KB]

(2005) Yujie Zhang, Kiyotaka Uchimoto, Qing Ma, & Hitoshi Isahara: Building an annotated Japanese-Chinese parallel corpus – a part of NICT multilingual corpora. MT Summit X, Phuket, Thailand, September 13-15, 2005, Conference Proceedings: the tenth Machine Translation Summit; pp.71-78. [PDF, 1139KB]

(2005) Sanjika Hewavitharana, Stephan Vogel, & Alex Waibel: Augmenting a statistical translation system with a translation memory. 10th EAMT conference "Practical applications of machine translation", 30-31 May 2005, Budapest; pp. 126-132. [PDF, 74KB]

(2005) Jae Dong Kim, Ralf D. Brown, Peter J. Jansen, & Jaime G. Carbonell: Symmetric probabilistic alignment for example-based translation. 10th EAMT conference "Practical applications of machine translation", 30-31 May 2005, Budapest; pp. 153-159. [PDF, 63KB]

(2005) Jia Xu, Richard Zens, & Hermann Ney: Sentence segmentation using IBM word alignment model 1. 10th EAMT conference "Practical applications of machine translation", 30-31 May 2005, Budapest; pp. 280-287. [PDF, 145KB]

(2005) Hao Zhang & Daniel Gildea: Stochastic lexicalized inversion transduction grammar for alignment. ACL-2005: 43rd Annual meeting of the Association for Computational Linguistics, University of Michigan, Ann Arbor, 25-30 June 2005; pp. 475-482. [PDF, 102KB]

(2005) Ying Zhang & Stephan Vogel: An efficient phrase-to-phrase alignment model for arbitrarily long phrase and large corpora. 10th EAMT conference "Practical applications of machine translation", 30-31 May 2005, Budapest; pp. 294-301. [PDF, 187KB]

Analogies and analogical modelling [see also Example-based methods]

(2007) Etienne Denoual: Analogical translation of unknown words in a statistical machine translation framework. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.135-141 [PDF, 94KB]

(2007) Philippe Langlais & Alexandre Patry: Translating unknown words using analogical learning. EMNLP-CoNLL-2007: Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, June 28-30, 2007, Prague, Czech Republic; pp. 877-886. [PDF, 155KB]

(2005) Yves Lepage & Etienne Denoual: ALEPH: an EBMT system based on the preservation of proportional analogies between sentences across languages. International Workshop on Spoken Language Translation: Evaluation Campaign on Spoken Language Translation [IWSLT 2005], 24-25 October, 2005, Pittsburgh, PA, USA; 8pp. [PDF, 388KB]

Analysis see Parsing; Semantic analysis; Syntactic analysis

Anaphora resolution

(2005) Shigeko Nariyama, Eric Nichols, Francis Bond, Takaaki Tanaka, & Hiromi Nakaiwa: Extracting representative arguments from dictionaries for resolving zero pronouns. MT Summit X, Phuket, Thailand, September 13-15, 2005, Conference Proceedings: the tenth Machine Translation Summit; pp.3-10. [PDF, 665KB]

Annotation

(2007) Matthias Buch-Kromann: Computing translation units and quantifying parallelism in parallel dependency treebanks. ACL 2007: proceedings of the Linguistic Annotation Workshop, Prague, Czech Republic, 28-29 June 2007; pp.69-76 [PDF, 361KB]

(2007) Philipp Koehn & Hieu Hoang: Factored translation models. EMNLP-CoNLL-2007: Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, June 28-30, 2007, Prague, Czech Republic; pp. 868-876. [PDF, 225KB]

(2007) Lluís Màrquez, Luis Villarejo, M.A.Martí, & Mariona Taulé: SemEval-2007 task 09: multilevel semantic annotation of Catalan and Spanish. ACL 2007: proceedings of the 4th International  Workshop on Semantic Evaluations (SemEval-2007), Prague, Czech Republic, 23-24 June 2007; pp.42-47 [PDF, 89KB]

(2007) Roser Morante & Bertjan Busser: ILK2: semantic role labelling for Catalan and Spanish using TiMBL. ACL 2007: proceedings of the 4th International  Workshop on Semantic Evaluations (SemEval-2007), Prague, Czech Republic, 23-24 June 2007; pp.183-186 [PDF, 120KB]

(2007) Martin Volk, Joakim Lundborg, & Maël Mettler: A search tool for parallel treebanks. ACL 2007: proceedings of the Linguistic Annotation Workshop, Prague, Czech Republic, 28-29 June 2007; pp.85-92 [PDF, 270KB]

(2006) Juri Apresjan, Igor Boguslavsky, Boris Iomdin, Leonid Iomdin, Andrei Sannikov, & Victor Sizov: A syntactically and semantically tagged corpus of Russian: state of the art and prospects.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.1378-1381 [PDF, 596KB]

(2006) Dan Flickinger: Identifying complex phenomena in a corpus via a treebank lens. EAMT-2006: 11th Annual Conference of the European Association for Machine Translation, June 19-20, 2006, Oslo, Norway. Proceedings; pp.125-129 [PDF, 89KB]

(2006) Eduard Hovy, Mitchell Marcus, Martha Palmer, Lance Ramshaw & Ralph Weischedel: OntoNotes: the 90% solution.  HLT-NAACL 2006: Proceedings of the Human Language Technology Conference of the North American Chapter of the ACL, New York, NY, USA, June 2006; pp. 57-60 [PDF, 44KB]

(2006) Svetla Koeva, Svetlozara Lesseva, & Maria Todorova: Bulgarian sense tagged corpus.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. 5th SALTMIL Workshop on Minority Languages: “Strategies for developing machine translation for minority languages”, Genoa, Italy, 23 May 2006; pp.79-86. [PDF, 494KB]

(2006) Ivana Kruijff-Korbayová, Klára Chvátalová, & Oana Postolache: Annotation guidelines for Czech-English word alignment.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.1256-1261 [PDF, 608KB]

(2006) Mohamed Maamouri, Ann Bies, & Seth Kulick: Diacritization: a challenge to Arabic treebank annotation and parsing.  The Challenge of Arabic for NLP/MT. International conference at the British Computer Society, London, 23 October 2006; pp.35-47 [PDF, 223KB]

(2006) Márton Miháltz & Gábor Pohl: Exploiting parallel corpora for supervised word sense disambiguation in English-Hungarian machine translation.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.1294-1297 [PDF, 339KB]

(2006) Owen Rambow, Bonnie Dorr, David Farwell, Rebecca Green, Nizar Habash, Stephen Helmreich, Eduard Hovy, Lori Levin, Keith J.Miller, Teruko Mitamura, Florence Reeder, & Advaith Siddharthan: Parallel syntactic annotation of multiple languages.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.559-564 [PDF, 286KB]

(2006) E. Saquete, P.Martínez-Barco, R.Muñoz, M.Negri, M.Speranza, & R.Sprugnoli: Multilingual extension of a temporal expression normalizer using annotated corpora. EACL-2006: 11th Conference of the European Chapter of the Association for Computational Linguistics, Cross-Language Knowledge Induction Workshop, Trento, Italy, April 3, 2006; pp.1-8 [PDF, 261 KB]

(2006) Serge Sharoff, Bogdan Babych, Paul Rayson, Olga Mudraya, & Scott Piao: ASSIST: automated semantic assistance for translators. EACL-2006: 11th Conference of the European Chapter of the Association for Computational Linguistics, Posters and demonstrations, Trento, Italy, April 5-6, 2006; pp.139-142 [PDF, 69KB]

(2006) Yuk Wah Wong & Raymond J. Mooney: Learning for semantic parsing with statistical machine translation.  HLT-NAACL 2006: Proceedings of the Human Language Technology Conference of the North American Chapter of the ACL, New York, NY, USA, June 2006; pp. 439-446 [PDF, 205KB]

(2005) Jesús Giménez & Lluís Màrquez: Combining linguistic data views for phrase-based SMT. ACL-2005: Workshop on Building and Using Parallel Texts – Data-driven machine translation and beyond, University of Michigan, Ann Arbor, 29-30 June 2005; pp. 145-148. [PDF, 73KB]

(2005) Carol Nichols & Rebecca Hwa: Word alignment and cross-lingual resource acquisition. ACL-2005: Interactive Poster and Demonstration Sessions, University of Michigan, Ann Arbor, June 2005; pp. 69-72. [PDF, 331KB]

(2005) Yujie Zhang, Kiyotaka Uchimoto, Qing Ma, & Hitoshi Isahara: Building an annotated Japanese-Chinese parallel corpus – a part of NICT multilingual corpora. MT Summit X, Phuket, Thailand, September 13-15, 2005, Conference Proceedings: the tenth Machine Translation Summit; pp.71-78. [PDF, 1139KB]

Applications of MT see index of applications

Aspect

(2007) Yang Ye, Karl-Michael Schneider, & Steven Abney: Aspect marker generation in English-to-Chinese machine translation. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.521-527 [PDF, 129KB]

Bilingual corpora [see also Example-based methods, Multilingual corpora]

(2007) Julia Aymerich & Hermes Camelo: Automatic extraction of entries for a machine translation dictionary using bitexts. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.21-27 [PDF, 88KB]

(2007) Matthias Buch-Kromann: Breaking the barrier of context-freeness: towards a linguistically adequate probabilistic dependency model of parallel texts. TMI-2007: Proceedings of the 11th International Conference on Theoretical and Methodological Issues in Machine Translation, Skövde [Sweden], 7-9 September 2007; pp.31-40 [PDF, 432KB]; presentation [PDF, 1285KB]

(2007) Matthias Buch-Kromann: Computing translation units and quantifying parallelism in parallel dependency treebanks. ACL 2007: proceedings of the Linguistic Annotation Workshop, Prague, Czech Republic, 28-29 June 2007; pp.69-76 [PDF, 361KB]

(2007) Pablo Gamallo Otero: Learning bilingual lexicons from comparable English and Spanish corpora. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.191-197 [PDF, 509KB]

(2007) Ulrich Germann: Two tools for creating and visualizing sub-sentential alignments of parallel text. ACL 2007: proceedings of the Linguistic Annotation Workshop, Prague, Czech Republic, 28-29 June 2007; pp.121-124 [PDF, 247KB]

(2007) Xiaoguang Hu, Haifeng Wang, & Hua Wu: Using RBMT systems to produce bilingual corpus for SMT. EMNLP-CoNLL-2007: Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, June 28-30, 2007, Prague, Czech Republic; pp. 287-295. [PDF, 131KB]

(2007) Masaki Itagaki, Takako Aikawa, & Xiaodong He: Automatic validation of terminology translation consistency with statistical method.  MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.269-274 [PDF, 416KB]

(2007) J. Howard Johnson, Joel Martin, George Foster & Roland Kuhn: Improving translation quality by discarding most of the phrasetable. EMNLP-CoNLL-2007: Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, June 28-30, 2007, Prague, Czech Republic; pp. 967-975. [PDF, 182KB]

(2007) Heiki-Jaan Kaalep & Kaarel Veskis: Comparing parallel corpora and evaluating their quality. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.275-279 [PDF, 164KB]

(2007) Yajuan Lü, Jin Huang & Qun Liu: Improving statistical machine translation performance by training data selection and optimization. EMNLP-CoNLL-2007: Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, June 28-30, 2007, Prague, Czech Republic; pp. 343-350. [PDF, 235KB]

(2007) Lieve Macken, Julia Trushkina, & Lidia Rura: Dutch parallel corpus: MT corpus and translator’s aid. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.313-320 [PDF, 132KB]

(2007) E.Morin, B.Daille, K.Takeuchi, & K.Kageura: Bilingual terminology mining – using brain, not brawn comparable corpora.  ACL 2007: proceedings of the 45th Annual Meeting of the Association for Computational Linguistics, Prague, Czech Republic, June 2007; pp. 664-671 [PDF, 143KB]

(2007) Jin’ichi Murakami, Masato Tokuhisa, & Satoru Ikehara: Statistical machine translation using large J/E parallel corpus and long phrase tables.  IWSLT 2007: International Workshop on Spoken Language Translation, 15-16 October 2007, Trento, Italy. 6pp. [PDF, 69KB]; presentation [PDF, 304KB]

(2007) Hwe Tou Ng & Yee Seng Chan: SemEval-2007 task 11: English lexical sample task via English-Chinese parallel text. ACL 2007: proceedings of the 4th International  Workshop on Semantic Evaluations (SemEval-2007), Prague, Czech Republic, 23-24 June 2007; pp.54-58 [PDF, 84KB]

(2007) Chris Quirk, Raghavendra Udupa U., & Arul Menezes: Generative models of noisy translations with applications to parallel fragment extraction. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.377-384 [PDF, 249KB]

(2007) Masao Utiyama & Hitoshi Isahara: A Japanese-English patent parallel corpus. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.475-482 [PDF, 113KB]

(2007) Antal van den Bosch, Nicolas Stroppa, & Andy Way: A memory-based classification approach to marker-based EBMT.  METIS-II Workshop: New Approaches to Machine Translation, Centre for Computational Linguistics, Katholieke Universiteit Leuven, Belgium, 11 January 2007; 10pp. [PDF, 251KB]

(2007) Martin Volk, Joakim Lundborg, & Maël Mettler: A search tool for parallel treebanks. ACL 2007: proceedings of the Linguistic Annotation Workshop, Prague, Czech Republic, 28-29 June 2007; pp.85-92 [PDF, 270KB]

(2006) Iñaki Alegria, Nerea Ezeiza, & Izaskun Fernandez: Named entities translation based on comparable corpora. EACL-2006: 11th Conference of the European Chapter of the Association for Computational Linguistics, Workshop on Multi-word expressions in a Multilingual Context, Trento, Italy, April 3, 2006; pp.1-8 [PDF, 455KB]

(2006) Saba Amsalu: Data-driven Amharic-English bilingual lexicon acquisition. LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.281-286 [PDF, 366KB]

(2006) Marco Baroni, Adam Kilgarriff, Jan Pomikálek, & Pavel Rychlý: WebBootCaT: instant domain-specific corpora to support human translators. EAMT-2006: 11th Annual Conference of the European Association for Machine Translation, June 19-20, 2006, Oslo, Norway. Proceedings; pp.247-252 [PDF, 191KB]

(2006) A.Casillas, A. Díaz de Illarraza, J.Igartua, R. Martínez, & K. Sarasola: Compilation and structuring of a Spanish-Basque parallel corpus.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. 5th SALTMIL Workshop on Minority Languages: “Strategies for developing machine translation for minority languages”, Genoa, Italy, 23 May 2006; pp.55-58. [PDF, 172KB]

(2006) Lea Cyrus: Building a resource for studying translation shifts.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.697-702 [PDF, 358KB]

(2006) Andreas Eisele: Parallel corpora and phrase-based statistical machine translation for new language pairs via multiple intermediaries.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.845-848 [PDF, 329KB]

(2006) Tomaž Erjavec: The English-Slovene ACQUIS corpus. LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.2138-2141 [PDF, 365KB]

(2006) Dan Flickinger: Identifying complex phenomena in a corpus via a treebank lens. EAMT-2006: 11th Annual Conference of the European Association for Machine Translation, June 19-20, 2006, Oslo, Norway. Proceedings; pp.125-129 [PDF, 89KB]

(2006) Beáta Bandmann Megyesi, Anna Sågvall Hein, & Éva Csató Johanson: Building a Swedish-Turkish parallel corpus.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.2130-2133 [PDF, 692KB]

(2006) Márton Miháltz & Gábor Pohl: Exploiting parallel corpora for supervised word sense disambiguation in English-Hungarian machine translation.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.1294-1297 [PDF, 339KB]

(2006) Dragos Stefan Munteanu & Daniel Marcu: Extracting parallel sub-sentential fragments from non-parallel corpora. Coling-ACL 2006: Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, Sydney, 17-21 July 2006; pp.81-88. [PDF, 1598KB]

(2006) Dragos Stefan Munteanu & Daniel Marcu: Improving machine translation performance by exploiting non-parallel corpora. Computational Linguistics 31 (4), pp. 477-504 [PDF, 1060KB]

(2006) G. Craig Murray, Bonnie J. Dorr, Jimmy Lin, Jan Hajič, & Pavel Pecina: Leveraging recurrent phrase structure in large-scale ontology translation. EAMT-2006: 11th Annual Conference of the European Association for Machine Translation, June 19-20, 2006, Oslo, Norway. Proceedings; pp.141-150 [PDF, 686KB]

(2006) Sylwia Ozdowska: Projecting POS tags and syntactic dependencies from English and French to Polish in aligned corpora. EACL-2006: 11th Conference of the European Chapter of the Association for Computational Linguistics, Cross-Language Knowledge Induction Workshop, Trento, Italy, April 3, 2006; pp.53-60 [PDF, 366 KB]

(2006) Michael Paul & Eiichiro Sumita: Exploiting variant corpora for machine translation.  HLT-NAACL 2006: Proceedings of the Human Language Technology Conference of the North American Chapter of the ACL, New York, NY, USA, June 2006; pp. 113-116 [PDF, 86KB]

(2006) Alicia Pérez, Inés Torres, Francisco Casacuberta, & Víctor Guijarrubia: A Spanish-Basque weather forecast corpus for probabilistic speech translation.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. 5th SALTMIL Workshop on Minority Languages: “Strategies for developing machine translation for minority languages”, Genoa, Italy, 23 May 2006; pp.99-102. [PDF, 227KB]

(2006) Serge Sharoff, Bogdan Babych, & Anthony Hartley: Using comparable corpora to solve problems difficult for human translators. Coling-ACL 2006: Proceedings of the Coling/ACL 2006 Main Conference Poster Sessions, Sydney, July 2006; pp.739-746. [PDF, 250KB]

(2006) Serge Sharoff, Bogdan Babych, & Anthony Hartley: Using collocations from comparable corpora to find translation equivalents.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.465-470 [PDF, 1104KB]

(2006) Serge Sharoff: Translation as problem-solving: uses of comparable corpora.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Third International Workshop on Language Resources for Translation Work, Research & Training (LR4Trans-III), Genoa, Italy, 28 May 2006; pp.23-28. [PDF, 914KB]

(2006) Tao Tao, Su-Youn Yoon, Andrew Fister, Richard Sproat, & ChengXiang Zhai: Unsupervised named entity transliteration using temporal and phonetic correlation.  EMNLP-2006: Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing, Sydney, Australia, July 2006; pp. 250-257. [PDF, 139KB]

(2006) Gregor Thurmair: Using corpus information to improve MT quality. LREC-2006: Fifth International Conference on Language Resources and Evaluation. Third International Workshop on Language Resources for Translation Work, Research & Training (LR4Trans-III), Genoa, Italy, 28 May 2006; pp.45-48. [PDF, 371KB]

(2006) Hitomi Tohyama & Shigeki Matsubara: Collection of simultaneous interpreting patterns by using bilingual spoken monologue corpus. LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.2564-2569 [PDF, 552KB]

(2006) Haifeng Wang, Hua Wu, & Zhanyi Liu: Word alignment for languages with scarce resources using bilingual corpora of other language pairs. Coling-ACL 2006: Proceedings of the Coling/ACL 2006 Main Conference Poster Sessions, Sydney, July 2006; pp.874-881. [PDF, 155KB]

(2006) Xinglong Wang & David Martinez: Word sense disambiguation using automatically translated sense examples. EACL-2006: 11th Conference of the European Chapter of the Association for Computational Linguistics, Cross-Language Knowledge Induction Workshop, Trento, Italy, April 3, 2006; pp.45-52 [PDF, 272 KB]

(2006) Benjamin Wellington, Sonja Waxmonsky, & I.Dan Melamed: Empirical lower bounds on the complexity of translational equivalence. Coling-ACL 2006: Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, Sydney, 17-21 July 2006; pp.977-984. [PDF, 121KB]

(2006) Jia Xu, Richard Zens, & Hermann Ney: Partitioning parallel documents using binary segmentation. HLT-NAACL 2006: Proceedings of the Workshop on Statistical Machine Translation, New York, NY, USA, June 2006; pp. 78-85 [PDF, 433KB]

(2005) Proceedings of ACL-2005: Workshop on Building and Using Parallel Texts – Data-driven machine translation and beyond, University of Michigan, Ann Arbor, 29-30 June 2005

(2005) Alison Alvarez, Lori Levin, Robert Frederking, Erik Peterson & Jeff Good: Semi-automated elicitation corpus generation. MT Summit X, Phuket, Thailand, September 13-15, 2005, Conference Proceedings: the tenth Machine Translation Summit; pp.388-395. [PDF, 131KB]

(2005) Naoki Asanoma, Setsuo Yamada, Osamu Furuse, & Masahiro Oku: Building a conversation corpus by text derivation from "germ dialogs". 10th EAMT conference "Practical applications of machine translation", 30-31 May 2005, Budapest; pp. 27-32. [PDF, 82KB]

(2005) Colin Bannard & Chris Callison-Burch: Paraphrasing with bilingual parallel corpora. ACL-2005: 43rd Annual meeting of the Association for Computational Linguistics, University of Michigan, Ann Arbor, 25-30 June 2005; pp. 597-604. [PDF, 196KB]

(2005) Martin Čmejrek, Jan Cuřín, Jan Hajič, & Jiří Havelka: Prague Czech-English dependency treebank: resource for structure-based MT. 10th EAMT conference "Practical applications of machine translation", 30-31 May 2005, Budapest; pp. 73-78. [PDF, 66KB]

(2005) Etienne Denoual: The influence of example-data homogeneity on EBMT quality. MT Summit X, Phuket, Thailand, September 16, 2005, Proceedings of Second Workshop on Example-Based Machine Translation; pp.35-42. [PDF, 404KB]

(2005) John Fry: Assembling a parallel corpus from RSS news feeds. MT Summit X, Phuket, Thailand, September 16, 2005, Proceedings of Second Workshop on Example-Based Machine Translation; pp.59-62. [PDF, 303KB]

(2005) Pablo Gamallo Otero: Extraction of translation equivalents from parallel corpora using sense-sensitive contexts. 10th EAMT conference "Practical applications of machine translation", 30-31 May 2005, Budapest; pp. 97-102. [PDF, 52KB]

(2005) Emmanuel Giguet: Multi-grained alignment of parallel texts with endogenous resources. International workshop: Modern approaches in translation technologies, Borovets, Bulgaria, 24 September 2005; pp.12-17 [PDF, 194KB]

(2005) Ebba Gustavii: Target language preposition selection - an experiment with transformation based learning and aligned bilingual data. 10th EAMT conference "Practical applications of machine translation", 30-31 May 2005, Budapest; pp. 112-118. [PDF, 45KB]

(2005) Fei Huang, Ying Zhang, & Stephan Vogel: Mining key phrase translations from web corpora. HLT-EMNLP-2005: Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing, Vancouver, October 2005; pp. 483-490. [PDF, 310KB]

(2005) Grzegorz Kondrak: Cognates and word alignment in bitexts. MT Summit X, Phuket, Thailand, September 13-15, 2005, Conference Proceedings: the tenth Machine Translation Summit; pp.305-312. [PDF, 179KB]

(2005) Jonas Kuhn: Parsing word-aligned parallel corpora in a grammar induction context. ACL-2005: Workshop on Building and Using Parallel Texts – Data-driven machine translation and beyond, University of Michigan, Ann Arbor, 29-30 June 2005; pp. 17-25. [PDF, 150KB]

(2005) Yves Lepage & Etienne Denoual: The ‘purest’ EBMT system ever built: no variables, no templates, no training, examples, just examples, only examples. MT Summit X, Phuket, Thailand, September 16, 2005, Proceedings of Second Workshop on Example-Based Machine Translation; pp.81-90. [PDF, 400KB]

(2005) Karin Müller: Revealing phonological similarities between related languages from automatically generated parallel corpora.  ACL-2005: Workshop on Building and Using Parallel Texts – Data-driven machine translation and beyond, University of Michigan, Ann Arbor, 29-30 June 2005; pp. 33-40. [PDF, 114KB]

(2005) Sylwia Ozdowska: Using bilingual dependencies to align words in English/French parallel corpora.  ACL-2005: Student Research Workshop, University of Michigan, Ann Arbor, June 2005; pp. 127-132. [PDF, 89KB]

(2005) Isamu Okada, Shinichiro Miyazawa, Kazunari Ishida, Nobuhiko Shimizu, & Toshizumi Ohta: Quality analysis of patent parallel corpus by the scale. MT Summit X, Phuket, Thailand, September 16, 2005, Proceedings of Workshop on Patent Translation; pp.29-34. [PDF, 82KB]

(2005) Sitthaa Phaholphinyo, Teerapong Modhiran, Nattapol Kritsuthikul, & Thepchai Supnithi: A practical of memory-based approach for improving accuracy of MT. MT Summit X, Phuket, Thailand, September 13-15, 2005, Conference Proceedings: the tenth Machine Translation Summit; pp.41-46. [PDF, 297KB]

(2005) Yujie Zhang, Kiyotaka Uchimoto, Qing Ma, & Hitoshi Isahara: Building an annotated Japanese-Chinese parallel corpus – a part of NICT multilingual corpora. MT Summit X, Phuket, Thailand, September 13-15, 2005, Conference Proceedings: the tenth Machine Translation Summit; pp.71-78. [PDF, 1139KB]

(2005) Yujie Zhang, Qun Liu, Qing Ma, & Hitoshi Isahara: A multi-aligner for Japanese-Chinese parallel corpora. MT Summit X, Phuket, Thailand, September 13-15, 2005, Conference Proceedings: the tenth Machine Translation Summit; pp.133-140. [PDF, 420KB]

Bi-text see Bilingual corpora

Book reviews

(2005) Derek Lewis: Books received. In: Machine Translation Review, issue 14: December 2005; pp.6-9.

Bootstrapping see index of applications

Bridge language see Intermediary (natural) language

Capitalization see Written forms

Case grammar and case frames

(2007) Pascale Fung, Zhaojun Wu, Yongsheng Yang, & Dekai Wu: Learning bilingual semantic frames: shallow semantic parsing vs. semantic role projection. TMI-2007: Proceedings of the 11th International Conference on Theoretical and Methodological Issues in Machine Translation, Skövde [Sweden], 7-9 September 2007; pp.75-84 [PDF, 477KB]

(2007) Kristina Toutanova & Hisami Suzuki: Generating case markers in machine translation. NAACL-HLT-2007 Human Language Technology: the conference of the North American Chapter of the Association for Computational Linguistics, 22-27 April 2007, Rochester, NY; pp.49-56 [PDF, 168KB]

(2005) Shigeko Nariyama, Eric Nichols, Francis Bond, Takaaki Tanaka, & Hiromi Nakaiwa: Extracting representative arguments from dictionaries for resolving zero pronouns. MT Summit X, Phuket, Thailand, September 13-15, 2005, Conference Proceedings: the tenth Machine Translation Summit; pp.3-10. [PDF, 665KB]

Categorial grammar

(2007) Alexandra Birch, Miles Osborne, & Philipp Koehn: CCG supertags in factored statistical machine translation. ACL 2007: proceedings of the Second Workshop on Statistical Machine Translation, June 23, 2007, Prague, Czech Republic; pp. 9-16 [PDF, 175KB]

(2007) Hany Hassan, Khalil Sima’an, & Andy Way: Supertagged phrase-based statistical machine translation.  ACL 2007: proceedings of the 45th Annual Meeting of the Association for Computational Linguistics, Prague, Czech Republic, June 2007; pp. 288-295 [PDF, 175KB]

(2007) Michael White, Rajakrishnan Rajkumar, & Scott Martin: Towards broad coverage surface realization with CCG. MT Summit XI Workshop: Using corpora for natural language generation: language generation and machine translation (UCNLG+MT), 11 September 2007, Copenhagen, Denmark; pp.22-30 [PDF, 2020KB]

Chart parsing

(2006) Andreas Zollmann & Ashish Venugopal: Syntax augmented machine translation via chart parsing.  HLT-NAACL 2006: Proceedings of the Workshop on Statistical Machine Translation, New York, NY, USA, June 2006; pp. 138-141 [PDF, 99KB]

Chunks and chunking see Segmentation

Closely related languages

(2007) Bogdan Babych, Anthony Hartley, & Serge Sharoff: Translating from under-resourced languages: comparing direct transfer against pivot translation. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.29-35 [PDF, 197KB]

(2007) A.Cüneyd Tantuğ, Eşref Adali, & Kemal Oflazer: A MT system from Turkmen to Turkish employing finite state and statistical methods. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.459-465 [PDF, 399KB]

(2007) A.Cüneyd Tantuğ, Eşref Adali, & Kemal Oflazer: Machine translation between Turkic languages. ACL 2007: proceedings of demo and poster sessions, Prague, Czech Republic, June 2007; pp. 189-192 [PDF, 135KB]

(2007) David Vilar, Jan-T. Peter, & Hermann Ney: Can we translate letters?  ACL 2007: proceedings of the Second Workshop on Statistical Machine Translation, June 23, 2007, Prague, Czech Republic; pp. 33-39 [PDF, 124KB]

(2006) Carme Armentano i Oller & Mikel L. Forcada: Open-source machine translation between small languages.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. 5th SALTMIL Workshop on Minority Languages: “Strategies for developing machine translation for minority languages”, Genoa, Italy, 23 May 2006; pp.51-54. [PDF, 62KB]

(2006) Boštjan Dvořák, Petr Homola, & Vladislav Kuboň: Exploiting similarity in the MT into a minority language. LREC-2006: Fifth International Conference on Language Resources and Evaluation. 5th SALTMIL Workshop on Minority Languages: “Strategies for developing machine translation for minority languages”, Genoa, Italy, 23 May 2006; pp.59-64. [PDF, 379KB]

(2006) Brock Pytlik & David Yarowsky: Machine translation for languages lacking bitext via multilingual gloss transduction. AMTA 2006: Proceedings of the 7th Conference of the Association for Machine Translation in the Americas, “Visions for the Future of Machine Translation”, August 8-12, 2006, Cambridge, Massachusetts, USA; pp.156-165 [PDF, 267KB]

(2006) Kevin P. Scannell: Machine translation for closely related language pairs.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. 5th SALTMIL Workshop on Minority Languages: “Strategies for developing machine translation for minority languages”, Genoa, Italy, 23 May 2006; pp.103-107. [PDF, 139KB]

(2005) Antonio M.Corbi-Bellot, Mikel L. Forcada, Sergio Ortíz-Rojas, Juan Antonio Pérez-Ortiz, Gema Ramírez-Sánchez, Felipe Sánchez-Martínez, Iñaki Alegria, Aingeru Mayor, & Kepa Sarasola: An open-source shallow-transfer machine translation engine for the Romance languages of Spain. 10th EAMT conference "Practical applications of machine translation", 30-31 May 2005, Budapest; pp. 79-86. [PDF, 130KB]

(2005) Karin Müller: Revealing phonological similarities between related languages from automatically generated parallel corpora.  ACL-2005: Workshop on Building and Using Parallel Texts – Data-driven machine translation and beyond, University of Michigan, Ann Arbor, 29-30 June 2005; pp. 33-40. [PDF, 114KB]

Cognates

(2007) Preslav Nakov & Marti Hearst: UCB system description for the WMT 2007 shared task<