Machine Translation Archive

Subject Index

to methods, techniques, issues and topics

Publications 2000 to 2004

 


 

Adjectives

(2001) Taiichi Hashimoto, Kosuke Nishidate, Kiyoaki Shirai, Takenobu Tokunaga & Hozumi Tanaka: Decision lists for determining adjective dependency in Japanese. MT Summit VIII: Machine Translation in the Information Age, Proceedings, Santiago de Compostela, Spain, 18-22 September 2001; pp. 151-156. [PDF, 133KB]

Alignment [see also Statistical analysis, Word alignment]

(2004) Oliver Bender, Richard Zens, Evgeny Matusov & Hermann Ney: Alignment templates: the RWTH SMT system. International Workshop on Spoken Language Translation: Evaluation Campaign on Spoken Language Translation [IWSLT 2004], September 30 – October 1, 2004, Kyoto, Japan; pp. 79-84 [PDF, 193KB]

(2004) Percy Cheung & Pascale Fung: Sentence alignment in parallel, comparable, and quasi-comparable corpora.  LREC-2004. Workshop, 25th May 2004: The amazing utility of parallel and comparable corpora; pp. 30-33. [PDF, 323KB]

(2004) Hal Daumé III & Daniel Marcu: A phrase-based HMM approach to document/abstract alignment.  EMNLP-2004: Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, 25-26 July 2004, Barcelona, Spain; 8pp. [PDF, 111KB]

(2004) Pascale Fung & Percy Cheung: Multi-level bootstrapping for extracting parallel sentences from a quasi-comparable corpus.  Coling 2004: 20th International Conference on Computational Linguistics, 23-27 August 2004, University of Geneva, Switzerland, Proceedings; 7pp. [PDF, 245KB]

(2004) Pascale Fung & Percy Cheung: Mining very-non-parallel corpora: parallel sentence and lexicon extraction via bootstrapping and EM. EMNLP-2004: Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, 25-26 July 2004, Barcelona, Spain; 8pp. [PDF, 276KB]

(2004) Daniel Gildea:  Dependencies vs constituents for tree-based alignment. EMNLP-2004: Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, 25-26 July 2004, Barcelona, Spain; 8pp. [PDF, 97KB]

(2004) Adrià de Gispert, José B. Mariño & Josep M. Crego: Phrase-based alignment combining corpus cooccurrences and linguistic knowledge. International Workshop on Spoken Language Translation: Evaluation Campaign on Spoken Language Translation [IWSLT 2004], September 30 – October 1, 2004, Kyoto, Japan; pp. 107-114 [PDF, 178KB]

(2004) Declan Groves, Mary Hearne, & Andy Way: Robust sub-sentential alignment of phrase-structure trees. Coling 2004: 20th International Conference on Computational Linguistics, 23-27 August 2004, University of Geneva, Switzerland, Proceedings; 7pp. [PDF, 94KB]

(2004) Paul Kingsbury, Nianwen Xue, & Martha Palmer: Propbanking in parallel.  LREC-2004. Workshop, 25th May 2004: The amazing utility of parallel and comparable corpora; pp. 34-37. [PDF, 305KB]

(2004) Patrick Lambert & Núria Castell: Alignment of parallel corpora exploiting asymmetrically aligned phrases.  LREC-2004. Workshop, 25th May 2004: The amazing utility of parallel and comparable corpora; pp. 26-29. [PDF, 239KB]

(2004) Chris Pike & I.Dan Melamed: An automatic filter for non-parallel texts. ACL 2004: 42nd annual meeting of the Association for Computational Linguistics: Proceedings of the interactive poster and demonstration sessions, 21-26 July 2004, Barcelona, Spain; 4pp. [PDF, 128KB]

(2004) Jui-Feng Yeh, Chung-Hsien Wu, Ming-Jun Chen, & Liang-Chih Yu: Automated alignment and extraction of bilingual ontology for cross-language domain-specific applications. Coling 2004: 20th International Conference on Computational Linguistics, 23-27 August 2004, University of Geneva, Switzerland, Proceedings; 7pp. [PDF, 196KB]

(2004) Hao Zhang & Daniel Gildea: Syntax-based alignment: supervised or unsupervised? Coling 2004: 20th International Conference on Computational Linguistics, 23-27 August 2004, University of Geneva, Switzerland, Proceedings; 7pp. [PDF, 87KB]

(2003) Eiji Aramaki, Sadao Kurohashi, Hideki Kashioka, & Hideki Tanaka: Word selection for EBMT based on monolingual similarity and translation confidence. HLT-NAACL 2003 Workshop, "Building and using parallel texts: data driven machine translation and beyond", 31 May 2003, Edmonton, Canada. [PDF, 1565KB]

(2003) Regina Barzilay & Noemie Elhadad: Sentence alignment for monolingual comparable corpora. EMNLP-2003: proceedings of the 2003 conference on Empirical Methods in Natural Language Processing, a meeting of SIGDAT, a special interest group of the ACL, held in conjunction with ACL-03, 11-12 July 2003, Sapporo, Japan; 8pp. [PDF, 94KB]

(2003) Daniel Gildea: Loosely tree-based alignment for machine translation. ACL-2003: 41st Annual meeting of the Association for Computational Linguistics, July 7-12, 2003, Sapporo, Japan. [PDF, 109KB]

(2003) Hideki Kashioka, Takehiko Maruyama, & Hideki Tanaka: Building a parallel corpus for monologues with clause alignment. MT Summit IX, New Orleans, USA, 23-27 September 2003; pp.216-223. [PDF, 379KB]

(2003) Chun-Jen Lee & Jason S. Chang: Acquisition of English-Chinese transliterated word pairs from parallel-aligned texts using a statistical machine transliteration model. HLT-NAACL 2003 Workshop, "Building and using parallel texts: data driven machine translation and beyond", 31 May 2003, Edmonton, Canada. [PDF, 344KB]

(2003) Joel Martin, Howard Johnson, Benoit Farley, and Anna Maclachlan: Aligning and using an English-Inuktitut parallel corpus. HLT-NAACL 2003 Workshop, "Building and using parallel texts: data driven machine translation and beyond", 31 May 2003, Edmonton, Canada. [PDF, 30KB]

(2003) I.Dan Melamed: Multitext grammars and synchronous parsers. HLT-NAACL 2003: conference combining Human Language Technology conference series and the North American Chapter of the Association for Computational Linguistics conference series, May 27 – June 1, 2003, Edmonton, Canada; 8pp. [PDF, 1126KB]

(2003) Francisco Nevado, Francisco Casacuberta, & Enrique Vidal: Parallel corpora segmentation using anchor words. 7th EAMT Workshop, "Improving machine translation through other language technology tools", 13 April 2003, Budapest, Hungary; pp. 33-40 [PDF, 382KB]

(2003) Stephen Nightingale & Hideki Tanaka: Comparing the sentence alignment yield from two news corpora using a dictionary-based alignment system. HLT-NAACL 2003 Workshop, "Building and using parallel texts: data driven machine translation and beyond", 31 May 2003, Edmonton, Canada. [PDF, 142KB]

(2003) Franz Josef Och & Hermann Ney: A systematic comparison of various statistical alignment models. Computational Linguistics 29 (1), pp.19-51 [PDF, 296KB]

(2003) Bo Pang, Kevin Knight, & Daniel Marcu: Syntax-based alignment of multiple translations: extracting paraphrases and generating new sentences. HLT-NAACL 2003: conference combining Human Language Technology conference series and the North American Chapter of the Association for Computational Linguistics conference series, May 27 – June 1, 2003, Edmonton, Canada; 8pp. [PDF, 91KB]

(2003) Emmanuel Planas & Osamu Furuse: Formalizing translation memory. In: Michael Carl & Andy Way (eds.) Recent advances in example-based machine translation (Dordrecht: Kluwer Academic Publishers, 2003), pp. 157-188.

(2003) Lee Schwartz, Takako Aikawa, & Chris Quirk: Disambiguation of English PP attachment using multilingual aligned data. MT Summit IX, New Orleans, USA, 23-27 September 2003; pp.330-337. [PDF, 98KB]

(2003) Michel Simard: Translation spotting for translation memories. HLT-NAACL 2003 Workshop, "Building and using parallel texts: data driven machine translation and beyond", 31 May 2003, Edmonton, Canada. [PDF, 120KB]

(2003) Michel Simard & Philippe Langlais: Statistical translation alignment with compositionality constraints. HLT-NAACL 2003 Workshop, "Building and using parallel texts: data driven machine translation and beyond", 31 May 2003, Edmonton, Canada. [PDF, 65KB]

(2003) Ashish Venugopal, Stephan Vogel, & Alex Waibel: Effective phrase translation extraction from aligned models. ACL-2003: 41st Annual meeting of the Association for Computational Linguistics, July 7-12, 2003, Sapporo, Japan. [PDF, 134KB]

(2003) Jian-Cheng Wu, Kevin C.Yeh, Thomas C.Chuang, Wen-Chi Shei, & Jason S.Chang: TotalRecall: a bilingual concordance for computer assisted translation and language learning. ACL-2003: 41st Annual meeting of the Association for Computational Linguistics, July 7-12, 2003, Sapporo, Japan. [PDF, 76KB]

(2003) Bing Zhao, Klaus Zechner, Stephan Vogel, & Alex Waibel: Efficient optimization for bilingual sentence alignment based on linear regression. HLT-NAACL 2003 Workshop, "Building and using parallel texts: data driven machine translation and beyond", 31 May 2003, Edmonton, Canada. [PDF, 93KB]

(2002) Mosleh H.Al-Adhaileh, Tang Enya Kong, & Zaharin Yusoff: A synchronization structure of SSTC and its applications in machine translation. Coling-2002 workshop "Machine translation in Asia", 1 September 2002, Taipei, Taiwan; 8pp. [PDF, 272KB]

(2002) Michael Barlow: ParaConc: concordance software for multilingual parallel corpora. LREC-2002: Third International Conference on Language Resources and Evaluation. Workshop: Language resources for translation work and research, Las Palmas Canary Islands, 27 May 2002; pp.20-24. [PDF, 97KB]

(2002) Luisa Bentivogli & Emanuele Pianta: Opportunistic semantic tagging. LREC-2002: Third International Conference on Language Resources and Evaluation. Proceedings, Las Palmas de Gran Canaria, Spain, 27 May – 2 June 2002; pp. 1401-1406. [PDF, 61KB]

(2002) Thomas C.Chuang, G.N.You, & Jason Chang: Adaptive bilingual sentence alignment. Machine translation: from research to real users: 5th conference of the Association for Machine Translation in the Americas, AMTA 2002, Tiburon, CA, October 2002; ed. Stephen D. Richardson (Berlin: Springer Verlag, 2002); pp. 21-30. [go to publisher details]

(2002) George Foster, Philippe Langlais, & Guy Lapalme: Text prediction with fuzzy alignment. Machine translation: from research to real users: 5th conference of the Association for Machine Translation in the Americas, AMTA 2002, Tiburon, CA, October 2002; ed. Stephen D. Richardson (Berlin: Springer Verlag, 2002); pp. 44-53. [go to publisher details]

(2002) Ismael García Varea, Franz J. Och, Hermann Ney & Francisco Casacuberta: Improving alignment quality in statistical machine translation using context-dependent maximum entropy models. Coling 2002, Taipei, Taiwan, 26-30 August 2002 [PDF, 208KB]

(2002) Ismael Garcia Varea, Franz J.Och, Hermann Ney, & Francisco Casacuberta: Efficient integration of maximum entropy lexicon models within the training of statistical alignment models. Machine translation: from research to real users: 5th conference of the Association for Machine Translation in the Americas, AMTA 2002, Tiburon, CA, October 2002; ed. Stephen D. Richardson (Berlin: Springer Verlag, 2002); pp. 54-63. [go to publisher details]

(2002) Mathieu Guidère: Toward corpus-based machine translation for standard Arabic. Translation Journal, Vol. 6, no.1, January 2002 [PDF, 173KB]

(2002) Kenji Imamura: Application of translation knowledge acquired by hierarchical phrase alignment for pattern-based MT. TMI-2002 conference, Keihanna, Japan, March 13-17, 2002. [PDF, 142KB]

(2002) Tz-Liang Kueng & Keh-Yih Su: A robust cross-style bilingual sentences alignment model. Coling 2002, Taipei, Taiwan, 26-30 August 2002 [PDF, 290KB]

(2002) Kenji Matsumoto & Hideki Tanaka: Automatic alignment of Japanese and English newspaper articles using an MT system and a bilingual company name dictionary. LREC-2002: Third International Conference on Language Resources and Evaluation. Proceedings, Las Palmas de Gran Canaria, Spain, 27 May – 2 June 2002; pp.480-484. [PDF, 252KB]

(2002) Robert C. Moore: Fast and accurate sentence alignment of bilingual corpora. Machine translation: from research to real users: 5th conference of the Association for Machine Translation in the Americas, AMTA 2002, Tiburon, CA, October 2002; ed. Stephen D. Richardson (Berlin: Springer Verlag, 2002); pp. 135-144. [go to publisher details]

(2002) R.Muñoz, R.Mitkov, M.Palomar, J.Peral, R.Evans, L.Moreno, C.Orasan, M.Saiz-Noeda, A.Ferrández, C.Barbu, P.Martínez-Barco, & A.Suárez: Bilingual alignment of anaphoric expressions. LREC-2002: Third International Conference on Language Resources and Evaluation. Proceedings, Las Palmas de Gran Canaria, Spain, 27 May – 2 June 2002; pp.2088-2093. [PDF, 76KB]

(2002) Noah A.Smith: From words to corpora: recognizing translation. EMNLP-2002: Proceedings of the 2002 conference on Empirical Methods in Natural Language Processing, July 2002, Philadelphia, USA; pp.95-102 [PDF, 303KB]

(2002) Le Sun, Song Xue, Weimin Qu, Xiaofeng Wang, & Yufang Sun: Constructing a large-scale Chinese-English parallel corpus. Coling-2002: Third Workshop on Asian Language resources and International Standardization, 31 August 2002, Taipei, Taiwan; 8pp. [PDF, 359KB]

(2002) Taro Watanabe, Kenji Imamura and Eiichiro Sumita: Statistical machine translation based on hierarchical phrase alignment. TMI-2002 conference, Keihanna, Japan, March 13-17, 2002; pp.188-198. [PDF, 165KB]

(2001) Arul Menezes & Stephen D. Richardson: A best-first alignment algorithm for automatic extraction of transfer mappings from bilingual corpora. MT Summit VIII, Santiago de Compostela, Spain, 18-22 September 2001. Workshop on Example-Based Machine Translation. [PDF, 72KB]

(2001) Arul Menezes & Stephen D.Richardson: A best-first alignment algorithm for automatic extraction of transfer mappings from bilingual corpora. ACL-EACL 2001 workshop "Data-driven machine translation", July 7, 2001, Toulouse, France; pp.39-46. [PDF, 84KB]

(2001) Philip Resnik: [review of] Parallel text processing: alignment and use of translation corpora [ed. by] Jean Véronis. Computational Linguistics 27 (4), pp.592-595. [PDF, 333KB]

(2001) António Ribeiro, Gaël Dias, Gabriel Lopes & João Mexia: Cognates alignment. MT Summit VIII: Machine Translation in the Information Age, Proceedings, Santiago de Compostela, Spain, 18-22 September 2001; pp.287-292. [PDF, 70KB]

(2001) Mark Stevenson: [review of] Jean Véronis: Parallel text processing (Kluwer). In: Machine Translation Review, issue 12: December 2001; pp.75-76.

(2000) Seonho Kim, Juntae Yoon, & Mansuk Song: Structural feature selection for English-Korean statistical machine translation. Coling 2000 in Europe: the 18th International Conference on Computational Linguistics. Proceedings of the conference, Universität des Saarlandes, Saarbrücken, Germany, 31 July – 4 August 2000; pp. 439-445 [PDF, 635KB]

(2000) Hiroshi Masuichi, Raymond Flournoy, Stefan Kaufmann, & Stanley Peters: A bootstrapping method for extracting bilingual text pairs. Coling 2000 in Europe: the 18th International Conference on Computational Linguistics. Proceedings of the conference, Universität des Saarlandes, Saarbrücken, Germany, 31 July – 4 August 2000; pp. 1066-1070 [PDF, 464KB]

(2000) Franz Josef Och & Hermann Ney: Improved statistical alignment models. ACL-2000: 38th Annual meeting of the Association for Computational Linguistics, Hong Kong, October 2000. [PDF, 199KB]

(2000) Franz Josef Och & Hermann Ney: A comparison of alignment models for statistical machine translation. Coling 2000 in Europe: the 18th International Conference on Computational Linguistics. Proceedings of the conference, Universität des Saarlandes, Saarbrücken, Germany, 31 July – 4 August 2000; pp. 1086-1090 [PDF, 430KB]

(2000) António Ribeiro, Gabriel Lopes, & João Mexia: Using confidence bands for parallel texts alignment. ACL-2000: 38th Annual meeting of the Association for Computational Linguistics, Hong Kong, October 2000. [PDF, 260KB]

(2000) António Ribeiro, Gabriel Lopes, & João Mexia: A self-learning method of parallel texts alignment. Envisioning machine translation in the information future: 4th conference of the Association for Machine Translation in the Americas, AMTA 2000, Cuernavaca, Mexico, October 2000; ed. John S. White (Berlin: Springer Verlag, 2000); pp.30-39. [go to publisher details]

(2000) Sayori Shimohata: An empirical method for identifying and translating technical terminology. Coling 2000 in Europe: the 18th International Conference on Computational Linguistics. Proceedings of the conference, Universität des Saarlandes, Saarbrücken, Germany, 31 July – 4 August 2000; pp. 782-788 [PDF, 513KB]

(2000) Noah A. Smith & Michael E. Jahr: Cairo: an alignment visualization tool. LREC-2000: Second International Conference on Language Resources and Evaluation. Proceedings, Athens, Greece, 31 May – 2 June 2000; pp. 549-551. [PDF, 45KB]

(2000) Ioannis Triantafyllou, Iason Demiros, Christos Malavazos, & Stelios Piperidis: An alignment architecture for translation memory bootstrapping. MT2000: machine translation and multilingual applications in the new millennium: international conference at the University of Exeter, 20-22 November 2000, organised by the British Computer Society. [London: BCS]; 8pp. [PDF, 1579KB]

(2000) S.Vogel & H.Ney: Construction of a hierarchical translation memory. Coling 2000 in Europe: the 18th International Conference on Computational Linguistics. Proceedings of the conference, Universität des Saarlandes, Saarbrücken, Germany, 31 July – 4 August 2000; pp. 1131-1135 [PDF, 396KB]

(2000) Kaoru Yamamoto & Yuji Matsumoto: Acquisition of phrase-level bilingual correspondence using dependency structure. Coling 2000 in Europe: the 18th International Conference on Computational Linguistics. Proceedings of the conference, Universität des Saarlandes, Saarbrücken, Germany, 31 July – 4 August 2000; pp. 933-939 [PDF, 592KB]

Ambiguity preservation [see also Disambiguation]

(2000) Kevin Knight & Irene Langkilde: Preserving ambiguities in generation via automata intersection. 17th National conference of the American Association for Artificial Intelligence (AAAI 2000), July 30 – August 3, 2000, Austin, Texas. [PDF, 149KB]

Analogies and analogical modelling [see also Example-based methods]

(2004) Yves Lepage: Lower and higher estimates of “true analogies” between sentences contained in a large multilingual corpus. Coling 2004: 20th International Conference on Computational Linguistics, 23-27 August 2004, University of Geneva, Switzerland, Proceedings; 7pp. [PDF, 999KB]

(2000) Christos Malavazos & Stelios Piperidis: Application of analogical modelling to example-based machine translation. Coling 2000 in Europe: the 18th International Conference on Computational Linguistics. Proceedings of the conference, Universität des Saarlandes, Saarbrücken, Germany, 31 July – 4 August 2000; pp. 516-522 [PDF, 665KB]

Analysis see Parsing; Semantic analysis; Syntactic analysis

Anaphora resolution

(2004) R. Florian, H.Hassan, A. Ittycheriah, H.Jing, K.Kambhatla, X.Luo, N.Nicolov, & S.Roukos: A statistical model for multilingual entity detection and tracking. HLT-NAACL 2004: Human Language Technology conference and North American Chapter of the Association for Computational Linguistics annual meeting, May 2-7, 2004, The Park Plaza Hotel, Boston, USA; pp.1-8. [PDF, 132KB]

(2004) Agnès Tutin, Meriam Haddara, Ruslan Mitkov, & Constantin Orasan: Annotation of anaphoric expressions in an aligned bilingual corpus.  LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.267-270. [PDF, 524KB]

(2002) Teruko Mitamura, Eric Nyberg, Enrique Torrejon, Dave Svoboda, Annelen Brunner and Kathryn Baker: Pronominal anaphora resolution in the KANTOO multilingual machine translation system. TMI-2002 conference, Keihanna, Japan, March 13-17, 2002; pp. 115-129. [PDF, 112KB]

(2002) R.Muñoz, R.Mitkov, M.Palomar, J.Peral, R.Evans, L.Moreno, C.Orasan, M.Saiz-Noeda, A.Ferrández, C.Barbu, P.Martínez-Barco, & A.Suárez: Bilingual alignment of anaphoric expressions. LREC-2002: Third International Conference on Language Resources and Evaluation. Proceedings, Las Palmas de Gran Canaria, Spain, 27 May – 2 June 2002; pp.2088-2093. [PDF, 76KB]

(2001) Teruko Mitamura, Eric Nyberg, Enrique Torrejon, David Svoboda & Kathryn Baker: Pronominal anaphora resolution in KANTOO English-to-Spanish machine translation system. MT Summit VIII: Machine Translation in the Information Age, Proceedings, Santiago de Compostela, Spain, 18-22 September 2001; pp.235-239. [PDF, 171KB]

(2001) Shigeko Nariyama: Multiple argument ellipses resolution in Japanese. MT Summit VIII: Machine Translation in the Information Age, Proceedings, Santiago de Compostela, Spain, 18-22 September 2001; pp.241-245. [PDF, 167KB]

(2000) Catalina Barbu & Ruslan Mitkov: Evaluation environment for anaphora resolution. MT2000: machine translation and multilingual applications in the new millennium: international conference at the University of Exeter, 20-22 November 2000, organised by the British Computer Society. [London: BCS]; 8pp. [PDF, 1772KB]

(2000) David Farwell & Stephen Helmreich: An interlingual-based approach to reference resolution. NAACL-ANLP 2000 Workshop: Applied Interlinguas: Practical Applications of Interlingual Approaches to NLP, Seattle, May 2000; pp. 1-11 [PDF, 809KB]

(2000) R. Muñoz, M. Saiz-Noeda, A. Suárez, & M. Palomar: Semantic approach to bridging reference resolution. MT2000: machine translation and multilingual applications in the new millennium: international conference at the University of Exeter, 20-22 November 2000, organised by the British Computer Society. [London: BCS]; 8pp. [PDF, 1750KB]

(2000) Jesús Peral & Antonio Ferrández: An application of the interlingua system ISS for Spanish-English pronominal anaphora generation. NAACL-ANLP 2000 Workshop: Applied Interlinguas: Practical Applications of Interlingual Approaches to NLP, Seattle, May 2000; pp. 42-51 [PDF, 733KB]

(2000) Maximiliano Saiz-Noeda, Manuel Palomar, & David Farwell: NLP system oriented to anaphora resolution. MT2000: machine translation and multilingual applications in the new millennium: international conference at the University of Exeter, 20-22 November 2000, organised by the British Computer Society. [London: BCS]; 7pp. [PDF, 1564KB]

(2000) L. Sobha & B.N.Patnaik: VASISTH - an ellipsis resolution algorithm for Indian languages. MT2000: machine translation and multilingual applications in the new millennium: international conference at the University of Exeter, 20-22 November 2000, organised by the British Computer Society. [London: BCS]; 5pp. [PDF, 987KB]

(2000) Hristo Tanev & Ruslan Mitkov: LINGUA – a robust architecture for text processing and anaphora resolution in Bulgarian. MT2000: machine translation and multilingual applications in the new millennium: international conference at the University of Exeter, 20-22 November 2000, organised by the British Computer Society. [London: BCS]; 8pp. [PDF, 1780KB]

Annotation

(2004) Robert S.Belvin, Susanne Riehemann, & Kristin Precoda: A fine-grained evaluation method for speech-to-speech translation using concept annotations. LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.1427-1430. [PDF, 765KB]

(2004) Luisa Bentivogli, Pamela Forner, & Emanuele Pianta: Evaluating cross-language annotation transfer in the MultiSemCor corpus. Coling 2004: 20th International Conference on Computational Linguistics, 23-27 August 2004, University of Geneva, Switzerland, Proceedings; 7pp. [PDF, 66KB]

(2004) Martin Čmejrek, Jan Cuřin, Jiří Havelka, Jan Hajič, & Vladislav Kuboň: Prague Czech-English dependency treebank: syntactically annotated resources for machine translation. LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.1597-1600. [PDF, 292KB]

(2004) Paul Kingsbury, Nianwen Xue, & Martha Palmer: Propbanking in parallel.  LREC-2004. Workshop, 25th May 2004: The amazing utility of parallel and comparable corpora; pp. 34-37. [PDF, 305KB]

(2004) Nadia Mana, Roldano Cattoni, Emanuele Pianta, Franca Rossi, Fabio Pianesi, & Susanne Burger: The Italian NESPOLE! Corpus: a multilingual database with interlingua annotation in tourism and medical domains. LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.1467-1470. [PDF, 753KB]

(2004) Florence Reeder, Bonnie Dorr, David Farwell, Nizar Habash, Stephen Helmreich, Eduard Hovy, Lori Levin, Teruko Mitamura, Keith Miller, Owen Rambow, & Advaith Siddharthan: Interlingual annotation for MT development. Machine translation: from real users to research: 6th conference of the Association for Machine Translation in the Americas, AMTA 2004, Washington, DC, September 28 – October 2, 2004; ed. Robert E.Frederking and Kathryn B.Taylor (Berlin: Springer Verlag, 2004); pp. 236-245. [go to publisher details]

(2004) Catarina Ribeiro, Ricardo Santos, Rui Pedro Chaves, & Palmira Marrafa: Semi-automatic UNL dictionary generation using WordNet.PT.  LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.279-282. [PDF, 344KB]

(2004) Agnès Tutin, Meriam Haddara, Ruslan Mitkov, & Constantin Orasan: Annotation of anaphoric expressions in an aligned bilingual corpus.  LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.267-270. [PDF, 524KB]

(2004) M. Vanni, C.R.Voss, & C. Tate: Ground truth, reference truth & “omniscient truth” – parallel phrases in parallel texts for MT evaluation.  LREC-2004. Workshop, 25th May 2004: The amazing utility of parallel and comparable corpora; pp. 10-13. [PDF, 352KB]

(2004) Fai Wong, Dong Cheng Hu, Yu Hang Mao, & Ming Chui Dong: A flexible example annotation schema: translation corresponding tree representation. Coling 2004: 20th International Conference on Computational Linguistics, 23-27 August 2004, University of Geneva, Switzerland, Proceedings; 7pp. [PDF, 178KB]

(2004) Wong Fai, Hu Dong Cheng, Mao Yu Hang, Tang Chi Wai, & Dong Ming Chui: Application of translation corresponding tree (TCT) annotation schema in example-based machine translation.  LREC-2004. Workshop, 25th May 2004: The amazing utility of parallel and comparable corpora; pp. 42-45. [PDF, 474KB]

(2003) Hiroshi Kanayama & Hideo Watanabe: Multilingual translation via annotated hub language. MT Summit IX, New Orleans, USA, 23-27 September 2003; pp.202-207. [PDF, 167KB]

(2002) Mosleh H.Al-Adhaileh, Tang Enya Kong, & Zaharin Yusoff: A synchronization structure of SSTC and its applications in machine translation. Coling-2002 workshop "Machine translation in Asia", 1 September 2002, Taipei, Taiwan; 8pp. [PDF, 272KB]

(2002) Dien Dinh: Building a training corpus for word sense disambiguation in English-to-Vietnamese machine translation. Coling-2002 workshop "Machine translation in Asia", 1 September 2002, Taipei, Taiwan; 7pp. [PDF, 281KB]

(2002) Hideo Watanabe, Katashi Nagao, Michael C. McCord & Arendse Bernth: An annotation system for enhancing quality of natural language processing. Coling 2002, Taipei, Taiwan, 26-30 August 2002 [PDF, 202KB]

Applications of MT see index of applications

Artificial languages

(2001) Marcos Franco Sabarís, José Luis Rojas Alonso, C. Dafonte & B. Arcay: Multilingual authoring through an artificial language. MT Summit VIII: Machine Translation in the Information Age, Proceedings, Santiago de Compostela, Spain, 18-22 September 2001; pp. 99-102. [PDF, 138KB]

Aspect

(2003) Anna Kupść: Two approaches to aspect assignment in an English-Polish machine translation system. 7th EAMT Workshop, "Improving machine translation through other language technology tools", 13 April 2003, Budapest, Hungary; pp. 17-24 [PDF, 297KB]

(2001) Masaki Murata, Kiyotaka Uchimoto, Qing Ma, & Hitoshi Isahara: Using a support-vector machine for Japanese-to-English translation of tense, aspect, and modality. ACL-EACL 2001 workshop "Data-driven machine translation", July 7, 2001, Toulouse, France; pp.111-119. [PDF, 230KB]

Bilingual corpora [see also Example-based methods, Multilingual corpora]

(2004) Proceedings of Workshop: The amazing utility of parallel and comparable corpora. LREC-2004: Fourth International Conference on Language Resources and Evaluation, Lisbon, Portugal, 25 May 2004. [PDF, 2226KB]

(2004) Robert S.Belvin, Win May, Shrikanth Narayanan, Panayiotis Georgiou, & Shadi Ganjavi: Creation of a doctor-patient dialogue corpus using standardized patients. LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.187-190. [PDF, 480KB]

(2004) Indrajit Bhattacharya, Lise Getoor, & Yoshua Bengio: Unsupervised sense disambiguation using bilingual probabilistic models.  ACL 2004: 42nd annual meeting of the Association for Computational Linguistics: Proceedings of the conference, 21-26 July 2004, Barcelona, Spain; pp. 287-294. [PDF, 166KB]

(2004) Michael Carl, Ecaterina Rascu, & Johann Haller: Using weighted abduction to align term variant translations in bilingual texts. LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.1973-1976. [PDF, 294KB]

(2004) Chen Benfeng & Pascale Fung: Automatic construction of an English-Chinese bilingual FrameNet.  HLT-NAACL 2004: Human Language Technology conference and North American Chapter of the Association for Computational Linguistics annual meeting, May 2-7, 2004, The Park Plaza Hotel, Boston, USA – Short Papers; pp. 29-32. [PDF, 185KB]

(2004) Luisa Bentivogli, Pamela Forner, & Emanuele Pianta: Evaluating cross-language annotation transfer in the MultiSemCor corpus. Coling 2004: 20th International Conference on Computational Linguistics, 23-27 August 2004, University of Geneva, Switzerland, Proceedings; 7pp. [PDF, 66KB]

(2004) Pascale Fung & Percy Cheung: Multi-level bootstrapping for extracting parallel sentences from a quasi-comparable corpus.  Coling 2004: 20th International Conference on Computational Linguistics, 23-27 August 2004, University of Geneva, Switzerland, Proceedings; 7pp. [PDF, 245KB]

(2004) Pascale Fung & Percy Cheung: Mining very-non-parallel corpora: parallel sentence and lexicon extraction via bootstrapping and EM. EMNLP-2004: Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, 25-26 July 2004, Barcelona, Spain; 8pp. [PDF, 276KB]

(2004) E. Gaussier, J.-M.Renders, I.Matveeva, C.Goutte, & H.Déjean: A geometric view on bilingual lexicon extraction from comparable corpora.  ACL 2004: 42nd annual meeting of the Association for Computational Linguistics: Proceedings of the conference, 21-26 July 2004, Barcelona, Spain; pp.526-533. [PDF, 122KB]

(2004) Tamás Grőbler, Gábor Hodász, & Balázs Kis: MetaMorpho TM: a rule-based translation corpus. LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.339-342. [PDF, 390KB]

(2004) Hiroyuki Kaji: Adapted seed lexicon and combined bidirectional similarity measures for translation equivalent extraction from comparable corpora; TMI-2004: proceedings of the Tenth Conference on Theoretical and Methodological Issues in Machine Translation, October 4-6, 2004, Baltimore, Maryland, USA; pp.115-124. [PDF, 378KB]

(2004) Michael Kluck: Evaluation of cross-language information retrieval using the domain-specific GIRT data as parallel German-English corpus. LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.1343-1346. [PDF, 1533KB]

(2004) Jonas Kuhn: Experiments in parallel-text based grammar induction. ACL 2004: 42nd annual meeting of the Association for Computational Linguistics: Proceedings of the conference, 21-26 July 2004, Barcelona, Spain; pp.470-477. [PDF, 99KB]

(2004) Alon Lavie, Katharina Probst, Erik Peterson, Stephan Vogel, Lori Levin, Ariadna Font-Llitjos, & Jaime Carbonell: A trainable transfer-based MT approach for languages with limited resources. 9th EAMT Workshop, "Broadening horizons of machine translation and its applications", 26-27 April 2004, Malta; pp. 116-123. [PDF, 265KB]

(2004) Hang Li & Cong Li: Word translation disambiguation using bilingual bootstrapping. Computational Linguistics 30 (1), pp. 1-22. [PDF, 2311KB]

(2004) Tracy Lin, Jian-Cheng Wu, & Jason S. Chang: Extraction of name and transliteration in monolingual and parallel corpora. Machine translation: from real users to research: 6th conference of the Association for Machine Translation in the Americas, AMTA 2004, Washington, DC, September 28 – October 2, 2004; ed. Robert E. Frederking and Kathryn B. Taylor (Berlin: Springer Verlag, 2004); pp. 177-186. [go to publisher details]

(2004) Dragos Stefan Munteanu, Alexander Fraser, & Daniel Marcu: Improved machine translation performance via parallel sentence extraction from comparable corpora.  HLT-NAACL 2004: Human Language Technology conference and North American Chapter of the Association for Computational Linguistics annual meeting, May 2-7, 2004, The Park Plaza Hotel, Boston, USA; pp. 265-272. [PDF, 1125KB]

(2004) Francisco Nevado, Francisco Casacuberta, & Josu Landa: Translation memories enrichment by statistical bilingual segmentation. LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.335-338. [PDF, 354KB]

(2004) Chris Pike & I. Dan Melamed: An automatic filter for non-parallel texts. ACL 2004: 42nd annual meeting of the Association for Computational Linguistics: Proceedings of the interactive poster and demonstration sessions, 21-26 July 2004, Barcelona, Spain; 4pp. [PDF, 128KB]

(2004) Monica Rogati & Yiming Yang: Customizing parallel corpora at the document level. ACL 2004: 42nd annual meeting of the Association for Computational Linguistics: Proceedings of the interactive poster and demonstration sessions, 21-26 July 2004, Barcelona, Spain; 4pp. [PDF, 80KB]

(2004) Li Shao & Hwee Tou Ng: Mining new word translations from comparable corpora. Coling 2004: 20th International Conference on Computational Linguistics, 23-27 August 2004, University of Geneva, Switzerland, Proceedings; 7pp. [PDF, 305KB]

(2004) Dan Tufiş, Radu Ion, & Nancy Ide: Fine-grained word sense disambiguation based on parallel corpora, word alignment, word clustering and aligned wordnets. Coling 2004: 20th International Conference on Computational Linguistics, 23-27 August 2004, University of Geneva, Switzerland, Proceedings; 7pp. [PDF, 192KB]

(2004) Agnès Tutin, Meriam Haddara, Ruslan Mitkov, & Constantin Orasan: Annotation of anaphoric expressions in an aligned bilingual corpus.  LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.267-270. [PDF, 524KB]

(2004) Jui-Feng Yeh, Chung-Hsien Wu, Ming-Jun Chen, & Liang-Chih Yu: Automated alignment and extraction of bilingual ontology for cross-language domain-specific applications. Coling 2004: 20th International Conference on Computational Linguistics, 23-27 August 2004, University of Geneva, Switzerland, Proceedings; 7pp. [PDF, 196KB]

(2003) Eiji Aramaki, Sadao Kurohashi, Hideki Kashioka, & Hideki Tanaka: Word selection for EBMT based on monolingual similarity and translation confidence. HLT-NAACL 2003 Workshop, "Building and using parallel texts: data driven machine translation and beyond", 31 May 2003, Edmonton, Canada. [PDF, 1565KB]

(2003) Chris Callison-Burch & Miles Osborne: Bootstrapping parallel corpora. HLT-NAACL 2003 Workshop, "Building and using parallel texts: data driven machine translation and beyond", 31 May 2003, Edmonton, Canada. [PDF, 83KB]

(2003) Katri A. Clodfelder: An LSA implementation against parallel texts in French and English. HLT-NAACL 2003 Workshop, "Building and using parallel texts: data driven machine translation and beyond", 31 May 2003, Edmonton, Canada. [PDF, 119KB]

(2003) Dinh Dien & Hoang Kiem: POS-tagger for English Vietnamese bilingual corpus. HLT-NAACL 2003 Workshop, "Building and using parallel texts: data driven machine translation and beyond", 31 May 2003, Edmonton, Canada. [PDF, 202KB]

(2003) Yuan Ding, Daniel Gildea, & Martha Palmer: An algorithm for word-level alignment of parallel dependency trees. MT Summit IX, New Orleans, USA, 23-27 September 2003 [PDF, 242KB]

(2003) Takao Doi, Eiichiro Sumita, & Hirofumi Yamamoto: Adaptation using out-of-domain corpus within EBMT. HLT-NAACL 2003: conference combining Human Language Technology conference series and the North American Chapter of the Association for Computational Linguistics conference series, May 27 – June 1, 2003, Edmonton, Canada; 3pp. [PDF, 33KB]

(2003) Fei Huang, Stephan Vogel, & Alex Waibel: Automatic extraction of named entity translingual equivalence based on multi-feature cost minimization. ACL-2003 Workshop on Multilingual and Mixed-language Named Entity Recognition, July 12, 2003, Sapporo, Japan; 8pp. [PDF, 262KB]

(2003) Kenji Imamura, Eiichiro Sumita, & Yuji Matsumoto: Feedback cleaning of machine translation rules using automatic evaluation. ACL-2003: 41st Annual meeting of the Association for Computational Linguistics, July 7-12, 2003, Sapporo, Japan. [PDF, 62KB]

(2003) Kenji Imamura, Eiichiro Sumita & Yuji Matsumoto: Automatic construction of machine translation knowledge using translation literalness. EACL 2003: 10th Conference of the European Chapter of the Association for Computational Linguistics, April 12-17, 2003, Budapest, Hungary. Proceedings; pp.155-162 [PDF, 397KB]

(2003) Tadashi Kumano, Hideki Kashioka, Hideki Tanaka, & Takahiro Fukusima: Construction and analysis of Japanese-English broadcast news corpus with named entity tags. ACL-2003 Workshop on Multilingual and Mixed-language Named Entity Recognition, July 12, 2003, Sapporo, Japan; 8pp. [PDF, 56KB]

(2003) Qing Ma, Yujie Zhang, Masaki Murata, & Hitoshi Isahara: Semantic maps for word alignment in bilingual parallel corpora. ACL-2003: Second SIGHAN Workshop on Chinese Language Processing, July 11-12, 2003, Sapporo, Japan; 6pp. [PDF, 199KB]

(2003) Sara Laviosa: Corpora and the translator. In: Harold Somers (ed.) Computers and translation: a translator’s guide (Amsterdam/Philadelphia: John Benjamins Publishing Company, 2003); pp.105-117.

(2003) Elliott Macklovitch: On the pleasure of being bi-textual; or My life in parallel text. HLT-NAACL 2003 Workshop, "Building and using parallel texts: data driven machine translation and beyond", 31 May 2003, Edmonton, Canada. [PDF from PPT, 301KB]

(2003) Robert C. Moore: Learning translations of named-entity phrases from parallel corpora. EACL 2003: 10th Conference of the European Chapter of the Association for Computational Linguistics, April 12-17, 2003, Budapest, Hungary. Proceedings; pp.259-266 [PDF, 377KB]

(2003) Joel Martin, Howard Johnson, Benoit Farley, & Anna Maclachlan: Aligning and using an English-Inuktitut parallel corpus. HLT-NAACL 2003 Workshop, "Building and using parallel texts: data driven machine translation and beyond", 31 May 2003, Edmonton, Canada. [PDF, 30KB]

(2003) Francisco Nevado, Francisco Casacuberta, & Enrique Vidal: Parallel corpora segmentation using anchor words. 7th EAMT Workshop, "Improving machine translation through other language technology tools", 13 April 2003, Budapest, Hungary; pp. 33-40 [PDF, 382KB]

(2003) Hwee Tou Ng, Bin Wang, & Yee Seng Chan: Exploiting parallel texts for word sense disambiguation: an empirical study. ACL-2003: 41st Annual meeting of the Association for Computational Linguistics, July 7-12, 2003, Sapporo, Japan. [PDF, 364KB]

(2003) Stephen Nightingale & Hideki Tanaka: Comparing the sentence alignment yield from two news corpora using a dictionary-based alignment system. HLT-NAACL 2003 Workshop, "Building and using parallel texts: data driven machine translation and beyond", 31 May 2003, Edmonton, Canada. [PDF, 142KB]

(2003) Daniel Ortíz, Ismael García-Varea, Francisco Casacuberta, Antonio Lagarda, & Jorge González: On the use of statistical machine-translation techniques within a memory-based translation system (AMETRA). MT Summit IX, New Orleans, USA, 23-27 September 2003; pp.299-306. [PDF, 88KB]

(2003) Katharina Probst: Using ‘smart’ bilingual projection to feature-tag a monolingual dictionary. HLT-NAACL 2003: proceedings of Seventh Conference on Natural Language Learning, May 27 – June 1, 2003, Edmonton, Canada; 8pp. [PDF, 108KB]

(2003) Philip Resnik & Noah A. Smith: The web as a parallel corpus. Computational Linguistics 29 (3), pp.349-380. [PDF, 8130KB]

(2003) Fatiha Sadat, Masatoshi Yoshikawa, & Shunsuke Uemura: Bilingual terminology acquisition from comparable corpora and phrasal translation to cross-language information retrieval. ACL-2003: 41st Annual meeting of the Association for Computational Linguistics, July 7-12, 2003, Sapporo, Japan. [PDF, 61KB]

(2003) Fatiha Sadat, Masatoshi Yoshikawa, & Shunsuke Uemura: Learning bilingual translations from comparable corpora to cross-language information retrieval: hybrid statistics-based and linguistics-based approach. IRAL 2003: Sixth International Workshop on Information Retrieval with Asian Languages, July 7, 2003, Sapporo, Japan; 8pp. [PDF, 122KB]

(2003) Lee Schwartz, Takako Aikawa, & Chris Quirk: Disambiguation of English PP attachment using multilingual aligned data. MT Summit IX, New Orleans, USA, 23-27 September 2003; pp.330-337. [PDF, 98KB]

(2003) Takehito Utsuro, Takashi Horiuchi, Kohei Hino, Takeshi Hamamoto & Takeaki Nakayama: Effect of cross-language IR in bilingual lexical acquisition from comparable corpora. EACL 2003: 10th Conference of the European Chapter of the Association for Computational Linguistics, April 12-17, 2003, Budapest, Hungary. Proceedings; pp.355-362 [PDF, 734KB]

(2003) Stephan Vogel: Using noisy bilingual data for statistical machine translation. EACL 2003: 10th Conference of the European Chapter of the Association for Computational Linguistics, April 12-17, 2003, Budapest, Hungary. Proceedings; pp.175-178 [PDF, 194KB]

(2003) Hua Wu & Ming Zhou: Synonymous collocation extraction using translation in