Machine Translation Archive
(www.MT-Archive.info)
Introduction and guide to usage
This electronic
repository includes copies of articles, books and papers in the field of machine
translation and computer-based translation technology. In a few cases
publications are accessed by links to other sites. The archive also includes
index entries for publications which for copyright reasons cannot be hosted on
this site.
Coverage. The archive contains only
English-language publications – although, in due course, some publications in
other languages may be included selectively. It covers publications on all
aspects of machine translation and computer-assisted translation, translation
memories, and translation tools; it also includes publications in related areas
of interest to researchers in the field, such as controlled languages,
cross-language information retrieval, information extraction, multilingual
resources, terminology, etc. The proceedings of conferences devoted to machine
translation (and computer-assisted translation) are being covered in full; for
other conferences, papers are being included selectively.
The ultimate aim is
completeness. In the first instance an effort will be made to cover
comprehensively publications since 2000. The next priorities are publications
from the mid 1980s to the late 1990s and then selectively from the earliest
years of MT in the 1950s and 1960s. The goal of
comprehensiveness means that some papers included are known to be inaccurate,
misleading or ill-informed – particularly some articles from popular magazines
and the internet. Caveat lector!
The Machine
Translation Archive does not include information about current commercial
systems (except when described in papers). For such information see the Compendium of translation software (http://www.hutchinsweb.me.uk/Compendium.htm).
Copyright. All publications are the copyright of authors
(except where the copyright is held by a publisher). In general, any material
may be used and copied for teaching and research purposes, and no material may
be used for commercial purposes without permission of authors. Items copied from the ACL Anthology are copyright of
the Association for Computational Linguistics and subject to a Creative Commons
Attribution-NonCommercial-ShareAlike 2.5 Licence.
Citations. Every effort has been made
to ensure the correctness and completeness of the bibliographical details. When
citing articles it is recommended that these full details are given - plus, if
and when appropriate, this source (http://www.mt-archive.info) and the file
name. File names generally consist of an abbreviation for the conference name, for
the organization holding the conference, or for the journal title, followed by the
year and the name of the first author. For example, ‘http://www.mt-archive.info/AMTA-1994-White’
refers to a paper given by John White at the 1994 conference of the Association
for Machine Translation in the Americas.
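The file-naming convention described above can be sketched in a few lines of Python. This is only an illustration: the function names, the BASE_URL constant and the regular expression are assumptions introduced here, not part of the Archive, and since file names follow the pattern only "generally", some real names will not match.

```python
import re

# Assumption: the citation source given above, with a trailing slash.
BASE_URL = "http://www.mt-archive.info/"

# Assumed pattern: <abbreviation>-<four-digit year>-<first author>.
# The abbreviation may itself contain hyphens; the author part may not.
_NAME_RE = re.compile(r"^(?P<source>.+)-(?P<year>\d{4})-(?P<author>[^-]+)$")


def build_file_name(source, year, first_author):
    """Join abbreviation, year and first author in the order described."""
    return f"{source}-{year}-{first_author}"


def parse_file_name(name):
    """Split a file name into its three parts, or return None if it does not fit."""
    match = _NAME_RE.match(name)
    if match is None:
        return None
    return {
        "source": match.group("source"),
        "year": int(match.group("year")),
        "author": match.group("author"),
    }


if __name__ == "__main__":
    name = build_file_name("AMTA", 1994, "White")
    print(BASE_URL + name)
    print(parse_file_name(name))
```

A name that does not follow the pattern (for instance one without a year) simply yields None, so irregular file names are left for manual handling rather than guessed at.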
Format. Most publications are
provided in PDF format (sometimes converted from PostScript or PowerPoint); a
few are in HTML format. As far as possible, publications have been scanned (and
checked for typographical mistakes) from original hard copies. However, the
reproduction and legibility of some PDF files taken from other websites are
sometimes poor. In due course some of these will be re-scanned.
Indexes. All publications are listed in six indexes. In the index of authors, they are
entered under the names of all authors (in as full a form as can be ascertained). In the
index of organizations, they are entered under the names of the institutions and
organizations with which authors are affiliated and/or where or for which the
research is undertaken. In the index of systems, they are entered under the names of
projects or systems which are mentioned. Other indexes are those of languages
and language pairs treated in publications, of methods, techniques, and other computational and linguistic topics,
and of applications and other issues affecting the use of systems.
Indexes for institutions/organizations, for applications, and for methods/techniques/etc. are divided into
appropriate time periods (currently: 2000 to the present, 1990-1999, and
pre-1990), with future divisions into 5- and 10-year periods as necessary.
The index of authors is divided alphabetically
(with subdivisions introduced as necessary); note that names beginning Mc are ‘spelled out’ as Mac, and diacritics are ignored (i.e. ü
is filed as if u, ø as if o, å as if a, č
as if c, etc.). Indexes for languages and for systems
are also divided alphabetically.
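For readers who wish to reproduce the filing order of the author index, the two rules stated above (Mc filed as Mac, diacritics ignored) might be expressed as a sort key along the following lines. This is a sketch under assumptions: the function name filing_key is hypothetical, folding to lower case is an added convenience, and the Archive's own procedure may differ in detail. Note that ø has no Unicode decomposition, so it is mapped explicitly.

```python
import unicodedata

# Letters such as ø do not decompose into base letter + accent,
# so they need an explicit mapping (assumption: only ø/Ø handled here).
_EXTRA = str.maketrans({"ø": "o", "Ø": "O"})


def filing_key(surname):
    """Return a key that files Mc as Mac and ignores diacritics."""
    name = surname.translate(_EXTRA)
    # NFKD splits e.g. ü into u + combining diaeresis; drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Names beginning Mc are 'spelled out' as Mac.
    if stripped.startswith("Mc"):
        stripped = "Mac" + stripped[2:]
    # Case folding is an added assumption so that case does not affect order.
    return stripped.casefold()


if __name__ == "__main__":
    names = ["Maegaard", "McCord", "Macklovitch", "Müller"]
    print(sorted(names, key=filing_key))
```

With this key, McCord is filed among the Mac names rather than after them, which is the behaviour the note above describes.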
Note that in all
the indexes the publications under a heading or name are listed in reverse
chronological order.
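The reverse chronological ordering under each heading amounts to a descending sort on the year of publication. A minimal sketch, assuming each entry is a dictionary with a "year" field; the entries shown are invented placeholders, not items from the Archive:

```python
def newest_first(entries):
    """Sort entries by year, most recent first.

    sorted() is stable, so entries from the same year keep
    their original relative order.
    """
    return sorted(entries, key=lambda entry: entry["year"], reverse=True)


if __name__ == "__main__":
    # Hypothetical entries for illustration only.
    entries = [
        {"year": 1994, "title": "placeholder A"},
        {"year": 2003, "title": "placeholder B"},
        {"year": 1987, "title": "placeholder C"},
    ]
    for entry in newest_first(entries):
        print(entry["year"], entry["title"])
```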
Conferences and journals. For those conference
proceedings which are included complete in the Archive, users will find tables
of contents accessed from the index of conferences.
From this index users will also be able to find a list of conferences devoted
wholly or partly to machine translation, whether or not the proceedings have
yet been included in the Archive. There are also tables of contents for those
journals which are included so far; these are accessed from the index of journals.
Web sites. Links to the personal
websites of individual researchers may be useful for tracing publications
which have not (yet) been included in the Archive. No assurance can be given
that these links will remain constant; the compiler welcomes news of any
changes and suggestions for additions. The same caveat applies to the links for
individual publications.