Machine Translation Archive
(www.MT-Archive.info)
Introduction and guide to usage
The aim
of this electronic archive is to provide a permanent on-line location for a
comprehensive collection of articles, books and papers in the field of machine
translation and computer-based translation technology. The primary aim of the
archive is comprehensive coverage of publications which are difficult to find
or obtain from the usual sources; this is true particularly for the proceedings
of conferences which have not been published by well-known commercial book
publishers. For the sake of completeness, the archive also includes index entries
for publications which for copyright reasons cannot be hosted on this site.
Coverage. The archive contains only
English-language publications – although, in due course, some publications in
other languages may be included selectively (a preliminary index of some
French-language publications is now available). The
archive covers publications on all aspects of machine translation and
computer-assisted translation, translation memories, and translation tools; it
includes also publications in related areas of interest to researchers in the
field, such as controlled languages, cross-language information retrieval,
information extraction, multilingual resources, terminology, etc. The
proceedings of conferences devoted to machine translation (and
computer-assisted translation) are being covered in full; for other conferences,
papers are being included selectively (for details see below).
The
ultimate aim is completeness. In the first instance an effort will be made to
cover comprehensively publications since 1990. The next priorities are
publications from the mid-1970s to the late 1980s and then selectively from the
earliest years of MT in the 1950s and 1960s. The goal of comprehensiveness
means that some papers included are known to be inaccurate, misleading or
ill-informed – particularly some articles from popular magazines and the
internet. Caveat lector!
The
Machine Translation Archive does not include information about current
commercial systems (except when described in papers). For such information see
the Compendium of translation software
on the EAMT website (http://www.eamt.org/soft_comp.php).
Copyright. All publications are the copyright of
authors (except where the copyright is held by a publisher). In general, any
material may be used and copied for teaching and research purposes. Permission is given under a Creative Commons
Attribution-NonCommercial-ShareAlike 3.0 License. Permission to download is therefore not
given to any individuals or organizations which charge or
intend to charge users (by fees or by subscriptions) for materials on their
databases.
Citations. Every effort has been made
to ensure the correctness and completeness of the bibliographical details. When
citing articles it is recommended that these full details are given - plus, if
and where appropriate, this source (http://www.mt-archive.info) and the file
name. File names consist generally of an abbreviation for the conference name, for
organizations holding the conference, or for the journal title, followed by the
year and the name of the first author. For example, ‘http://www.mt-archive.info/AMTA-1994-White’
refers to a paper given by John White at the 1994 conference of the Association
for Machine Translation in the Americas.
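The file-naming convention described above can be illustrated with a short Python sketch. It is not part of the Archive itself: the function name `parse_file_name`, the `ArchiveFileName` type, and the exact pattern (abbreviation, four-digit year, first author, separated by hyphens) are assumptions made for illustration. Because file names only "generally" follow the convention, the function returns `None` for names that do not fit.

```python
import re
from typing import NamedTuple, Optional


class ArchiveFileName(NamedTuple):
    source: str        # abbreviation of the conference, organization or journal
    year: int          # year of publication
    first_author: str  # surname of the first author


# Assumed pattern: <abbreviation>-<4-digit year>-<first author>.
# The abbreviation itself may contain hyphens.
_PATTERN = re.compile(r"^(?P<source>.+)-(?P<year>\d{4})-(?P<author>[^-/]+)$")


def parse_file_name(name: str) -> Optional[ArchiveFileName]:
    """Split a file name (or full URL) into its three assumed components."""
    # Keep only the last path segment and drop a trailing ".pdf", if any.
    stem = name.rsplit("/", 1)[-1]
    if stem.lower().endswith(".pdf"):
        stem = stem[:-4]
    match = _PATTERN.match(stem)
    if match is None:
        return None  # the name does not follow the general convention
    return ArchiveFileName(match["source"], int(match["year"]), match["author"])


if __name__ == "__main__":
    print(parse_file_name("http://www.mt-archive.info/AMTA-1994-White"))
```

A citation tool could use such a function to recover the year and first author from a link, falling back to the full bibliographical details when the result is `None`.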
Format. Publications are provided
in PDF format (sometimes converted from PostScript or PowerPoint). As far as
possible, publications have been scanned from the original hard copies (and checked by the compiler for typographical
mistakes). However, the reproduction and
legibility of some PDF files are poor. In due course some of these
will be re-scanned.
Indexes. All publications are
listed in six indexes. In the index of authors they are entered under the names of all authors (in as full a form as can be
ascertained). In the index of organizations, they are entered under the names of the
institutions and organizations with which authors are affiliated and/or where
or for which the research is undertaken. Institutions, organizations and
companies are grouped by country. In the index of systems, they are entered under the names
of projects or systems which are mentioned. Other indexes are those of languages
and language pairs treated in publications, of methods, techniques, and other computational and linguistic topics,
and of applications and other issues
affecting the use of systems.
Indexes for institutions/organizations, for applications, and for methods/techniques/etc.
are divided into appropriate time periods (currently: 2010 to the present, 2005-2009,
2000-2004, 1990-1999, and pre-1990).
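As a small illustration of the time periods listed above, the following Python sketch maps a publication year to the period under which it would be filed. The function name `index_period` is hypothetical, and the boundaries are simply those currently stated in the text, so they would need updating if the Archive's divisions change.

```python
def index_period(year: int) -> str:
    """Return the index time period (as currently listed) for a publication year."""
    if year >= 2010:
        return "2010 to the present"
    if year >= 2005:
        return "2005-2009"
    if year >= 2000:
        return "2000-2004"
    if year >= 1990:
        return "1990-1999"
    return "pre-1990"


if __name__ == "__main__":
    for y in (1954, 1994, 2003, 2007, 2012):
        print(y, "->", index_period(y))
```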
The index of authors is divided alphabetically; note that names beginning Mc are ‘spelled
out’ as Mac, and that diacritics are ignored (e.g. ü is filed as if u, ø as if o, å
as if a, č as if c). Indexes for languages and for systems
are also divided alphabetically.
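The filing rules for the author index (Mc treated as Mac, diacritics ignored) can be approximated by a sort key. The Python sketch below is only one possible reading of those rules, not the compiler's actual procedure: the function name `filing_key` and the small table of letters without a Unicode decomposition (such as ø) are assumptions, and the sample surnames are purely illustrative.

```python
import unicodedata

# Letters such as "ø" have no Unicode decomposition, so they are mapped
# explicitly. This table is an assumed, non-exhaustive list.
_NO_DECOMPOSITION = str.maketrans(
    {"ø": "o", "Ø": "O", "ł": "l", "Ł": "L", "đ": "d", "Đ": "D"}
)


def filing_key(surname: str) -> str:
    """Build a sort key that ignores diacritics and files 'Mc' as 'Mac'."""
    # Decompose accented letters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", surname.translate(_NO_DECOMPOSITION))
    plain = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # File names beginning "Mc" as if they were spelled "Mac".
    if plain.startswith("Mc"):
        plain = "Mac" + plain[2:]
    return plain.casefold()


if __name__ == "__main__":
    names = ["Mackintosh", "McDonald", "Maas", "Müller", "Muller"]
    print(sorted(names, key=filing_key))
```

Sorting with this key places a name such as McDonald among the names beginning Mac, and files Müller together with Muller.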
Note
that in all the indexes the publications under a heading or name are listed in reverse
chronological order.
Conferences and journals. For those
conference proceedings which are included complete in the Archive, users will
find tables of contents accessed from the index of conferences. In a second index users can find a list of conferences from which
articles have been selectively included. A third index
lists conferences, whether devoted wholly or partly to machine translation, which
have yet to be included in the Archive. There are also tables of contents for
those journals which are included so far; these are accessed from the index of journals.
Web sites. Links to the personal websites
of individual researchers may be useful for tracing publications which have not
(yet) been included in the Archive. No assurance can be given that these links
will remain up-to-date; the compiler welcomes assistance from collaborators, news of any changes, and
suggestions for additions.