Lynne Bowker: Computer Aided Translation Technology: A Practical Introduction

Ottawa: University of Ottawa Press. 2002, 185 pp. ISBN 0-7766-0538-0


Mátyás Bánhegyi



Some deem it a sad fact, whereas others welcome it as a new development, that translation is becoming increasingly computerised and computer aided. Like it or not, computerised translation tools are now involved in almost all fields of translation. While computerisation may result in ready-made, prefabricated passages appearing in several target texts, it may also considerably facilitate the work of translators. In her book entitled Computer Aided Translation Technology: A Practical Introduction, Lynne Bowker, assistant professor in translation at the University of Ottawa and the author of numerous books and articles on corpora, computer assisted translation technology and terminology, sets out to explore computer aided translation technology in depth and to describe the ways such technology can help the translation process and the daily work of translators.


In seven chapters altogether, the volume offers a practical guide to the computer aided translation technology currently available in the global market. After the introductory chapter (Chapter 0), which specifies the purpose and the audience and describes the contents of the book, Chapters 1-6 detail six different aspects of computer technology used in translation. Chapter 1, Why Do Translators Need to Learn about Technology?, justifies the use of computers in the translation process, and Chapter 2, Capturing Data in Electronic Form, discusses the methods and processes of digitising the data necessary for the translation process. Chapter 3, Corpora and Corpus-Analysis Tools, describes in what ways and for what purposes corpora can be used to aid the translation process, while Chapter 4, Terminology-Management Systems, explains how terminology can be safely retrieved and managed with the help of computer applications. Chapter 5, Translation-Memory Systems, presents the use of translation memories in the translation process, and Chapter 6, Other New Technologies and Emerging Trends, discusses the effects computerisation has or is likely to have on translation. Below, Chapters 1-6 are described in more detail.

Chapter 1 presents the global translation market. It attributes the use of computerised technologies to the fact that increasing volumes of translation have to be completed within ever shorter periods of time. Exploring the literature, Bowker notes that translators and translation companies are not as knowledgeable about computer technologies as they could and should be: this might well hinder the competitiveness of freelance translators and companies alike, as the practical skills associated with computer technologies are essential in the translation business. Bowker also notes that, from the point of view of translation research, computer aided translation tools yield data for empirical studies, such as the analysis of the impact of technology on translation practice and pedagogy. Therefore, the computerisation of translation may potentially contribute to the emergence of new, prolific research areas in Translation Studies.

Chapter 2 presents a translation commission where the document to be translated is not in an electronically readable format. Such a document must either be scanned and processed with optical character recognition software or be read out loud and processed with voice recognition software before it can be handled in machine-readable form (e.g. translated using a word processor). The chapter describes working with both optical character recognition and voice recognition software in detail and outlines tips, benefits and disadvantages, including physical strain, time, quality and economic aspects, associated with the use of these two kinds of application. As the data generated by optical character recognition or voice recognition software are not compatible with all software applications commonly used by translators, translators often have to use filters and file converters to make these data usable in other applications. During file conversion, however, some of the data, the formatting of the document or other information may be lost. The most frequent of these problems are listed in the book in order to familiarise users with the potential errors and to enable them to combat these problems through careful document editing before starting the actual translation.

Chapter 3 reviews the different corpora used in the translation business and explains how to work with them. Bowker differentiates between two types of corpora: printed corpora (i.e. collections of paper-based documents) and electronic corpora. The latter are subdivided into monolingual, monolingual comparable, aligned bilingual parallel and non-aligned bilingual comparable corpora. All of these electronic corpora can be processed with the help of corpus analysis tools and can be classified as raw (with no annotation) or annotated corpora (with additional information about the words in the corpus). In what follows, corpus analysis tools and working with annotation will be presented in more detail.

Through the use of corpus analysis tools, the information contained in the corpus can be accessed, manipulated and displayed. Such tools include word-frequency list generators, concordancers and collocation generators. A word-frequency list shows how many different words there are in a given corpus and how often each word appears. To facilitate working with corpora, word-frequency lists can be sorted in order of appearance, in alphabetical order or in order of frequency. Apart from simple lists of words, two further useful options are lemmatised lists and stop lists. In lemmatised lists, related word forms (such as translate, translates, translated, etc.) are grouped together and the number of the different word forms is also counted, whereas stop lists specify certain words (e.g. grammatical words such as is, are, etc.) that the software ignores when producing its lists. These diverse lists serve different purposes for translators, speeding up their work or increasing the quality of target texts.
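The word-frequency, stop list and lemmatised list options described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the behaviour of any tool discussed in the book: the function name word_frequencies, the tiny stop list and the hand-written lemma table are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical stop list and lemma table, for illustration only.
STOP_WORDS = {"is", "are", "the", "a", "and"}
LEMMAS = {"translates": "translate", "translated": "translate"}


def word_frequencies(text, stop_words=frozenset(), lemmas=None):
    """Count word forms, optionally skipping stop words and grouping forms by lemma."""
    lemmas = lemmas or {}
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(lemmas.get(t, t) for t in tokens if t not in stop_words)


text = "The translator translates the text. The text is translated."
raw = word_frequencies(text)                          # simple word list
filtered = word_frequencies(text, STOP_WORDS, LEMMAS)  # stop list + lemmatised

print(raw.most_common())         # sorted by frequency
print(sorted(filtered.items()))  # sorted alphabetically
```

A real lemmatiser would rely on a morphological dictionary or analyser rather than a hand-written table; the table merely shows how grouping related word forms changes the counts.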

Also used with corpora, concordancers retrieve all occurrences of a search pattern (e.g. the search string the text) together with its immediate context. Concordancers can be of two types: monolingual and bilingual. A monolingual concordancer searches for the string in either the source or the target language text; a bilingual concordancer, which requires that the source and target texts have been aligned beforehand, displays each hit along with its corresponding realisation in the other language. This provides translators with valuable search results. Another practical corpus tool, the collocation generator, determines whether two words collocate by comparing the actual co-occurrence patterns of pairs of words against the patterns that would be expected if the two words were randomly distributed throughout the text. The software can even find collocations whose constituent words do not appear side by side but are separated by intervening words. Such searches may improve the style and accuracy of translations.
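A monolingual concordancer of the kind described above can be approximated as a keyword-in-context search. The following is a minimal sketch under simplifying assumptions: the function name concordance and the width parameter are invented for the example, matching is case-insensitive on plain strings, and context is measured in characters rather than words.

```python
def concordance(text, pattern, width=20):
    """Return each occurrence of `pattern` with `width` characters of context per side."""
    lines = []
    lower, needle = text.lower(), pattern.lower()
    start = lower.find(needle)
    while start != -1:
        end = start + len(needle)
        left = text[max(0, start - width):start]
        right = text[end:end + width]
        # Right-align the left context so that the hits line up in a column.
        lines.append(f"{left:>{width}}[{text[start:end]}]{right}")
        start = lower.find(needle, end)
    return lines


sample = "The text is long. She revised the text twice."
for line in concordance(sample, "the text", width=10):
    print(line)
```

A bilingual concordancer would additionally look up the aligned segment for each hit, which presupposes a corpus that has already been aligned.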

Linguistic annotation is another advantage of working with corpora. It can be of two kinds: syntactic and semantic. Syntactic annotation defines the part of speech and/or other grammatical properties of words (singular, plural, etc.), while semantic annotation is used for distinguishing between different meanings of words. Both of these options allow for relatively narrow and specific searches in the corpora, suited to the needs of the translator. Apart from linguistic annotation, non-linguistic annotation (e.g. categorising parts of texts as title, subtitle, headline, etc.) can also be processed with these applications, which enables non-linguistic searches in the corpora, such as finding all headlines containing a given word (e.g. railroad).

Building corpora demands a lot of hard work and careful consideration, including assessing the availability of, and the copyright issues associated with, the texts to be included. Nevertheless, the effort pays off: translators gain access to information that makes more accurate and reliable text construction and terminology management possible.

The next chapter of the book, Chapter 4, deals with terminology management systems, which provide a reliable and flexible electronic bank for storing, consolidating and retrieving terminology-related information. These applications store terminological information in a concept-oriented way, allowing for terminology management in a number of translation directions. Such software not only facilitates consistency in the use of terminology but also enables users to define and format their own data fields, thereby increasing translators’ efficiency.

As for the most common services offered by terminology management systems, Bowker refers to wildcard search, fuzzy match, active term recognition, pre-translation and term extraction. Wildcard search allows the user to find words which consist of certain specified characters and certain unspecified ones, as in the search string char*, where the asterisk represents any character or string of characters. Fuzzy match search returns similar but not necessarily exact matches of the items requested by the user. Such near matching can retrieve morphological variants of words (different word forms) or words with similar spellings.
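Wildcard and fuzzy searches can be illustrated with Python's standard library. The sketch below is an assumption-laden stand-in rather than the algorithm of any commercial terminology management system: the term list, the function names and the 0.8 similarity cutoff are chosen purely for the example, and difflib's similarity ratio is only one of many possible measures.

```python
import difflib
import fnmatch

# Hypothetical term base entries, for illustration only.
TERM_BASE = ["character", "chart", "charge", "translate", "translation", "translator"]


def wildcard_search(pattern, terms):
    """Return the terms matching a shell-style pattern such as 'char*'."""
    return [t for t in terms if fnmatch.fnmatchcase(t, pattern)]


def fuzzy_search(query, terms, cutoff=0.8):
    """Return terms whose spelling is close to `query` (similarity >= cutoff)."""
    return difflib.get_close_matches(query, terms, n=5, cutoff=cutoff)


print(wildcard_search("char*", TERM_BASE))
print(fuzzy_search("translater", TERM_BASE))
```

Lowering the cutoff returns more, but less similar, candidates; with this particular measure, the misspelt query translater retrieves translate and translator but not translation at a cutoff of 0.8.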

Through active term recognition, terms in the source text are identified and compared against the contents of an already existing term base. If a match is found, the target language term is offered to the translator for inclusion in the target text. Pre-translation is similar to active term recognition in that it automatically identifies terms in the source text, but it goes further and inserts their equivalents in the target text, thereby creating a hybrid text made up of terms in the target language and the rest of the text in the source language. The translator then post-edits the hybrid text, thus producing the translation.
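The hybrid text produced by pre-translation can be pictured with a short sketch. It assumes a hypothetical English-French term base held in a dictionary with lower-case keys, and the function name pre_translate is invented for the example; real systems also deal with inflected forms and ambiguous terms, which this sketch ignores.

```python
import re

# Hypothetical English-French term base (lower-case keys), for illustration only.
TERM_BASE = {
    "translation memory": "mémoire de traduction",
    "term base": "base terminologique",
}


def pre_translate(source, term_base):
    """Replace known source terms with target terms, leaving the rest untouched."""
    # Try longer terms first so that multi-word terms win over shorter ones.
    terms = sorted(term_base, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, terms)) + r")\b", re.IGNORECASE
    )
    return pattern.sub(lambda m: term_base[m.group(0).lower()], source)


hybrid = pre_translate("The translation memory consults the term base.", TERM_BASE)
print(hybrid)
```

The output mixes target language terms with source language text, which is exactly the material the translator then post-edits.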

The last service of terminology management systems discussed in the book, term extraction or term recognition, is a monolingual tool that analyses the source text in order to identify possible technical terms, one or more words long, with a view to facilitating the creation of a term base by the translator. This service works on either a linguistic or a statistical basis: it looks either for certain linguistic constructions likely to be terms or for strings that, due to their repeated occurrence in the text, are likely to be technical terms. It must also be noted that terminology management systems are often used in computer networks to share data between the client and the translator or among several translators. The book outlines the advantages and drawbacks of the different technical solutions applied for sharing terminology management system data and offers possible solutions to the problems involved.
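The statistical approach to term extraction mentioned above can be sketched as counting repeated word sequences. This is a deliberately naive illustration: the function name extract_candidates, the stop list, the maximum length of three words and the minimum count of two are all assumptions, and sentence boundaries are ignored for brevity.

```python
import re
from collections import Counter

# Hypothetical stop list, for illustration only.
STOP_WORDS = {"the", "a", "of", "is", "in", "and", "to"}


def extract_candidates(text, max_len=3, min_count=2):
    """Return word sequences of up to `max_len` words occurring at least `min_count` times."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            gram = tokens[i:i + n]
            # Candidates may not begin or end with a grammatical word.
            if gram[0] in STOP_WORDS or gram[-1] in STOP_WORDS:
                continue
            counts[" ".join(gram)] += 1
    return {gram: c for gram, c in counts.items() if c >= min_count}


text = ("The translation memory stores segments. "
        "A translation memory is a database. The database is large.")
candidates = extract_candidates(text)
print(candidates)
```

The result is only a list of candidates: as with the other tools described here, it is the translator who decides which of them are genuine terms.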

Chapter 5 expounds on translation memories, which align source and target texts and store the aligned segments (usually source and target language sentences) in a database from which information is retrievable. A translation memory compares a currently translated source text segment against the translation memory database and, if a match is found, it offers a translation of the segment in question. It follows from this that translation memories allow translators to reuse previously aligned segments and to obtain information about such segments and/or parts of these segments.

In fact, in translation memories, segment matching works at different levels and provides exact matches, full matches, fuzzy matches, term matches and sub-segment matches as results. An exact or perfect match is generated if a segment stored in a translation memory is 100% identical with the currently translated segment. The translator can either reuse the target text offered by the translation memory without effecting any changes or can alter the target text according to the actual needs. In the case of a full match, a segment differs from the segment stored in the translation memory only in variable elements, also called placeables or named entities (i.e. numbers, dates, times, currencies, measurements and proper names). Similarly, a fuzzy match is produced by the translation memory if the segment in the translation memory is similar to the currently translated segment but there is no 100% similarity. In this case, the differences between the segments in question are highlighted on the screen. How similar the two segments have to be for the translation memory to give a match can be adjusted by the user.
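The distinction between exact, full and fuzzy matches, together with a user-adjustable similarity threshold, can be sketched as follows. The sketch rests on several assumptions: only numbers are treated as variable elements, similarity is measured with difflib's character-based ratio, and the function name classify_match and the default threshold of 0.75 are invented for the example. Commercial translation memories use their own matching algorithms.

```python
import difflib
import re

# Only numbers are treated as variable elements (placeables) in this sketch.
NUMBER = re.compile(r"\d+(?:[.,]\d+)*")


def classify_match(new_segment, stored_segment, threshold=0.75):
    """Classify a stored segment as an exact, full, fuzzy or non-match."""
    score = difflib.SequenceMatcher(None, new_segment, stored_segment).ratio()
    if new_segment == stored_segment:
        return "exact", score
    # Segments that differ only in their numbers count as a full match.
    if NUMBER.sub("#", new_segment) == NUMBER.sub("#", stored_segment):
        return "full", score
    if score >= threshold:
        return "fuzzy", score
    return "none", score


stored = "Press the red button 2 times."
for segment in (
    "Press the red button 2 times.",
    "Press the red button 5 times.",
    "Press the green button 2 times.",
    "Open the file menu.",
):
    print(segment, classify_match(segment, stored))
```

Raising the threshold yields fewer but closer fuzzy matches, while lowering it yields more suggestions that need heavier editing.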

Another type of segment matching, the term match, is the matching of a word or a phrase against the information stored in the term base of the translation memory. The term match enables translators to manage terminology more accurately. Lastly, in the case of a sub-segment match, a string of text is matched below the level of a segment: at the level of chunks (phrases, clauses, etc.). This newly introduced option may provide translators with useful suggestions for better quality target texts. It is very important to note, nonetheless, that the matching algorithms used to produce the above kinds of matches are not without their shortcomings, so human intervention is always necessary for producing the final version of the target language text.

Another factor to consider when using a translation memory is its size: too big a translation memory may offer several suggestions for the translation of segments but can slow down work. Too small a translation memory, by contrast, may not produce many matches but works considerably faster. Users therefore have to decide on the size of their translation memories in order to manage their translation commissions most effectively.

Translation memories are created by users. There are two ways of creating translation memories with the help of translation software: by way of interactive translation (building the translation memory while working on a translation assignment) and by post-translation alignment (a previously translated target text is aligned with its source text sentence by sentence, usually automatically, with minimal but usually necessary human intervention). Both of these ways of building a translation memory have advantages and drawbacks for the (future) user to consider. A translation memory created through interactive translation is of higher quality and more reliable but takes longer to produce, whereas post-translation alignment produces a lower quality and less reliable translation memory, which needs to be carefully reviewed by future users for accuracy.

Another factor to consider in connection with the potential use of translation memories is that not all texts are suitable for processing with them. Suitable texts are ones that contain internal repetitions (i.e. repetitions within the text to be translated), are revisions of previous texts (i.e. amended versions of earlier texts) or are updates (i.e. the client makes changes to the text while the translation is being done). In addition, because translation memories standardise target texts, their use can also be advantageous if the translator works repeatedly for the same client, who has express wishes concerning the terminology to be used and has stylistic preferences.

Finally, Chapter 5 discusses the benefits and drawbacks of using translation memories concerning the time invested, the quality of translation work produced, the electronic management of files and payment-related issues. The chapter also raises the issue of using other computer software in conjunction with translation memories and discusses the economic considerations translators face when purchasing a translation memory.

Chapter 6 briefly addresses the future of translation software and claims that translation is in fact increasingly seen as one part of the document-production process rather than being treated as a separate activity. This offers advantages for the translator and the client alike. One marked benefit could be the translator’s and the document producer’s closer cooperation on writing standards and guidelines concerning text production, which can greatly improve the quality of the resulting texts.

In her closing remarks, Bowker predicts that new types of translation commission (e.g. web page translation) will prompt new developments in translation technology (e.g. software capable of separating the text to be translated from the tags in HTML files). These and other current or future developments are likely to include the integration of several translation tools into a single software package, and such software will most probably support more languages and file formats than in the past. It should not be forgotten, nonetheless, that users (i.e. practising translators and translator trainees) must be properly trained to be able to use computer aided translation tools and that software developers should ideally work hand in hand with translator training institutions so that newer releases of translation software offer improved services that can meet the challenges posed by contemporary translation commissions. Translation software development and the pedagogical issues related to it constitute a new research direction, which, in our understanding, has not yet been given the status it deserves in Translation Studies.

In conclusion, the volume described above provides an extensive, easy-to-follow and practical introduction to present-day computer technology used in translation. The book is a real eye-opener for both beginner and expert users of computer aided translation technology and translation educators alike. It comes with a complete list of further readings at the end of each chapter, a glossary, a list of commercially available computer aided translation tools and a list of references.