Working with Corpora in Translation Technology Teaching: Enhancing Aspects of Course Design

Mark Shuttleworth*

*Corresponding author for this work

Research output: Chapter in book/report/conference proceeding · Chapter · peer-review


This chapter discusses two linked areas that are relevant to the translation technology curriculum. Firstly, given the absence of a suitable cost-effective industry-standard terminology extraction utility, and the relative under-representation of this feature in most CAT tools, the Sketch Engine lexical analysis suite of tools has provided our students with a reasonably successful application to use for this purpose. Sketch Engine permits users to create monolingual lists of single- and multi-word candidate terminological items on the basis of ad hoc corpora created from texts that need to be translated. These lists provide students with a solid starting point from which to create detailed bilingual terminology lists, and a bilingual term extraction function is also provided. Secondly, aware that many CAT tool functions work best when large translation memories (TMs) are available, but also mindful of the fact that students generally have to start with TMs that are completely empty, we use sources such as the OPUS website to download large corpora, from which very sizable translation memories can be created quickly. In this way, students are able to obtain a much more positive impression of a CAT tool’s concordance tool, for example, and also potentially gain confidence in the tool’s capacity to provide useful fuzzy matches in general.
Original language: English
Title of host publication: Corpora and Translation Education
Subtitle of host publication: Advances and Challenges
Editors: Jun Pan, Sara Laviosa
Place of Publication: Singapore
Number of pages: 25
ISBN (Electronic): 9789819965892
ISBN (Print): 9789819965885, 9789819965915
Publication status: Published - 22 Dec 2023

Publication series

Name: New Frontiers in Translation Studies
ISSN (Print): 2197-8689
ISSN (Electronic): 2197-8697

User-Defined Keywords

  • Translation technology teaching
  • Term extraction
  • Parallel data acquisition
  • Memsource
  • Phrase
  • Sketch Engine
  • OPUS open source parallel corpus
  • AI

