The basic structure of our system for individual online translators is illustrated in the following figure:
QRedit is an integrated translation editor environment. It is implemented in Java and runs on Tomcat, and translators access the system through a Web browser. BEYTrans (not illustrated here), an environment for community-oriented translation projects, is implemented in XWiki. Functional modules such as QRselect (recycling of relevant existing translation document pairs), Maitake (compilation of proper-name dictionaries), QRcep (look-up of terms in context from Web documents) and QRidiom (flexible automatic look-up of idioms with variations) are called from the QRedit and BEYTrans environments. In the following, however, we show the images of these modules independently, just for the sake of illustration. QRidiom is illustrated through the QRedit environment.
QRedit has a simple interface that consists of a source document area and a target document editing area. Buttons and menu bars have been omitted in order to maximise the size of the working area. This design is similar to the TransType system developed in Canada, and quite different from most of the commercial translation aid systems.
Users either specify the URL of the document that is to be translated or copy and paste the document into the source document area. They then press the "GO" button, which activates all the reference look-up functions.
Some of QRedit's characteristics are as follows:
- Keyboard control always remains in the target document editing area, so that the translator can concentrate on making translations.
- The source document area and the target document area scroll synchronously, paragraph by paragraph.
- All the look-up functions are activated by pointing and clicking on source language unit tokens in the source document area.
- Reference information can be pasted into the target document editing area with a single mouse click.
- A stratified notification interface alerts translators to the existence of reference information that they can look up.
Most of the functions are encapsulated in the small pop-up window that appears when the user clicks the relevant source language unit tokens. This is to avoid interrupting translators' work rhythm when they are creating target documents. For now, we have not integrated management functions such as making memos or personal terminological lists, which are provided by commercial translation aid systems.
Currently several translators are using and evaluating QRedit. They are using the system not only to translate online documents, but also to translate full-length books which are to be published by commercial publishers.
The functional requirements for a system that aids community-oriented translation projects are different from those for a system that aids individual translators. What is of utmost importance in the former case is to maintain the consistency of the overall translation produced by a translation community, with as little expense of energy by individual translators as possible. It is thus necessary to find the point of convergence between the benefit for individual translators and the benefit for the translation community as a whole. Because of this, the BEYTrans interface is more complex than the QRedit interface, and is similar in overall appearance to most commercial translation aid systems.
QRselect collects and recycles "online translation archives," sets of translation document pairs that are relevant to the documents that translators are working on. As such, it belongs to the area commonly called "translation memory" in the field of NLP. In the Shiitake project, however, we conceptually divide what is technically called "translation memory" into two separate functions. The first involves recycling relevant translation document pairs (the online translation archive) produced by translators (who belong to the potential community of online translators). QRselect realises this function. The second, which we refer to as "corpora exploration," involves collecting translation pairs from a wider range of data. This function is realised through QRcep, Maitake, and Eryngii.
QRselect constructs a personalised online translation archive, based on a set of keywords and/or URLs specified by individual translators. Through the QRselect system, a number of translators partially share the relevant translation documents as a reference source, through which the development of a functional networked community of related translators is promoted.
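The idea of a personalised archive selected by keywords and/or URLs can be sketched roughly as follows. This is only an illustration under our own assumptions, not the QRselect implementation: the `DocumentPair` record, the `select_archive` function, and the matching rules (case-insensitive substring match for keywords, prefix match for URLs) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DocumentPair:
    """Hypothetical record for one source document and its translation."""
    source_url: str
    source_text: str
    target_text: str


def select_archive(pairs, keywords=(), url_prefixes=()):
    """Keep the pairs that match any keyword or any URL prefix.

    Assumed rules: a keyword matches if it occurs anywhere in the source
    text (case-insensitive); a URL matches if the source URL starts with
    one of the given prefixes.
    """
    wanted = [k.lower() for k in keywords]
    selected = []
    for pair in pairs:
        text = pair.source_text.lower()
        by_keyword = any(k in text for k in wanted)
        by_url = any(pair.source_url.startswith(p) for p in url_prefixes)
        if by_keyword or by_url:
            selected.append(pair)
    return selected
```

In this sketch, two translators who specify overlapping keywords or URLs would end up with overlapping archives, which is one simple way to picture the partial sharing of reference documents described above.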
In Japan, maitake (Grifola frondosa) and eryngii (Pleurotus eryngii) have recently become commonly available at greengrocers. They're delicious and certainly worth discussing, but what we're concerned with here are not mushrooms but the Maitake and Eryngii systems. Maitake automatically compiles bilingual proper-name dictionaries and Eryngii automatically augments and compiles bilingual technical-term dictionaries, both using multiple information sources including the Web. They constitute the core part of the augmentation and enhancement of basic reference sources. Both are called up from within QRedit. The figure shows the interface of the Maitake system.
QRcep ("cep" is the English and French name of the mushroom Boletus edulis) facilitates the look-up of terms in context from the Web and shows the actual use of terms in context. This function complements the information given in most technical-term dictionaries, which give only fixed translations and definitions. The figure illustrates the first prototype. The development of QRcep forms part of our research into the enhancement of look-up functions. From the point of view of users' information-seeking behaviour, QRcep is for exploring corpora, rather than dictionaries or archives.
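Showing a term together with its surrounding context is commonly done with a keyword-in-context (KWIC) listing. The following is a minimal sketch of that general technique, assuming the documents have already been fetched as plain-text strings. The function name `kwic` and the `width` parameter are our own illustrative choices, and the sketch does not describe how QRcep itself retrieves or ranks Web documents.

```python
import re


def kwic(term, documents, width=30):
    """Return (left context, matched term, right context) for every hit.

    The term is matched as a whole word or phrase, ignoring case, and
    `width` is the number of characters of context kept on each side.
    """
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    lines = []
    for doc in documents:
        for m in pattern.finditer(doc):
            left = doc[max(0, m.start() - width):m.start()]
            right = doc[m.end():m.end() + width]
            lines.append((left, m.group(0), right))
    return lines
```

A translator looking up "translation memory" would then see each occurrence with a few words on either side, which is the kind of usage evidence that a fixed dictionary definition does not provide.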
The range of expressions that translators must deal with on a daily basis is much wider than what is described in linguistics. For instance, in linguistics it is commonly said that the idiom "kick the bucket" cannot be passivised into "the bucket is kicked". However, translators dealing with real-life texts will soon find that "the bucket is kicked" is indeed used by authors, as are "the breeze was shot" and "they went exact halves," among many others.
It is therefore necessary to match the standard form of idiom entries in dictionaries with textual tokens with variations. For instance, the expression "by hook, by genius, by hard work, or by crook" in a text should be matched by the dictionary entry "by hook or by crook". QRidiom realises this function. The figure illustrates the function of QRidiom integrated into the QRedit environment, in which "with his big fat tongue in his big fat cheek" is matched with the dictionary entry "with one's tongue in one's cheek".
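One simple way to picture this kind of flexible matching is an in-order match that tolerates a limited number of inserted words and treats "one's" as a slot for any possessive. The sketch below works under those assumptions only; it is not the QRidiom algorithm. The names `match_idiom`, `max_gap` and `POSSESSIVES` are hypothetical, the search is greedy (it always takes the nearest next word), and there is no lemmatisation or reordering, so passivised forms such as "the bucket is kicked" are out of its reach.

```python
import re

# Assumed set of words that may fill a "one's" slot in a dictionary entry.
POSSESSIVES = {"my", "your", "his", "her", "its", "our", "their", "one's"}


def tokenize(text):
    """Lower-case word tokens, keeping internal apostrophes (e.g. one's)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def match_idiom(entry, text, max_gap=6):
    """Return the (start, end) token span of the first match, or None.

    Entry words must appear in order, with at most `max_gap` other
    tokens between two consecutive entry words.
    """
    pattern = tokenize(entry)
    tokens = tokenize(text)
    if not pattern:
        return None

    def fits(p, t):
        return t in POSSESSIVES if p == "one's" else p == t

    for start in range(len(tokens)):
        if not fits(pattern[0], tokens[start]):
            continue
        pos = start
        for p in pattern[1:]:
            # Greedily take the nearest following token that fits.
            window = range(pos + 1, min(pos + 2 + max_gap, len(tokens)))
            nxt = next((i for i in window if fits(p, tokens[i])), None)
            if nxt is None:
                break
            pos = nxt
        else:
            return start, pos + 1
    return None
```

With the default gap, this sketch matches both examples given above: "by hook or by crook" against "by hook, by genius, by hard work, or by crook", and "with one's tongue in one's cheek" against "with his big fat tongue in his big fat cheek".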
This picture illustrates the traditional system of growing shiitake using "hodagi (natural logs)". Shiitake mushrooms grown on hodagi are said to taste better than those grown in sawdust. But some people in fact prefer the latter, precisely because their flavour is less intense.
The project leader was presented with this hodagi after winning a "mushroom quiz" at a mushroom foray and discussion session hosted by The Saitama Forest Supporters Club on 30 September, 2007.