Shiitake Project

"Translation for all" has become publicly available!

The first term of the Shiitake project ended on March 31, 2009. The second term runs for four years from April 1, 2009. This site will no longer be updated after April 9, 2009.

We are pleased to announce that "Translation for all", an integrated environment for translation aid and the publication of translated documents, is now publicly available (sorry, the interface is only in Japanese for now; an English interface will become available soon). "Translation for all" is a joint project of the Language Translation Group of the National Institute of Information and Communications Technology and the Library and Information Science Laboratory of the University of Tokyo, in cooperation with the publisher Sanseido, one of the most prestigious dictionary publishers in Japan.

* * *

Developing a system to aid online translators

In reaction to the ongoing degradation of the established mass media, alternative information dissemination via the Internet has become increasingly important. The translation of news and reports by volunteers plays a vital role in this alternative information flow. To aid these volunteer translators specifically, we initiated a four-year project (the "Shiitake" project) in 2005.

In the domain of natural language processing, people tend to regard language processing as computation. Human language experts such as translators and editors, however, have a strong understanding of their language activities as a social process, in which they refer back to and build upon what others have previously said and written.

* * *

The Shiitake project aims to make it easier for translators to access and reuse this historically accumulated stock of language. It does so by developing mechanisms for collecting scattered texts from the Internet along with their historical logs, and by enhancing the content and look-up functions of electronic reference sources. Though at first glance the reuse of existing language expressions resembles "example-based machine translation" (EBMT), it differs from EBMT in that what we provide is references to relevant language expressions, not structurally similar examples.

This system will ultimately enable individual translators to form a network, without any additional or conscious effort, in which translators working in similar or related domains can collectively accumulate translation data and share information relevant to their translations. This, we believe, will enhance the activity of translators and encourage potential translators to join in. As such, the Shiitake project is ultimately a social project.

We have also started developing a module that specifically aims to help inexperienced translators.

Many excellent computer-aided translation (CAT) systems have been developed so far, and are used by professional translators and translation companies. There are three major differences between these CAT systems and our project/system: (1) the Shiitake project has modules that construct reference sources from the Web; (2) the Shiitake project specifically aims at aiding online volunteer translators; and (3) the interface (an integrated translation editor environment) in the version for personalised use is very simple (in this respect our interface is similar to the TransType system developed by the University of Montreal team, though the functional specifications are completely different).

* * *

With regard to the human process of translation (not the boring cognitive process but the exciting social process of providing social products that contribute to the accumulation of information), the project leader has received a great deal of advice from the administrators of the news translation site Tea not war, volunteer translators involved in the Global Voices project, and other volunteer translators.

With regard to the engineering aspect of the project, the project leader carries out research in cooperation with Nagoya University, Tsukuba University, Okayama University, Hokkaido University, the National Institute of Informatics, LINA-CNRS, the University of Nantes (France) and CNRS-GETA, Joseph Fourier University (France).

Basic modules and functions have already been implemented, and a limited number of test users (monitors) are currently testing the system.


Kyo Kageura
Library and Information Science Laboratory
Graduate School of Education
University of Tokyo
7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-0033, Japan
Tel & fax: +81-3-5841-3973
E-mail: kyo [atmark goes here] p . u - tokyo . ac . jp
