TMan - Subsentence-level Replacement, Multilingual Document Generation and Data Conversion
Abstract
TMan is a limited-distribution application, and the place it occupies in the activities of the Commission's Translation Service (and other departments) is less clear-cut than is the case for other tools. Indeed, if we were starting from scratch, we would probably not choose to invent an application incorporating such a miscellany of functions, and steps are now being taken to remedy this anomaly by migrating the primary mass-use function of TMan (subsentence-level replacement) to a more rational and integrated client-server environment. Nevertheless, the functions covered by TMan are of interest in some respects precisely because they were developed in an ad hoc fashion to meet imperative user needs not covered by other applications or projects, or to take over such functions from applications rendered obsolete by technical developments. TMan has often been the only means of providing a more-or-less immediate solution to such problems, since it is the only major language support application in the Service to be developed entirely in-house using the principles of Rapid Application Development (albeit in a rather haphazard way). To go into the application's historical development (starting as far back as the late '80s) would, however, be of little more than academic interest, so this article will simply review the functions now performed by TMan, as well as likely developments. Given the somewhat eclectic nature of the application, this presentation will necessarily be anything but highly structured.
Similar resources
A "Pivot" XML-Based Architecture for Multilingual, Multiversion Documents: Parallel Monolingual Documents Aligned Through a Central Correspondence Descriptor and Possible Use of UNL
We propose a structure for multilingual, multiversion documents, built on the model of the web-oriented, cooperative lexical multilingual data base PAPILLON: a document is represented by a collection of monolingual XML "volumes" interlinked by a central volume of "interlingual links". Here, the links relate subdocuments (XML trees) corresponding to each other in monolingual "volumes". We are de...
Probabilistic topic modeling in multilingual settings: An overview of its methodology and applications
Probabilistic topic models are unsupervised generative models which model document content as a two-step generation process, that is, documents are observed as mixtures of latent concepts or topics, while topics are probability distributions over vocabulary words. Recently, a significant research effort has been invested into transferring the probabilistic topic modeling concept from monolingua...
Key Technologies for Multilingual Information Processing on WWW
This paper discusses key technologies required to realize a document database which is the multilingual collection of documents typically seen on WWW, and to realize a system which supports easy access to such multilingual information. Specifically, we focus on such techniques as 1) crosslanguage information retrieval (CLIR), which supports conversion of cultural factors such as units, era name...
Probabilistic Topic Modeling in Multilingual Settings: A Short Overview of Its Methodology and Applications
Probabilistic topic models are unsupervised generative models that model document content as a two-step generation process, i.e., documents are observed as mixtures of latent topics, while topics are probability distributions over vocabulary words. Recently, a significant research effort has been invested into transferring the probabilistic topic modeling concept from monolingual to multilingua...
Gross-grained RST through XML Metadata for Multilingual Document Generation
We present an RST-based discourse annotation proposal used in the construction of a trial multilingual XML-tagged corpus of teaching material in Basque, English and Spanish. The corpus feeds an experimental multilingual document generation system for the web. The main contributions of this paper are an implementation of RST through XML metadata and the adoption of gross-grained RST to avoid non...