Inducing translations from officially published materials in Canadian government websites
Abstract
Bitexts collected from web-based materials that are officially published on government websites can be used for a variety of purposes in language analysis and natural language processing. Mining officially published web pages can thus be an invaluable undertaking for translators in government departments who are producing the translations, and for machine translation researchers who are studying how those translations are produced. In this paper, we present the StatCan Daily Translation Extraction System (SDTES) and demonstrate how it is used to induce translations from officially published bilingual materials from government websites in Canada. New evaluation results show that SDTES is a very effective system for identifying and extracting sentences that are translation pairs from most of the federal government web pages which are currently under the CLF2 (Common Look and Feel for the Internet 2.0) framework.
Similar resources
Searching for Poor Quality Machine Translated Text: Learning the Difference between Human Writing and Machine Translations
As machine translation (MT) tools have become mainstream, machine translated text has increasingly appeared on multilingual websites. Trustworthy multilingual websites are used as training corpora for statistical machine translation tools; large amounts of MT text in training data may make such products less effective. We performed three experiments to determine whether a support vector machine...
Teen suicide information on the internet: a systematic analysis of quality.
OBJECTIVE To synthesize the literature on youth suicide risk factors (RFs) and prevention strategies (PSs); evaluate quality of information regarding youth suicide RFs and PSs found on selected Canadian websites; determine if website source was related to evidence-based rating (EBR); and determine the association of website quality indicators with EBR. METHODS Five systematic reviews of youth...
Evaluating Government Website Accessibility: a Comparative Study
Even though efforts have been made to reduce the informational gap resulting from web inaccessibility, websites from virtually every type of organization have major accessibility problems. This study used an automated software tool to evaluate the accessibility of four Korean government and four U.S. government websites. Results were compared between the Korean and the U.S. government websites ...
Evaluating the Healthcare Services on the Websites of the Country's Medical Sciences Universities in Line with E-Government
Background and Aim: Due to the role of websites in delivering e-services, this study aims to benchmark rendering healthcare services at medical universities' websites based on Chandler and Emanuel’s four-stage e-government maturity model. Materials and Methods: This is a descriptive, cross-sectional study which was conducted using content analysis and benchmarking to evaluate the delivery...
Recent Canadian Experience in Machine Translation
The experience to be discussed is that of the Translation Bureau of the Government of Canada, which provides translation services to all federal departments and agencies, from Parliament to the National Film Board. Canada is officially bilingual. Under the Official Languages Act, this means that all services offered to the public by the federal government must be provided in both official langu...