Term Selection, Term Translation, Term Weighting, Term Matching (query-language and document-language processing)
Abstract
This paper presents results for the Japanese/English cross-language information retrieval task on the NACSIS Test Collection. Two automatic dictionary-based query translation techniques were tried with four variants of the queries. The results indicate that longer queries outperform the required description-only queries and that using only the first translation in the EDICT dictionary performs comparably to using every translation. Japanese term segmentation posed no unusual problems, in sharp contrast to results previously obtained for cross-language retrieval between Chinese and English.
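The two translation strategies compared in the abstract (first translation only versus every translation) can be illustrated with a minimal sketch. It assumes the query has already been segmented into terms, and it uses a tiny hand-made dictionary whose entries are illustrative only and are not taken from EDICT. The names `TOY_DICT` and `translate_query` are hypothetical, and the sketch is not the authors' implementation.

```python
from typing import Dict, List

# Toy bilingual dictionary (hypothetical entries for illustration only,
# not copied from EDICT). Each source term maps to an ordered list of
# candidate translations.
TOY_DICT: Dict[str, List[str]] = {
    "情報": ["information", "news"],
    "検索": ["retrieval", "search"],
}


def translate_query(
    terms: List[str],
    dictionary: Dict[str, List[str]],
    strategy: str = "first",
) -> List[str]:
    """Translate pre-segmented query terms with a bilingual dictionary.

    strategy="first" keeps only the first listed translation of each term;
    strategy="all" keeps every listed translation.
    Terms missing from the dictionary are dropped (an assumption of this
    sketch; a real system might keep or transliterate them instead).
    """
    if strategy not in ("first", "all"):
        raise ValueError(f"unknown strategy: {strategy!r}")

    translated: List[str] = []
    for term in terms:
        candidates = dictionary.get(term, [])
        if not candidates:
            continue  # no dictionary entry for this term
        if strategy == "first":
            translated.append(candidates[0])
        else:
            translated.extend(candidates)
    return translated


if __name__ == "__main__":
    query = ["情報", "検索"]
    print(translate_query(query, TOY_DICT, "first"))
    print(translate_query(query, TOY_DICT, "all"))
```

The "all" strategy produces a longer target-language query that may include less relevant senses, which is one plausible reason the two strategies could end up performing comparably.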
Similar resources
Feature Selection for Natural Language Call Routing Based on Self-Adaptive Genetic Algorithm
This paper considers the text classification problem for natural language call routing. Seven different term weighting methods were applied. Feature selection based on a self-adaptive GA is considered as the dimensionality reduction method. k-NN, linear SVM and ANN were used as classification algorithms. The tasks of the research are the following: perform research of text classification...
Term Weighting in Short Documents for Document Categorization, Keyword Extraction and Query Expansion
This thesis focuses on term weighting in short documents. I propose weighting approaches for assessing the importance of terms for three tasks: (1) document categorization, which aims to classify documents such as tweets into categories, (2) keyword extraction, which aims to identify and extract the most important words of a document, and (3) keyword association modeling, which aims to identify...
Investigation of Term Weighting Schemes in Classification of Imbalanced Texts
The class imbalance problem plays a critical role in the use of machine learning methods for text classification, since feature selection methods, like machine learning methods, expect a homogeneous distribution. This study investigates two different kinds of feature selection metrics (one-sided and two-sided) as a global component of term weighting schemes (called tffs) in scenarios where...
Influence of Different Culture Selection Methods on Polyhydroxyalkanoate Production at Short-term Biomass Enrichment
In this study, the potential of four different culture selection methods to accumulate PHA-producing bacteria in mixed activated sludge under short-term enrichment (STE) was compared, and the most efficient culture selection method was introduced. Specifically, the PHA-producing microbial community was first enriched in a sequencing batch bioreactor (SBR) with four different selection methods i...