Retrofitting Word Vectors of MeSH Terms to Improve Semantic Similarity Measures

نویسندگان

  • Zhiguo Yu
  • Trevor Cohen
  • Byron C. Wallace
  • Elmer V. Bernstam
  • Todd R. Johnson
چکیده

Estimation of the semantic relatedness between biomedical concepts has utility for many informatics applications. Automated methods fall into two broad categories: methods based on distributional statistics drawn from text corpora, and methods based on the structure of existing knowledge resources. In the former case, taxonomic structure is disregarded. In the latter, semantically relevant empirical information is not considered. In this paper, we present a method that retrofits the context vector representation of MeSH terms by using additional linkage information from UMLS/MeSH hierarchy such that linked concepts have similar vector representations. We evaluated the method relative to previously published physician and coder’s ratings on sets of MeSH terms. Our experimental results demonstrate that the retrofitted word vector measures obtain a higher correlation with physician judgments. The results also demonstrate a clear improvement on the correlation with experts’ ratings from the retrofitted vector representation in comparison to the vector representation without retrofitting.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Enhancing MEDLINE document clustering by incorporating MeSH semantic similarity

MOTIVATION Clustering MEDLINE documents is usually conducted by the vector space model, which computes the content similarity between two documents by basically using the inner-product of their word vectors. Recently, the semantic information of MeSH (Medical Subject Headings) thesaurus is being applied to clustering MEDLINE documents by mapping documents into MeSH concept vectors to be cluster...

متن کامل

Multilevel Measures of Document Similarity

Many applications such as document summarization, passage retrieval and question answering require a detailed analysis of semantic relations between terms within and across documents and sentences. Often one has a number of sentences or paragraphs and has to choose the candidate with the highest level of relevance for the topic or question. An additional requirement may be that the information ...

متن کامل

Two Similarity Metrics for Medical Subject Headings (MeSH):

In the present paper, we have created and characterized several similarity metrics for relating any two Medical Subject Headings (MeSH terms) to each other. The article-based metric measures the tendency of two MeSH terms to appear in the MEDLINE record of the same article. The author-based metric measures the tendency of two MeSH terms to appear in the body of articles written by the same indi...

متن کامل

Presentation of an efficient automatic short answer grading model based on combination of pseudo relevance feedback and semantic relatedness measures

Automatic short answer grading (ASAG) is the automated process of assessing answers based on natural language using computation methods and machine learning algorithms. Development of large-scale smart education systems on one hand and the importance of assessment as a key factor in the learning process and its confronted challenges, on the other hand, have significantly increased the need for ...

متن کامل

Exploring the Validity of Corpus-derived Measures of Semantic Similarity

Lexical co-occurrence counts from large corpora have been used to construct highdimensional vector-space models of language. In this type of model words are represented as vectors (or points) in a hyperspace, and distances between word vectors are generally considered to reflect semantic similarity. Two issues must be addressed if a vector-space model is to be used as a 'semantic' measuring dev...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016