A Topic-Oriented Syntactic Component Extraction Model in Social Media
Abstract
Topic-oriented understanding extracts information from various language instances and, through statistical analysis, reflects the characteristics or trends of the semantic information related to a topic. Syntactic analysis and modeling are the basis of such work. Traditional syntactic formalization approaches widely used in natural language understanding cannot simply be applied to text modeling in the context of topic-oriented understanding. In this paper, we review the information extraction mode and summarize its inherent relationship with the “Subject-Predicate” syntactic structure in Aryan languages. We then propose a syntactic element extraction model based on the “topic-description” structure, which contains six kinds of core elements and satisfies the requirements of topic-oriented understanding. This paper also describes the composition of the model, the theoretical framework of the understanding process, the method for extracting syntactic components, and a prototype system for generating syntax diagrams. The proposed model is evaluated on the Reuters-21578 and SocialCom2009 data sets. The results show that the recall and precision of syntactic component extraction reach up to 93.9% and 88%, respectively, which further supports the feasibility of generating syntactic components from word dependencies.
Similar resources
Won’t somebody please think of the children? Improving Topic Model Clustering of Newspaper Comments for Summarisation
Online newspaper articles can accumulate comments at volumes that prevent close reading. Summarisation of the comments allows interaction at a higher level and can lead to an understanding of the overall discussion. Comment summarisation requires topic clustering, comment ranking and extraction. Clustering must be robust as the subsequent extraction relies on a good set of clusters. Comment dat...
A Model for Detecting of Persian Rumors based on the Analysis of Contextual Features in the Content of Social Networks
A rumor is a collective attempt to interpret a vague but attractive situation by using the power of words. Therefore, identifying the language of rumors can be helpful in detecting them. Previous research has focused more on the contextual information in reply tweets and less on the content features of the original rumor to address the rumor detection problem. Most of the studies have been in...
Towards SoMEST – Combining Social Media Monitoring with Event Extraction and Timeline Analysis
We report on the development of a social media monitoring tool based on the novel Social Media Event Sentiment Timeline (SoMEST) model. The novelty of our model is that it combines opinion mining techniques with a timeline-based event analysis method and an information and event extraction tool. While Event Timeline Analysis (ETA) is an existing method utilized in analyzing the external environ...
Making Social Media Analysis more efficient through Taxonomy Supported Concept Suggestion
Social Media sites provide consumers the ability to publicly create and shape the opinion about products, services and brands. Hence, timely understanding of content created in social media has become a priority for marketing departments, leading to the appearance of social media analysis applications. This article describes an approach to help users of IBM Cognos Consumer Insight, IBM’s social...
A Qualitative Analysis of the Causes of Families' Tendency Toward Virtual Media
Introduction: In recent decades, society and the institution of the family in Iran have undergone dramatic changes. These changes stem primarily from a variety of components, the most important of which in recent years has been communication technology. Knowing the causes of these changes in family structure guides us toward a dynamic community and family. This study exa...