Improving the precision of the keyword-matching pornographic text filtering method using a hybrid model.
Abstract
With the flood of pornographic information on the Internet, keeping people away from such offensive material has become one of the most important research areas in network information security. Applications that block or filter this information are already in use. The approaches in those systems fall roughly into two kinds: metadata-based and content-based. With the development of distributed technologies, content-based filtering will play an increasingly important role in filtering systems. Keyword matching is a content-based method widely used in harmful-text filtering. Experiments evaluating the recall and precision of the method showed that its precision is unsatisfactory, although its recall is rather high. Based on these results, a new pornographic text filtering model based on reconfirming is put forward. Experiments showed that the model is practical, loses little recall relative to keyword matching alone, and achieves higher precision.
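The two-stage idea in the abstract (a high-recall keyword match followed by a reconfirming step that discards false alarms) can be sketched roughly as follows. The abstract does not say how reconfirmation works, so the keyword-density rule used here is purely an assumption for illustration, as are the placeholder `KEYWORDS` set, the `min_density` threshold, and every function name. The helper `precision_recall` uses the standard definitions: precision = TP / (TP + FP), recall = TP / (TP + FN).

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical placeholder lexicon; a real filter would use a curated keyword list.
KEYWORDS = {"badword", "worseword"}


def tokenize(text: str) -> List[str]:
    """Split on whitespace, strip common punctuation, and lowercase."""
    return [tok.strip(".,;:!?\"'()").lower() for tok in text.split()]


def keyword_match(text: str) -> bool:
    """Stage 1: flag the text if any keyword occurs at all."""
    return any(tok in KEYWORDS for tok in tokenize(text))


def reconfirm(text: str, min_density: float = 0.1) -> bool:
    """Stage 2 (assumed rule, not the paper's): require a minimum keyword density."""
    tokens = tokenize(text)
    if not tokens:
        return False
    hits = sum(tok in KEYWORDS for tok in tokens)
    return hits / len(tokens) >= min_density


def hybrid_filter(text: str) -> bool:
    """Block only texts that pass keyword matching and are then reconfirmed."""
    return keyword_match(text) and reconfirm(text)


def precision_recall(
    predict: Callable[[str], bool],
    samples: Iterable[Tuple[str, bool]],
) -> Tuple[float, float]:
    """Compute (precision, recall) of `predict` over labeled (text, is_harmful) pairs."""
    tp = fp = fn = 0
    for text, is_harmful in samples:
        flagged = predict(text)
        if flagged and is_harmful:
            tp += 1
        elif flagged and not is_harmful:
            fp += 1
        elif not flagged and is_harmful:
            fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```

Calling `precision_recall(keyword_match, samples)` and `precision_recall(hybrid_filter, samples)` on the same labeled set gives a side-by-side comparison of the two filters. Any real second stage (for example a trained classifier) could be swapped in for `reconfirm` without changing the rest of the sketch.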
Similar resources
Improving Precision of Keywords Extracted From Persian Text Using Word2Vec Algorithm
Keywords can present the main concepts of the text without human intervention according to the model. Keywords are important vocabulary words that describe the text and play a very important role in accurate and fast understanding of the content. The purpose of extracting keywords is to identify the subject of the text and the main content of the text in the shortest time. Keyword extraction pl...
A New Method for Improving Computational Cost of Open Information Extraction Systems Using Log-Linear Model
Information extraction (IE) is a process of automatically providing a structured representation from an unstructured or semi-structured text. It is a long-standing challenge in natural language processing (NLP) which has been intensified by the increased volume of information and heterogeneity, and non-structured form of it. One of the core information extraction tasks is relation extraction wh...
Hybrid Ontology for Semantic Information Retrieval Model Using Keyword Matching Indexing System
Ontology is the process of growth and elucidation of concepts of an information domain being common for a group of users. Establishing ontology into information retrieval is a normal method to develop searching effects of relevant information users require. Keywords matching process with historical or information domain is significant in recent calculations for assisting the best match for spec...
Pixel-Based Skin Detection for Pornography Filtering
A robust skin detector is the primary need of many fields of computer vision, including face detection, gesture recognition, and pornography filtering. Less than 10 years ago, the first paper on automatic pornography filtering was published. Since then, different researchers claim different color spaces to be the best choice for skin detection in pornography filtering. Unfortunately, no com...
A Block-Grouping Method for Image Denoising by Block Matching and 3-D Transform Filtering
Image denoising by block matching and three-dimensional transform filtering (BM3D) is a two-step state-of-the-art algorithm that uses the redundancy of similar blocks in a noisy image for removing noise. Similar blocks, which can have some overlap, are found by a block matching method and grouped to make 3-D blocks for 3-D transform filtering. In this paper we propose a new block grouping algorithm in th...
Journal title:
- Journal of Zhejiang University. Science
Volume 5, Issue 9
Pages -
Publication date 2004