Improving the precision of the keyword-matching pornographic text filtering method using a hybrid model.

نویسندگان

  • Gui-yang Su
  • Jian-hua Li
  • Ying-hua Ma
  • Sheng-hong Li
چکیده

With the flooding of pornographic information on the Internet, how to keep people away from that offensive information is becoming one of the most important research areas in network information security. Some applications which can block or filter such information are used. Approaches in those systems can be roughly classified into two kinds: metadata based and content based. With the development of distributed technologies, content based filtering technologies will play a more and more important role in filtering systems. Keyword matching is a content based method used widely in harmful text filtering. Experiments to evaluate the recall and precision of the method showed that the precision of the method is not satisfactory, though the recall of the method is rather high. According to the results, a new pornographic text filtering model based on reconfirming is put forward. Experiments showed that the model is practical, has less loss of recall than the single keyword matching method, and has higher precision.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Improving Precision of Keywords Extracted From Persian Text Using Word2Vec Algorithm

Keywords can present the main concepts of the text without human intervention according to the model. Keywords are important vocabulary words that describe the text and play a very important role in accurate and fast understanding of the content. The purpose of extracting keywords is to identify the subject of the text and the main content of the text in the shortest time. Keyword extraction pl...

متن کامل

A New Method for Improving Computational Cost of Open Information Extraction Systems Using Log-Linear Model

Information extraction (IE) is a process of automatically providing a structured representation from an unstructured or semi-structured text. It is a long-standing challenge in natural language processing (NLP) which has been intensified by the increased volume of information and heterogeneity, and non-structured form of it. One of the core information extraction tasks is relation extraction wh...

متن کامل

Hybrid Ontology for Semantic Information Retrieval Model Using Keyword Matching Indexing System

Ontology is the process of growth and elucidation of concepts of an information domain being common for a group of users. Establishing ontology into information retrieval is a normal method to develop searching effects of relevant information users require. Keywords matching process with historical or information domain is significant in recent calculations for assisting the best match for spec...

متن کامل

Pixel-Based Skin Detection for Pornography Filtering

A robust skin detector is the primary need of many fields of computer vision, including face detection, gesture recognition, and pornography filtering. Less than 10 years ago, the first paper on automatic pornography filtering was published. Since then, different researchers claim different color spaces to be the best choice for skin detection in pornography filtering. Unfortunately, no com...

متن کامل

A Block-Grouping Method for Image Denoising by Block Matching and 3-D Transform Filtering

Image denoising by block matching and threedimensionaltransform filtering (BM3D) is a two steps state-ofthe-art algorithm that uses the redundancy of similar blocks innoisy image for removing noise. Similar blocks which can havesome overlap are found by a block matching method and groupedto make 3-D blocks for 3-D transform filtering. In this paper wepropose a new block grouping algorithm in th...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Journal of Zhejiang University. Science

دوره 5 9  شماره 

صفحات  -

تاریخ انتشار 2004