Using an Automatic Weighted Keywords Dictionary for Intelligent Web Content Filtering

Authors

Najibeh Farzi Veijouyeh

Jamshid Bagherzadeh

Abstract

Filtering web pages with inappropriate content is one of the major issues in intelligent network security. Any country that wants to control users' access to the web needs an intelligent filtering method with high accuracy and speed, and the problem has therefore attracted many researchers. Representing web pages in a machine-understandable form is one of the most important preprocessing steps, so a lower-dimensional description of web pages would be very effective, especially for deciding whether a page should be filtered out. In this paper, we propose an automatic method to detect forbidden keywords in web pages. We then define a new vector representation of web pages, named RWSF, which consists of the weighted sum and frequency of forbidden keywords in different parts of a web page. For this purpose, a ranking dictionary of keywords, including forbidden keywords, is used. To evaluate the proposed method, 2643 pages were used, consisting of 1311 normal pages and 1332 forbidden pages. Of these, 1851 pages were used to train the system and 792 pages to evaluate it. The system was assessed using several classifiers: k-nearest neighbor, support vector machines, decision trees, and artificial neural networks. The evaluation results indicate high efficiency and accuracy for the proposed method with all classifiers.
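The abstract does not give the exact RWSF formula, so the following is only a minimal sketch of the general idea: for each part of a page, compute the weighted sum and the relative frequency of dictionary keywords, then concatenate the results into one low-dimensional vector. The dictionary entries, their weights, the list of page parts, and the names `KEYWORD_WEIGHTS`, `PARTS`, `tokenize`, and `rwsf_vector` are all assumptions made for illustration, not the authors' implementation.

```python
import re
from typing import Dict, List

# Hypothetical weighted dictionary: keyword -> weight (illustrative values only).
KEYWORD_WEIGHTS: Dict[str, float] = {"casino": 0.9, "betting": 0.7, "poker": 0.5}

# Hypothetical page parts; the abstract does not list the parts the paper uses.
PARTS: List[str] = ["title", "meta", "body"]


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rwsf_vector(page: Dict[str, str],
                weights: Dict[str, float] = KEYWORD_WEIGHTS) -> List[float]:
    """Return [weighted_sum, frequency] for each page part, concatenated."""
    vector: List[float] = []
    for part in PARTS:
        tokens = tokenize(page.get(part, ""))
        hits = [t for t in tokens if t in weights]
        # Weighted sum of the dictionary keywords found in this part.
        weighted_sum = sum(weights[t] for t in hits)
        # Share of tokens in this part that are dictionary keywords.
        frequency = len(hits) / len(tokens) if tokens else 0.0
        vector.extend([weighted_sum, frequency])
    return vector


page = {
    "title": "Online casino betting",
    "meta": "",
    "body": "Play poker and casino games today",
}
print(rwsf_vector(page))
```

With three page parts the vector has six components regardless of page length, which is the kind of fixed, low-dimensional input that classifiers such as k-nearest neighbor or support vector machines can consume directly.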

Similar resources

Automatic Web Rating: Filtering Obscene Content on the Web

We present a method to detect automatically pornographic content on the Web. Our method combines techniques from language engineering and image analysis within a machine-learning framework. Experimental results show that it achieves nearly perfect performance on a set of hard cases.

Automatic Keywords Extraction – a Basis for Content Recommendation

This paper describes a use case for an application that recommends learning objects for reuse and is integrated in the authoring environment. The recommendations are based on the automatic detection of content being authored and the context in which this resource is authored or used. The focus of the paper is automatic keyword extraction, evaluated as a starting point for content analysis. The ...

An Intelligent Content Filter based framework for Mobile Web Services

Since mobile and internet penetration is highly increasing all over the world and these technological advancements of mobile devices are changing every time, most of people are using their mobile devices for their day to day transactions, business etc., and access services from the internet. Today, lot of search engines are available, these search engines will dump information abundantly. Peopl...

Named Entity Recognition for Web Content Filtering

Effective Web content filtering is a necessity in educational and workplace environments, but current approaches are far from perfect. We discuss a model for text-based intelligent Web content filtering, in which shallow linguistic analysis plays a key role. In order to demonstrate how this model can be realized, we have developed a lexical Named Entity Recognition system, and used it to improv...

QoS-based Web Service Recommendation using Popular-dependent Collaborative Filtering

Since, most of the organizations present their services electronically, the number of functionally-equivalent web services is increasing as well as the number of users that employ those web services. Consequently, plenty of information is generated by the users and the web services that lead to the users be in trouble in finding their appropriate web services. Therefore, it is required to provi...

Intelligent Web Services System for automatic framework

Recently Web services have become a key technology which is indispensable for e-business due to its ability to provide the desired information or service regardless of time and place by integrating current application systems within a single business or between multiple businesses with standardized technologies using the open network and Internet. However, the current Web Services Retrieval Sys...

Journal: Journal of Advances in Computer Research

Publisher: Sari Branch, Islamic Azad University

ISSN 2345-606X

Volume 6

Issue 1, 2015
