Machine Learning for Web Content Classification (Web İçerik Sınıflandırması için Makine Öğrenmesi)









Abstract

Categorizing websites is an important problem with many practical applications. One such application is parental control, which gives children safe access to the internet. When websites are not classified according to specific rules, information becomes harder to find and users of all age groups are left exposed to the harmful side of the internet. Current safe-internet solutions are not comprehensive, or they cannot be customized. Furthermore, blocking orders issued by the courts do not cover all harmful sites, and these sites change their domains frequently. Dynamic classification of websites based on their text content is therefore very important. In this study, websites are classified using natural language processing and machine learning techniques. Website content in various languages is collected and preprocessed before the machine learning techniques are applied. The study used 17 classes; the highest classification success was 0.8756, achieved with the SVM method.
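
The collection and preprocessing step mentioned in the abstract could look roughly like the sketch below. It is an illustration under stated assumptions, not the study's actual pipeline: the names TextExtractor, extract_text, tokenize and bag_of_words, the tiny stop-word list, and the sample page are all hypothetical, and only the Python standard library is used.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Hypothetical, deliberately tiny stop-word list (assumption for illustration);
# a real system would need a fuller list for each language it handles.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "for", "is", "on"}

# Hypothetical sample page used only to demonstrate the helpers below.
SAMPLE_HTML = (
    "<html><head><title>Kids Games</title>"
    "<style>p{color:red}</style></head>"
    "<body><p>Fun games and puzzles for kids.</p>"
    "<script>var x = 1;</script></body></html>"
)


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self._skip_depth == 0:
            self.parts.append(data)


def extract_text(html):
    """Return the visible text of an HTML document as one string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def tokenize(text):
    """Lowercase, keep runs of letters (any script), drop stop words."""
    # [^\W\d_]+ matches Unicode letters only, so digits and punctuation
    # are discarded while non-English letters are preserved.
    words = re.findall(r"[^\W\d_]+", text.lower())
    return [w for w in words if w not in STOP_WORDS and len(w) > 1]


def bag_of_words(html):
    """Count token occurrences in the visible text of a page."""
    return Counter(tokenize(extract_text(html)))


if __name__ == "__main__":
    print(bag_of_words(SAMPLE_HTML).most_common())
```

Token counts of this kind are one common input to a text classifier; whether the study used raw counts, TF-IDF weights, or another representation before training its SVM is not stated in the abstract.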


Modules


Algorithms


Software and Hardware

• Hardware
  Processor: i3, i5
  RAM: 4 GB
  Hard disk: 16 GB
• Software
  Operating system: Windows 2000/XP/7/8/10
  Tools: Anaconda, Jupyter, Spyder, Flask
  Frontend: Python
  Backend: MySQL