Determination of the Best Vehicle Pathway with Classification of Data Mining Twitter using K-Nearest Neighbor








Abstract

The system developed by the author determines the best path when one of the roads experiences high vehicle density or congestion. The roads covered by the system are very limited: only those around Bundaran HI, Jakarta. In this research, congestion data was taken from Twitter, because many tweets report the traffic situation in Jakarta. Before being classified, the data goes through a pre-processing stage consisting of Case Folding, Cleaning, Tokenization, and Data Transformation. The system uses the K-Nearest Neighbor (KNN) algorithm to classify the Twitter data. The study uses 600 data items, and the data was tested three times with different splits. In the first test, the data is divided into 50% training data and 50% testing data; in the second test, into 67% training data and 33% testing data; and in the third test, into 80% training data and 20% testing data. The highest performance was obtained in the third test, with an accuracy of 84.16%, a precision of 96.00%, and a recall of 84.00%, using K = 27 neighbors.
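The four pre-processing steps named in the abstract can be illustrated with a minimal sketch. The abstract does not specify what each step does in detail, so the rules below are assumptions: Cleaning is taken to mean removing URLs, mentions, hashtags, digits, and punctuation; Tokenization is whitespace splitting; and Data Transformation is taken to mean term-count vectors over a fixed vocabulary. The function names, the sample tweet, and the vocabulary are hypothetical.

```python
import re
from collections import Counter


def case_fold(text):
    """Case Folding: lowercase the whole tweet."""
    return text.lower()


def clean(text):
    """Cleaning (assumed rules): drop URLs, mentions, hashtags, non-letters."""
    text = re.sub(r"https?://\S+", " ", text)  # URLs
    text = re.sub(r"[@#]\w+", " ", text)       # @mentions and #hashtags
    text = re.sub(r"[^a-z\s]", " ", text)      # digits and punctuation
    return re.sub(r"\s+", " ", text).strip()   # collapse whitespace


def tokenize(text):
    """Tokenization: split the cleaned text on whitespace."""
    return text.split()


def transform(tokens, vocabulary):
    """Data Transformation (assumed): term counts over a fixed vocabulary."""
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]


def preprocess(tweet):
    """Run case folding, cleaning, and tokenization in order."""
    return tokenize(clean(case_fold(tweet)))


# Hypothetical tweet and vocabulary, for illustration only.
sample = "Macet parah di Bundaran HI @someuser http://t.co/abc #macet"
tokens = preprocess(sample)
vector = transform(tokens, ["macet", "lancar", "parah"])
```

Here `tokens` holds the lowercase words of the tweet with the mention, link, and hashtag stripped, and `vector` is the numeric representation that a classifier such as KNN could consume.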


Modules


Algorithms

Data Mining algorithms
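
The abstract names K-Nearest Neighbor (KNN) as the classifier. The following is a minimal sketch of KNN with majority voting, not the author's implementation. Euclidean distance, the class labels "congested" and "clear", and the toy feature vectors are all assumptions made for illustration; the source states none of them.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_X, train_y, query, k):
    """Return the majority label among the k training points nearest to query."""
    order = sorted(range(len(train_X)),
                   key=lambda i: euclidean(train_X[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k):
    """Fraction of test points whose predicted label matches the true label."""
    hits = sum(knn_predict(train_X, train_y, x, k) == y
               for x, y in zip(test_X, test_y))
    return hits / len(test_y)


# Toy term-count vectors with hypothetical labels, for illustration only.
train_X = [[2, 0], [1, 0], [2, 1], [0, 2], [0, 1], [1, 2]]
train_y = ["congested", "congested", "congested", "clear", "clear", "clear"]
test_X = [[2, 0], [0, 2]]
test_y = ["congested", "clear"]

prediction = knn_predict(train_X, train_y, [2, 0], k=3)
score = accuracy(train_X, train_y, test_X, test_y, k=3)
```

In the study, K was varied and the best reported result used K = 27 on the 80%/20% split; the toy example uses k=3 only because it has six training points.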


Software And Hardware

• Hardware:
  Processor: i3, i5
  RAM: 4 GB
  Hard disk: 16 GB
• Software:
  Operating system: Windows 2000/XP/7/8/10
  Tools: Anaconda, Jupyter, Spyder, Flask
  Frontend: Python
  Backend: MySQL