Spark-based Spatial Association Mining








Abstract

Spatial association mining, one of the important techniques in spatial data mining, discovers interesting relationship patterns among spatial features in a large spatial database based on their spatial proximity. The explosive growth of georeferenced data has emphasized the need for computationally efficient methods for analyzing big spatial data. Parallel and distributed computing is an effective and widely used strategy for speeding up algorithms on large-scale datasets. This work presents parallel spatial association mining on the Spark RDD framework, an in-memory parallel computing model specially designed to support iterative algorithms. The initial experimental results show that the Spark-based algorithm performs significantly better than the MapReduce-based method in spatial association pattern mining.
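To make the idea of proximity-based association patterns concrete, the following is a minimal single-machine sketch in plain Python, not the Spark implementation described in this work. It assumes points are given as (id, feature, x, y) tuples, that two points are neighbors when their Euclidean distance is at most max_dist, and that a pattern is scored with the participation index, a common co-location measure that this abstract does not itself name. The function names neighbor_pairs and participation_index are hypothetical. The grid-cell keying plays the role that a key-based grouping (for example, flatMap followed by groupByKey on an RDD) could play in a distributed version.

```python
from collections import defaultdict
from math import hypot


def neighbor_pairs(points, max_dist):
    """Return pairs (a, b) of points with different features within max_dist.

    points: list of (id, feature, x, y) tuples (assumed input layout).
    """
    # "Map" step: key every point by the grid cell it falls in.
    # Cells have side max_dist, so neighbors lie in the same or an adjacent cell.
    cells = defaultdict(list)
    for p in points:
        cells[(int(p[2] // max_dist), int(p[3] // max_dist))].append(p)

    pairs = set()
    for (cx, cy), members in cells.items():
        # Gather candidates from this cell and its 8 surrounding cells.
        candidates = []
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                candidates.extend(cells.get((cx + dx, cy + dy), []))
        for a in members:
            for b in candidates:
                # a[1] < b[1] keeps only cross-feature pairs in a canonical order.
                if a[1] < b[1] and hypot(a[2] - b[2], a[3] - b[3]) <= max_dist:
                    pairs.add((a, b))
    return pairs


def participation_index(points, pairs):
    """Score each feature pair by the minimum share of its instances involved."""
    # Count how many instances each feature has overall.
    totals = defaultdict(int)
    for p in points:
        totals[p[1]] += 1

    # "Reduce" step: collect the distinct instances taking part per feature pair.
    involved = defaultdict(lambda: (set(), set()))
    for a, b in pairs:
        left, right = involved[(a[1], b[1])]
        left.add(a[0])
        right.add(b[0])

    return {
        key: min(len(left) / totals[key[0]], len(right) / totals[key[1]])
        for key, (left, right) in involved.items()
    }


# Illustrative data only: three A points, two B points, one isolated C point.
points = [
    (1, "A", 0.0, 0.0), (2, "A", 10.0, 10.0), (6, "A", 20.0, 20.0),
    (3, "B", 0.5, 0.0), (4, "B", 10.0, 10.5),
    (5, "C", 50.0, 50.0),
]
result = participation_index(points, neighbor_pairs(points, max_dist=1.0))
print(result)
```

A pattern such as (A, B) would then be reported as an association when its score meets a user-chosen threshold. Larger patterns are usually grown level by level from these pairs, which is the iterative structure that in-memory processing is meant to speed up.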


Modules


Algorithms

Data Mining algorithms


Software And Hardware

• Hardware:
  Processor: i3, i5
  RAM: 4 GB
  Hard disk: 16 GB
• Software:
  Operating system: Windows 2000/XP/7/8/10
  Tools: Anaconda, Jupyter, Spyder, Flask
  Frontend: Python
  Backend: MySQL