This thesis gives a working example of how to design and implement plagiarism detection software in Java using various libraries. The software uses websites as data sources to determine whether a text or a file is a plagiarized document. Plagiarism is any identical or lightly altered use of one's own or someone else's work. Text plagiarism detection systems are widely available. Students achieve the best learning results by writing and doing exercises, which calls for a large number of written exercises. However, limited resources and the distribution of assessment work lead to problems when students' answers need to be checked for plagiarism. Plagiarism, or copy-pasting, is difficult to notice in a large volume of documents. The demonstrated project focuses on computer-assisted plagiarism detection in medium to large volumes of text-based submissions. Moreover, the project supports automated search for web sources. The first chapter of the report describes how the software works and which tools were used, with a focus on the different Java libraries. It also covers the technical requirements.
n-gram, k-gram