Identification of Gereflekter Text in Children's Books Using the K-Nearest Neighbor Algorithm
Abstract
Children's books are a source of knowledge for readers, especially children. When reading a book, a child tries to make sense of every word and sentence in it. A problem arises when the book contains inappropriate content, namely words or sentences with impolite or sexual meanings, as well as coarse language. For children at the elementary school level, such content is gereflekter (taboo). Motivated by this problem, this final-project research examines children's stories taken from fiction books and textbooks. A system was built to detect gereflekter content in the story texts, which serve as the dataset. The model uses the k-Nearest Neighbor text classification algorithm with a distance-measure approach; the distance measures used are Euclidean distance and Manhattan distance. The system is evaluated using precision, recall, and F1 score. Based on the evaluation results, the test scenarios using Euclidean distance and Manhattan distance obtained a precision of 0.915, a recall of 0.845, and an F1 score of 0.895.
Keywords: children's books, distance measure, gereflekter, k-Nearest Neighbor
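The approach summarized in the abstract (k-Nearest Neighbor classification with Euclidean or Manhattan distance, scored with precision, recall, and F1) can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words features, the function names, the labels "taboo" and "clean", the value k = 3, and the toy sentences are all assumptions made only for illustration.

```python
import math
from collections import Counter


def vectorize(text, vocab):
    # Assumed feature extraction: raw word counts over a fixed vocabulary.
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocab]


def euclidean(a, b):
    # Square root of the summed squared coordinate differences.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    # Sum of absolute coordinate differences.
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_vecs, train_labels, query, k, distance):
    # Rank training items by distance to the query, then take a majority
    # vote among the k closest ones.
    ranked = sorted(zip(train_vecs, train_labels),
                    key=lambda pair: distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def precision_recall_f1(y_true, y_pred, positive):
    # Scores for one class treated as the positive class.
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the study's dataset.
    train = [
        ("the rabbit shares food with friends", "clean"),
        ("the children read a book together", "clean"),
        ("you stupid ugly rabbit", "taboo"),
        ("shut up you stupid fool", "taboo"),
    ]
    vocab = sorted({w for text, _ in train for w in text.lower().split()})
    train_vecs = [vectorize(text, vocab) for text, _ in train]
    train_labels = [label for _, label in train]
    query = vectorize("you stupid rabbit", vocab)
    for name, dist in (("euclidean", euclidean), ("manhattan", manhattan)):
        print(name, knn_predict(train_vecs, train_labels, query, 3, dist))
```

Passing the distance function as a parameter lets one classifier be run under both test scenarios, with only the metric changing between them.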
Published
2020-04-01
Issue
Section
Program Studi S1 Informatika