Indonesian Language Stemmer Algorithm Improvement By Rearrange Stemming Process Steps Sequence

Hari Widayanto, Arief Huda

Abstract

Stemming is the process of finding the root word of an affixed form by removing all affixes attached to it. Stemmers are applied in various text mining applications to improve performance. In information retrieval, a stemmer can improve performance by providing morphological variants of the searched terms and by reducing the size of the index [9]. In word-based text compression, a stemmer can simplify the dictionary, since the various forms of a word can be represented by one word [6]. Besides reducing the size of the document index, a stemmer can increase text retrieval accuracy [10]. In text classification, a stemmer reduces the number of features [18]. The first Indonesian stemmer was developed by Nazief and Adriani; Jelita Asian later improved the algorithm, producing the confix stripping (CS) stemmer. The CS stemmer introduced many improvements, making it the most accurate stemmer algorithm. An experiment is performed to compare the accuracy of the Nazief-Adriani and CS stemmer algorithms in stemming words extracted from the online news source Republika.

Keywords: Stemming, Indonesian, Nazief-Adriani, CS stemmer
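
The general idea of dictionary-checked affix removal, and of measuring stemmer accuracy against known roots, can be illustrated with a minimal Python sketch. This is not the Nazief-Adriani or CS algorithm: the root list, the affix lists, the function names `toy_stem` and `accuracy`, and the gold pairs are all hypothetical and chosen only for illustration. Real stemmers use a full root-word lexicon and ordered rules with recoding.

```python
# Toy root-word dictionary (hypothetical; a real stemmer uses a full lexicon).
ROOTS = {"baca", "makan", "buku", "main", "ajar"}

# Small, incomplete affix lists chosen for illustration only.
SUFFIXES = ("lah", "kah", "nya", "ku", "mu", "kan", "an", "i")
PREFIXES = ("mem", "men", "me", "ber", "be", "di", "ke", "se", "pe", "ter")


def toy_stem(word: str) -> str:
    """Strip at most one suffix and one prefix, accepting a result only
    if it is in ROOTS; return the word unchanged when no root is found."""
    word = word.lower()
    if word in ROOTS:
        return word
    # Candidate forms: the word itself, plus the word minus one suffix.
    candidates = [word]
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix):
            candidates.append(word[: -len(suffix)])
    for cand in candidates:
        if cand in ROOTS:
            return cand
        for prefix in PREFIXES:
            if cand.startswith(prefix) and cand[len(prefix):] in ROOTS:
                return cand[len(prefix):]
    return word


def accuracy(stem_fn, gold_pairs) -> float:
    """Fraction of (word, expected_root) pairs that stem_fn gets right."""
    correct = sum(1 for word, root in gold_pairs if stem_fn(word) == root)
    return correct / len(gold_pairs)


# Hypothetical gold pairs; "belajar" is missed because the toy rules
# have no "bel-" prefix variant, which is the kind of case real rules handle.
GOLD = [("membaca", "baca"), ("makanan", "makan"),
        ("bukunya", "buku"), ("belajar", "ajar")]

print(toy_stem("dimainkan"))     # main
print(accuracy(toy_stem, GOLD))  # 0.75
```

Comparing two stemmers on the same word list then amounts to calling `accuracy` once per stemmer with an identical set of gold pairs.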
