IMPLEMENTATION OF OPTICAL CHARACTER RECOGNITION SYSTEM USING THE TESSERACT ALGORITHM TO DETECT INDONESIAN TRAIN LABELS
Abstract
Rail transportation plays a vital role in people's lives because it connects regions and drives economic growth. Labels on railroad cars identify the purpose and type of each carriage, and rail operations depend on processing this data efficiently. Collecting label data manually, however, is slow and prone to errors. This study applies image processing, specifically Optical Character Recognition (OCR), which converts text in images into machine-readable characters. A Region of Interest (ROI) is used to select the label on a train car so that processing focuses on the relevant part of the image and discards the rest; the selected region is then processed with OpenCV. Tesseract OCR reads the processed image, and the recognized label text is displayed on a website. In testing, the system detected text from railroad car labels in real time at distances of 10 cm, 20 cm, 30 cm, 40 cm, 50 cm, and 60 cm, with a shooting angle of 90° and lighting levels of 45 lux, 90 lux, 120 lux, and 210 lux. Under these test conditions, the OCR character reading achieved 100% accuracy.
Keywords: Optical Character Recognition, image processing, Region of Interest, train labels.
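The pipeline summarized in the abstract (ROI selection, OpenCV processing, Tesseract reading, evaluation over distance and lighting conditions) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The helper names `clamp_roi`, `read_label`, and `char_accuracy` are hypothetical, the use of the `pytesseract` wrapper is an assumption, Otsu thresholding is an assumed preprocessing step, and the similarity-based accuracy score is only one possible way to define character accuracy. Only the distance and lighting values come from the abstract.

```python
from difflib import SequenceMatcher
from itertools import product

# Test parameters reported in the abstract.
DISTANCES_CM = [10, 20, 30, 40, 50, 60]
LUX_LEVELS = [45, 90, 120, 210]
# Every distance/lighting combination (assumes a full grid was tested).
CONDITIONS = list(product(DISTANCES_CM, LUX_LEVELS))


def clamp_roi(x, y, w, h, img_w, img_h):
    """Clip an (x, y, w, h) box to the image bounds; return (x0, y0, x1, y1)."""
    x0, y0 = max(0, x), max(0, y)
    x1, y1 = min(img_w, x + w), min(img_h, y + h)
    if x1 <= x0 or y1 <= y0:
        raise ValueError("ROI lies outside the image")
    return x0, y0, x1, y1


def read_label(frame, roi, lang="eng"):
    """Crop the ROI from a BGR frame, binarize it, and run Tesseract on it."""
    # Imported lazily so the other helpers work without OpenCV or Tesseract.
    import cv2
    import pytesseract

    img_h, img_w = frame.shape[:2]
    x0, y0, x1, y1 = clamp_roi(*roi, img_w, img_h)
    crop = frame[y0:y1, x0:x1]
    gray = cv2.cvtColor(crop, cv2.COLOR_BGR2GRAY)
    # Otsu thresholding is an assumed preprocessing step, not from the paper.
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    # --psm 7 asks Tesseract to treat the crop as a single line of text.
    text = pytesseract.image_to_string(binary, lang=lang, config="--psm 7")
    return text.strip()


def char_accuracy(expected, recognized):
    """Percentage similarity between the expected and the recognized text."""
    return 100.0 * SequenceMatcher(None, expected, recognized).ratio()
```

In use, a frame captured from the camera and a hand-picked box, for example `read_label(frame, (x, y, w, h))`, would yield the label text, which `char_accuracy` could then compare against the known label for each entry in `CONDITIONS`. Keeping the geometry and scoring helpers free of OpenCV makes them easy to check separately from the camera and OCR steps.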