Analysis of the Use of the Delay Scheduling Algorithm on Job Scheduling Characteristics in Hadoop

Authors

  • Komaratih Dian Priharyani, Telkom University
  • Gandeva Bayu Satrya, Telkom University
  • Anton Herutomo, Telkom University

Abstract

Hadoop is an open-source, Java-based software framework for processing large amounts of distributed data on a cluster of interconnected computers. Hadoop is economical because it is free to use and can be deployed on hardware with modest specifications. The Hadoop architecture consists of two layers: MapReduce and the Hadoop Distributed File System (HDFS). MapReduce is a framework for distributed applications, while HDFS provides distributed data storage. Delay Scheduling is a job scheduler developed for multi-node Hadoop systems that handles jobs waiting in the scheduling queue. It briefly delays launching a job so that data locality can be improved first, which lowers the job fail rate. It also achieves nearly optimal data allocation, which affects Job Fail Rate, Job Throughput, and Response Time. In this study, the Delay Scheduling algorithm performed effectively: Job Fail Rate fell by 0.3%, Job Throughput rose by 8.853%, and Response Time was 142 minutes 45 seconds faster for a workload of 50 Wordcount jobs.

Key words: Hadoop, Hadoop multi-node, delay scheduling, FIFO.
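The delay idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and not Hadoop's scheduler code: the names `Task`, `Job`, `assign_task`, and `max_skips` are hypothetical, and the sketch assumes that each task knows which nodes hold its input data and that the job list is already ordered by the queue policy (for example, FIFO).

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    task_id: str
    preferred_nodes: set  # nodes assumed to hold this task's input data


@dataclass
class Job:
    job_id: str
    pending: list = field(default_factory=list)
    skip_count: int = 0  # scheduling opportunities this job has passed up


def assign_task(jobs, free_node, max_skips):
    """Choose a task for free_node, preferring data-local tasks.

    jobs is assumed to be sorted by the queue policy already.
    Returns (job, task, is_local), or None if nothing is launched.
    """
    for job in jobs:
        if not job.pending:
            continue
        # Look for a pending task whose input data is on the free node.
        local = next(
            (t for t in job.pending if free_node in t.preferred_nodes), None
        )
        if local is not None:
            job.pending.remove(local)
            job.skip_count = 0
            return job, local, True
        # No local task: launch non-locally only after enough skips.
        if job.skip_count >= max_skips:
            return job, job.pending.pop(0), False
        # Otherwise the job waits and the next job in the queue is tried.
        job.skip_count += 1
    return None
```

In this sketch, a job at the head of the queue that has no local data on the free node is skipped up to `max_skips` times, so that a later scheduling opportunity on a node holding its data can be used instead. Only after that limit is reached does the job accept a non-local task.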

Published

2015-04-01

Section

Undergraduate (S1) Informatics Study Program