Vol. 20, No. 1, 2020
Department of Computer Science, Federal University of Lafia, Nasarawa State, Nigeria
Abstract
Data-driven frameworks such as Hadoop have gained tremendous popularity in big data analytics. Although the Hadoop framework has been improved considerably by decoupling its resource management infrastructure, the centralized design of metadata management in HDFS still limits Hadoop's scalability and creates a performance bottleneck. A single master node, the NameNode, manages the entire namespace (all the inodes) of the file system, which leads to a single point of failure, namespace limitation, and load balancing issues in the Hadoop cluster. This paper proposes a rack-aware model in which each rack is provided with a Rack_Unit NameNode (RU_NN) that manages the file system namespace and the heartbeat communication of the DataNodes in its rack. This reduces the load on any single NameNode and lowers the communication overhead that arises when every DataNode in the cluster reports to a single NameNode.
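As a minimal illustration of the rack-aware idea, the following Python sketch routes each DataNode heartbeat to the RU_NN of its own rack instead of to one central NameNode. It is a toy model under stated assumptions, not the paper's implementation and not Hadoop's API: the class `RackUnitNameNode`, the functions `build_rack_namenodes` and `route_heartbeat`, and the DataNode and rack identifiers are all hypothetical names introduced here.

```python
class RackUnitNameNode:
    """Hypothetical per-rack NameNode (RU_NN); tracks only its own rack's DataNodes."""

    def __init__(self, rack_id):
        self.rack_id = rack_id
        self.last_heartbeat = {}  # DataNode id -> timestamp of latest heartbeat

    def receive_heartbeat(self, datanode_id, timestamp):
        # Record the most recent heartbeat seen from this DataNode.
        self.last_heartbeat[datanode_id] = timestamp


def build_rack_namenodes(datanode_to_rack):
    """Create one RU_NN per rack that appears in the (assumed) topology map."""
    return {rack: RackUnitNameNode(rack) for rack in set(datanode_to_rack.values())}


def route_heartbeat(ru_nns, datanode_to_rack, datanode_id, timestamp):
    """Deliver a heartbeat to the RU_NN of the sender's rack and return that rack id."""
    rack = datanode_to_rack[datanode_id]
    ru_nns[rack].receive_heartbeat(datanode_id, timestamp)
    return rack


# Assumed example topology: four DataNodes spread over two racks.
topology = {"dn1": "rack_a", "dn2": "rack_a", "dn3": "rack_b", "dn4": "rack_b"}
ru_nns = build_rack_namenodes(topology)
for tick, dn in enumerate(topology):
    route_heartbeat(ru_nns, topology, dn, timestamp=tick)
```

In this toy topology each RU_NN receives heartbeats from only two DataNodes, whereas a single central NameNode would receive all four; the per-NameNode heartbeat load therefore scales with rack size rather than with cluster size.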