Large-scale data processing is growing rapidly as enterprises move towards big data projects. Large enterprises also
maintain distributed data centers across the globe for disaster recovery and business continuity. Following the success of
their big data projects, the need arises to run future big data projects across distributed data centers. In that case, existing resource management
solutions such as Apache YARN or Mesos fall short, as they still rely on a centralized resource manager. For extreme-scale or
distributed data centers, we therefore need a new generation of distributed resource management solutions.
Published In: IJCSN Journal Volume 6, Issue 6
Date of Publication: December 2017
Pages: 777-786
Figures: 06
Tables: --
M Sai Pradeep: currently pursuing a Master's degree at the Indian
Institute of Technology, Delhi. He completed his B.E. in Computer
Science at M S Ramaiah Institute of Technology, Bangalore, in 2012. He
worked at Samsung R&D, Noida, for one year and currently works
at Indian Oil Corporation Ltd. His research interests include Big
Data, Cloud Computing and Data Analytics.
Harish Mamilla: currently pursuing a Master's degree at the Indian
Institute of Technology, Delhi. He has worked at Honeywell Pvt Ltd.
S C Gupta: Department of Computer Science & Engg, IIT-Delhi,
Delhi, 110016, India.
We have discussed a distributed resource management
layer that allows distributed as well as extreme-scale
data centers to share resources in an efficient and
controlled manner. Existing resource managers such as
YARN and Mesos do not address the needs of distributed
and extreme-scale data centers, because they rely on a
centralized host to manage resources. Our solution
distributes that module so that a centralized RM is no
longer a bottleneck. It scales easily by adding a new
sub-cluster. A policy maker host manages the whole
cluster, but sub-clusters do not depend on the policy
maker host being always on. Data center requirements
such as load balancing and draining sub-clusters that are
about to undergo maintenance can be handled by
enforcing policies via the policy maker. If the policy maker
is not available, cluster operations continue according to
the last published policies. Together, these elements make
our solution feasible for all distributed and extreme-scale
data centers.
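The fallback to the last published policies can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Policy`, `PolicyMaker`, `Router` and the sub-cluster identifiers are hypothetical, and the selection rule (lowest load-to-weight ratio, with a weight of 0 meaning "drain this sub-cluster") is an assumption chosen only to make the example concrete.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy: a routing weight per sub-cluster (0 = draining)."""
    version: int
    weights: Dict[str, float]


class PolicyMaker:
    """Stand-in for the policy maker host; it may go offline at any time."""

    def __init__(self) -> None:
        self.online = True
        self._current: Optional[Policy] = None

    def publish(self, weights: Dict[str, float]) -> Policy:
        version = 1 if self._current is None else self._current.version + 1
        self._current = Policy(version, dict(weights))
        return self._current

    def fetch(self) -> Policy:
        if not self.online or self._current is None:
            raise ConnectionError("policy maker unavailable")
        return self._current


class Router:
    """Routes work to sub-clusters using the last policy it managed to fetch."""

    def __init__(self, policy_maker: PolicyMaker) -> None:
        self._policy_maker = policy_maker
        self._cached: Optional[Policy] = None

    def refresh(self) -> None:
        try:
            self._cached = self._policy_maker.fetch()
        except ConnectionError:
            # Policy maker is down: keep operating on the cached policy.
            pass

    def pick_sub_cluster(self, load: Dict[str, float]) -> str:
        self.refresh()
        if self._cached is None:
            raise RuntimeError("no policy has ever been published")
        # Sub-clusters with weight 0 are being drained and receive no new work.
        candidates = {n: w for n, w in self._cached.weights.items() if w > 0}
        if not candidates:
            raise RuntimeError("all sub-clusters are draining")
        # Assumed rule: prefer the lowest load relative to the policy weight.
        return min(sorted(candidates), key=lambda n: load.get(n, 0.0) / candidates[n])


policy_maker = PolicyMaker()
router = Router(policy_maker)

# "sc-c" is scheduled for maintenance, so the policy drains it.
policy_maker.publish({"sc-a": 1.0, "sc-b": 1.0, "sc-c": 0.0})
first = router.pick_sub_cluster({"sc-a": 0.9, "sc-b": 0.2, "sc-c": 0.0})

# The policy maker goes offline; routing continues with the cached policy.
policy_maker.online = False
second = router.pick_sub_cluster({"sc-a": 0.1, "sc-b": 0.8, "sc-c": 0.0})
```

In this sketch the router never blocks on the policy maker: a failed fetch leaves the cached policy in place, so the drained sub-cluster stays excluded and the remaining sub-clusters keep receiving work while the policy maker is down.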