Literature summary - [2015] Large-scale cluster management at Google with Borg

Original paper: Large-scale cluster management at Google with Borg

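I. Main Idea

Borg is a large-scale cluster management system that Google has been developing internally since the early 2000s. It is low-level infrastructure that handles the deployment and running of the business applications built on top of it.
Borg has had a profound influence on K8s, and in a sense Borg can be called the predecessor of K8s. The two are not the same, however: to this day Borg has not been replaced by K8s inside Google, and there appears to be no trend toward replacing it.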

II. Content

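III. Influence on K8s

  1. Application-level association of multiple components
    1. In Borg, the job is the only concept that can tie multiple components together
    2. In K8s, labels can tie multiple components together loosely
  2. Network address space
    1. In Borg, all tasks share the node's IP and ports; container ports map directly onto node ports, so the port space is cramped
    2. In K8s, each pod has its own network address space, and a Service (svc) on top can encapsulate it and control whether it is mapped to a node port
  3. User orientation
    1. Borg was originally meant to serve large teams first
    2. For K8s, the community determines who the users are
  4. The multi-container scheduling unit: pod
    1. Alloc in Borg: the app + logsaver + dataloader pattern, with each part developed and maintained separately
    2. Pod in K8s: init containers + helper containers/sidecars
  5. Observability aids for operations
    1. Debugging information
    2. Events
  6. The master is the kernel: Borgmaster in Borg roughly corresponds to kube-apiserver in K8s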
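
To make item 1 concrete, here is a minimal sketch of how label selection lets one object belong to several loose groupings at once, where a Borg task belongs to exactly one job. The `Obj` class, the `select` helper, and the label names are hypothetical illustrations, not the Kubernetes API.

```python
from dataclasses import dataclass, field


@dataclass
class Obj:
    """Hypothetical stand-in for any labeled object (pod, service, ...)."""
    name: str
    labels: dict = field(default_factory=dict)


def select(objects, selector):
    """Return the objects whose labels contain every key/value pair in selector."""
    return [o for o in objects
            if all(o.labels.get(k) == v for k, v in selector.items())]


objects = [
    Obj("web-0", {"app": "shop", "tier": "frontend"}),
    Obj("web-1", {"app": "shop", "tier": "frontend"}),
    Obj("db-0", {"app": "shop", "tier": "backend"}),
    Obj("batch-0", {"app": "report", "tier": "backend"}),
]

# The same object ("db-0") shows up in two different groupings.
shop = [o.name for o in select(objects, {"app": "shop"})]
backend = [o.name for o in select(objects, {"tier": "backend"})]
print(shop)     # ['web-0', 'web-1', 'db-0']
print(backend)  # ['db-0', 'batch-0']
```

Because the grouping is computed from the selector rather than stored as ownership, new groupings can be added later without touching the objects' existing membership.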
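
The difference in item 2 can be sketched as a toy model: when every task shares the node's IP, two tasks cannot both take the same port, whereas one IP per pod gives each pod a private port space. The `PortSpace` class below is a hypothetical illustration only; it does not reflect how Borg or K8s actually allocate ports.

```python
class PortSpace:
    """Tracks which ports are taken behind a single IP address (toy model)."""

    def __init__(self):
        self.used = set()

    def bind(self, port):
        """Claim a port; return False if it is already taken in this space."""
        if port in self.used:
            return False
        self.used.add(port)
        return True


# Borg-style: every task on the machine binds in the node's single port space.
host = PortSpace()
borg_results = [host.bind(8080), host.bind(8080)]  # the second task collides

# K8s-style: each pod has its own IP, hence its own port space.
pod_a, pod_b = PortSpace(), PortSpace()
k8s_results = [pod_a.bind(8080), pod_b.bind(8080)]  # no collision

print(borg_results)  # [True, False]
print(k8s_results)   # [True, True]
```

In the shared model the scheduler has to treat ports as a resource and hand out unique numbers per task; with one IP per pod, applications can keep using their conventional ports.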

IV. Similar Systems

| System | Reference |
| --- | --- |
| Apache Mesos | B. Hindman, A. Konwinski, M. Zaharia, A. Ghodsi, A. Joseph, R. Katz, S. Shenker, and I. Stoica. Mesos: a platform for fine-grained resource sharing in the data center. In Proc. USENIX Symp. on Networked Systems Design and Implementation (NSDI), 2011. |
| YARN | V. K. Vavilapalli, A. C. Murthy, C. Douglas, S. Agarwal, M. Konar, R. Evans, T. Graves, J. Lowe, H. Shah, S. Seth, B. Saha, C. Curino, O. O’Malley, S. Radia, B. Reed, and E. Baldeschwieler. Apache Hadoop YARN: Yet Another Resource Negotiator. In Proc. ACM Symp. on Cloud Computing (SoCC), Santa Clara, CA, USA, 2013. |
| Tupperware | A. Narayanan. Tupperware: containerized deployment at Facebook. http://www.slideshare.net/dotCloud/tupperware-containerized-deployment-at-facebook, June 2014. |
| Apache Aurora (retired) | Apache Aurora. http://aurora.incubator.apache.org/, 2014. Later site: https://aurora.apache.org/ |
| Autopilot | M. Isard. Autopilot: automatic data center management. ACM SIGOPS Operating Systems Review, 41(2), 2007. |
| Quincy (for Dryad) | M. Isard, V. Prabhakaran, J. Currey, U. Wieder, K. Talwar, and A. Goldberg. Quincy: fair scheduling for distributed computing clusters. In Proc. ACM Symp. on Operating Systems Principles (SOSP), 2009. |
| Cosmos | P. Helland. Cosmos: big data and big challenges. http://research.microsoft.com/en-us/events/fs2011/helland_cosmos_big_data_and_big_challenges.pdf, 2011. |
| Apollo | E. Boutin, J. Ekanayake, W. Lin, B. Shi, J. Zhou, Z. Qian, M. Wu, and L. Zhou. Apollo: scalable and coordinated scheduling for cloud-scale computing. In Proc. USENIX Symp. on Operating Systems Design and Implementation (OSDI), Oct. 2014. |
| Fuxi | Z. Zhang, C. Li, Y. Tao, R. Yang, H. Tang, and J. Xu. Fuxi: a fault-tolerant resource management and job scheduling system at internet scale. In Proc. Int’l Conf. on Very Large Data Bases (VLDB), pages 1393–1404. VLDB Endowment Inc., Sept. 2014. |
| Omega | M. Schwarzkopf, A. Konwinski, M. Abd-El-Malek, and J. Wilkes. Omega: flexible, scalable schedulers for large compute clusters. In Proc. European Conf. on Computer Systems (EuroSys), Prague, Czech Republic, 2013. |
| Kubernetes | https://kubernetes.io/ |

V. Further Reading

  1. Key factors for large-scale systems: J. Hamilton. On designing and deploying internet-scale services. In Proc. Large Installation System Administration Conf. (LISA), pages 231–242, Dallas, TX, USA, Nov. 2007.
  2. Modeling for system performance evaluation: D. G. Feitelson. Workload Modeling for Computer Systems Performance Evaluation. Cambridge University Press, 2014.
  3. Resource utilization metrics and experiments: A. Verma, M. Korupolu, and J. Wilkes. Evaluating job packing in warehouse-scale computing. In IEEE Cluster, pages 48–56, Madrid, Spain, Sept. 2014.
  4. Worst-fit resource scheduling: Y. Amir, B. Awerbuch, A. Barak, R. S. Borgstrom, and A. Keren. An opportunity cost approach for job assignment in a scalable computing cluster. IEEE Trans. Parallel Distrib. Syst., 11(7):760–768, July 2000.
  5. Real workload traces (dataset): J. Wilkes. More Google cluster data. http://googleresearch.blogspot.com/2011/11/more-google-cluster-data.html, Nov. 2011.
  6. Uses of the dataset above:
    1. Dataset analysis: C. Reiss, A. Tumanov, G. Ganger, R. Katz, and M. Kozuch. Heterogeneity and dynamicity of clouds at scale: Google trace analysis. In Proc. ACM Symp. on Cloud Computing (SoCC), San Jose, CA, USA, Oct. 2012.
    2. Usage: O. A. Abdul-Rahman and K. Aida. Towards understanding the usage behavior of Google cloud users: the mice and elephants phenomenon. In Proc. IEEE Int’l Conf. on Cloud Computing Technology and Science (CloudCom), pages 272–277, Singapore, Dec. 2014.
    3. Usage: S. Di, D. Kondo, and W. Cirne. Characterization and comparison of cloud versus Grid workloads. In International Conference on Cluster Computing (IEEE CLUSTER), pages 230–238, Beijing, China, Sept. 2012.
    4. Usage: S. Di, D. Kondo, and C. Franck. Characterizing cloud applications on a Google data center. In Proc. Int’l Conf. on Parallel Processing (ICPP), Lyon, France, Oct. 2013.
    5. Usage: Z. Liu and S. Cho. Characterizing machines and workloads on a Google cluster. In Proc. Int’l Workshop on Scheduling and Resource Management for Parallel and Distributed Systems (SRMPDS), Pittsburgh, PA, USA, Sept. 2012.
  7. Borg follow-ups
    1. 2016_Burns_Borg, Omega, and Kubernetes: Lessons learned from three container-management systems over a decade
    2. 2020_Tirmazi_Borg: the next generation