原文链接:Large-scale cluster management at Google with Borg
一、主旨
Borg是Google内部从21世纪初就开始研发的大规模集群管理系统,属于底层基础架构,服务上层业务应用的部署和运行。
Borg对K8s有深远意义,从某种意义上可以说Borg是K8s的前身,但两者并不一样,且时至今日在Google内部Borg也并没有被K8s替代,应当也没有被替代的趋势。
二、内容
三、对K8s的影响
- 应用级别的多构件关联
- Borg中只有job概念可以串联多构件
- K8s中可通过label来松散串联多构件
- 网络地址空间
- Borg中所有任务共用节点IP和端口,容器端口直接映射节点端口,空间狭小
- K8s中pod自带网络地址空间,上层还有svc可以封装和控制是否映射到节点端口
- 用户倾向
- Borg最初目的是优先服务大团队
- K8s社区决定玩家结构
- 多容器的调度单元——pod
- Alloc in Borg: app + logsaver + dataloader 模式,分段开发维护
- Pod in K8s: init containers + helper containers/sidecars
- 运维的可观测辅助
- Debugging information
- Events
- Master即内核——Borgmaster in Borg = kube-apiserver in K8s
四、同类系统
SysName | Ref |
---|---|
Apache Mesos | B. Hindman, A. Konwinski, M. Zaharia, A. Ghodsi,A. Joseph, R. Katz, S. Shenker, and I. Stoica.Mesos: aplatform for fine-grained resource sharing in the data center.InProc. USENIX Symp. on Networked Systems Design andImplementation (NSDI), 2011 |
YARN | V. K. Vavilapalli, A. C. Murthy, C. Douglas, S. Agarwal,M. Konar, R. Evans, T. Graves, J. Lowe, H. Shah, S. Seth,B. Saha, C. Curino, O. O’Malley, S. Radia, B. Reed, andE. Baldeschwieler.Apache Hadoop YARN: Yet AnotherResource Negotiator.InProc. ACM Symp. on CloudComputing (SoCC), Santa Clara, CA, USA, 2013. |
Tupperware | A. Narayanan.Tupperware: containerized deployment atFacebook.http://www.slideshare.net/dotCloud/tupperware-containerized-deployment-at-facebook,June 2014. |
Apache Aurora (retired) | Apache Aurora.http://aurora.incubator.apache.org/, 2014. |
Autopilot | https://aurora.apache.org/ |
Quincy (on Borg) | M. Isard, V. Prabhakaran, J. Currey, U. Wieder, K. Talwar,and A. Goldberg.Quincy: fair scheduling for distributedcomputing clusters.InProc. ACM Symp. on OperatingSystems Principles (SOSP), 2009. |
Cosmos | P. Helland.Cosmos: big data and big challenges.http://research.microsoft.com/en-us/events/fs2011/helland\_cosmos\_big\_data\_and\_big\_challenges.pdf, 2011. |
Apollo | E. Boutin, J. Ekanayake, W. Lin, B. Shi, J. Zhou, Z. Qian,M. Wu, and L. Zhou.Apollo: scalable and coordinatedscheduling for cloud-scale computing.InProc. USENIXSymp. on Operating Systems Design and Implementation(OSDI), Oct. 2014. |
Fuxi | Z. Zhang, C. Li, Y. Tao, R. Yang, H. Tang, and J. Xu.Fuxi: afault-tolerant resource management and job schedulingsystem at internet scale.InProc. Int’l Conf. on Very LargeData Bases (VLDB), pages 1393–1404. VLDB EndowmentInc., Sept. 2014. |
Omega | M. Schwarzkopf, A. Konwinski, M. Abd-El-Malek, andJ. Wilkes.Omega: flexible, scalable schedulers for largecompute clusters.InProc. European Conf. on ComputerSystems (EuroSys), Prague, Czech Republic, 2013. |
Kubernetes | https://kubernetes.io/ |
五、文献引申
- 大规模系统的关键因素:J. Hamilton.On designing and deploying internet-scaleservices.InProc. Large Installation System AdministrationConf. (LISA), pages 231–242, Dallas, TX, USA, Nov. 2007.
- 系统性能评测的建模:D. G. Feitelson.Workload Modeling for Computer SystemsPerformance Evaluation.Cambridge University Press, 2014.
- 资源利用率相关指标和实验:A. Verma, M. Korupolu, and J. Wilkes.Evaluating jobpacking in warehouse-scale computing.InIEEE Cluster,pages 48–56, Madrid, Spain, Sept. 2014.
- 资源调度worst-fit:Y. Amir, B. Awerbuch, A. Barak, R. S. Borgstrom, andA. Keren.An opportunity cost approach for job assignmentin a scalable computing cluster.IEEE Trans. Parallel Distrib.Syst., 11(7):760–768, July 2000.
- 真实工作负载记录(数据集):J. Wilkes.More Google cluster data.http://googleresearch.blogspot.com/2011/11/more-google-cluster-data.html, Nov. 2011.
- 上述数据集的使用:
- 数据集分析:C. Reiss, A. Tumanov, G. Ganger, R. Katz, and M. Kozuch.Heterogeneity and dynamicity of clouds at scale: Googletrace analysis.InProc. ACM Symp. on Cloud Computing(SoCC), San Jose, CA, USA, Oct. 2012.
- 使用:O. A. Abdul-Rahman and K. Aida.Towards understandingthe usage behavior of Google cloud users: the mice andelephants phenomenon.InProc. IEEE Int’l Conf. on CloudComputing Technology and Science (CloudCom), pages272–277, Singapore, Dec. 2014.
- 使用:S. Di, D. Kondo, and W. Cirne.Characterization andcomparison of cloud versus Grid workloads.InInternationalConference on Cluster Computing (IEEE CLUSTER), pages230–238, Beijing, China, Sept. 2012.
- 使用:S. Di, D. Kondo, and C. Franck.Characterizing cloudapplications on a Google data center.InProc. Int’l Conf. onParallel Processing (ICPP), Lyon, France, Oct. 2013.
- 使用:Z. Liu and S. Cho.Characterizing machines and workloadson a Google cluster.InProc. Int’l Workshop on Schedulingand Resource Management for Parallel and DistributedSystems (SRMPDS), Pittsburgh, PA, USA, Sept. 2012.
- Borg后续