At the Cloud Native + Open Source Virtual Summit China 2020, Volcano, a container batch computing project led by the Huawei Cloud Native team, officially released version 1.0, a sign that the project is maturing and stabilizing.

Volcano Project Introduction

Volcano is a cloud native batch computing engine built on Kubernetes. Drawing on Huawei Cloud's extensive experience in AI and big data, it fills the gaps in Kubernetes batch job scheduling for AI, big data, and high-performance computing, and supports multiple kinds of computing hardware, including Kunpeng, Ascend, and x86. It also supports mainstream computing frameworks such as TensorFlow, Spark, and Huawei MindSpore, so that data scientists and algorithm engineers can benefit from the efficiency of cloud native technology.

Volcano architecture Diagram

As Kubernetes establishes itself as the next-generation infrastructure for AI, big data, and high-performance batch computing, more and more enterprises are placing higher demands on it for deep learning, scientific computing, high-performance rendering, and similar workloads.

However, as a general-purpose container platform, Kubernetes still falls short of these business demands, mainly in the following ways:

  1. The native K8s scheduler cannot meet the requirements of batch computing
  2. K8s job management cannot meet the complex demands of AI training
  3. Data management lacks compute-side data caching, data locality awareness, and similar functions
  4. Resource management does not support time-sharing, which leads to low utilization
  5. Support for heterogeneous hardware is weak

Volcano was created to address these pain points, focusing on improvements in scheduling, job management, data management, and resource management:

  1. Enhanced task scheduling capabilities, such as fair-share and gang scheduling
  2. Improved job management capabilities, such as multiple Pod templates per job and more flexible error handling
  3. Data caching on the compute side to improve the efficiency of data transfer and reads
  4. A multi-dimensional scoring mechanism for more efficient management and allocation of resources
  5. Support for multiple kinds of computing hardware: x86, Kunpeng, and Ascend
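
The gang scheduling mentioned in the list above can be illustrated with a minimal Python sketch. It is not Volcano's implementation: the function name gang_schedule, the CPU-only resource model, and the first-fit placement are all assumptions made for illustration. The only point it shows is the all-or-nothing rule, where a job starts only if a minimum number of its tasks can be placed at the same time.

```python
from typing import Dict, List, Optional


def gang_schedule(
    task_cpu: List[int],
    node_free_cpu: Dict[str, int],
    min_available: int,
) -> Optional[Dict[int, str]]:
    """Return a task-index -> node placement, or None if the gang cannot start.

    Hypothetical helper: first-fit placement on CPU only. Nothing is
    committed unless at least `min_available` tasks fit.
    """
    free = dict(node_free_cpu)  # work on a copy so a failed attempt changes nothing
    placement: Dict[int, str] = {}
    for idx, cpu in enumerate(task_cpu):
        for node, avail in free.items():
            if avail >= cpu:
                free[node] = avail - cpu  # reserve capacity for this task
                placement[idx] = node
                break
    if len(placement) < min_available:
        return None  # all-or-nothing: do not start a partial job
    return placement


if __name__ == "__main__":
    # Three 2-CPU tasks that must all start together.
    print(gang_schedule([2, 2, 2], {"node-a": 4, "node-b": 1}, min_available=3))
    print(gang_schedule([2, 2, 2], {"node-a": 4, "node-b": 2}, min_available=3))
```

In the first call only two of the three tasks fit, so no task is placed. A scheduler without this rule would start two Pods that then sit idle waiting for the third, which is the kind of waste gang scheduling is meant to avoid in distributed training jobs.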

New features in Volcano V1.0

The core concepts and key features of Volcano V1.0 include the following:

  1. Core concepts such as Queue, PodGroup and Volcano Job have been implemented
  2. Supports the Binpack, Conformance, DRF, Gang, Preempt, Reclaim, Priority, and Proportion scheduling policies
  3. Supports multiple interaction modes, such as REST APIs and a CLI
  4. Integrates seamlessly with mainstream computing frameworks such as Spark, Argo, MPI, Flink, MXNet, PaddlePaddle, TensorFlow, and MindSpore
  5. Supports Job life-cycle management and dynamic scaling up and down
  6. Supports GPU heterogeneity and sharing
  7. Complete golangci-lint checks and E2E tests to improve code quality and stability
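
Of the scheduling policies listed above, DRF (Dominant Resource Fairness) lends itself to a short worked example. The Python sketch below follows the general DRF idea rather than Volcano's plugin code: the function names dominant_share and next_job and the sample numbers are assumptions for illustration. A job's dominant share is the largest fraction of any single cluster resource it currently holds, and the job with the smallest dominant share is served next.

```python
from typing import Dict

Resources = Dict[str, float]


def dominant_share(allocated: Resources, capacity: Resources) -> float:
    """Largest fraction of any one cluster resource held by a job."""
    return max(allocated.get(name, 0.0) / total for name, total in capacity.items())


def next_job(allocations: Dict[str, Resources], capacity: Resources) -> str:
    """Pick the job with the smallest dominant share (hypothetical helper)."""
    return min(allocations, key=lambda job: dominant_share(allocations[job], capacity))


if __name__ == "__main__":
    capacity = {"cpu": 9, "memory": 18}
    allocations = {
        "job-a": {"cpu": 3, "memory": 12},  # memory-heavy: 12/18 of memory
        "job-b": {"cpu": 3, "memory": 1},   # CPU-heavy: 3/9 of CPU
    }
    for job, alloc in allocations.items():
        print(job, round(dominant_share(alloc, capacity), 3))
    print("next:", next_job(allocations, capacity))
```

Here job-a holds two thirds of cluster memory while job-b holds one third of cluster CPU, so job-b has the smaller dominant share and would be offered resources first.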

In addition to the above features, Volcano keeps pace with the latest versions of Kubernetes and Golang.

Volcano community and ecosystem progress

After more than a year of development, the Volcano community and ecosystem are growing quickly. So far, they have reached the following milestones:

  1. 80+ community contributors
  2. 15+ contributing organizations, including Huawei, Baidu, Tencent, AWS, IBM, and Oracle
  3. 1100+ GitHub stars and 220+ forks
  4. 7 code repositories and 6 releases
  5. 320+ issues and 590+ PRs
  6. Support for 10+ mainstream computing frameworks, including Spark, Argo, MPI, Flink, MXNet, PaddlePaddle, TensorFlow, MindSpore, and Cromwell
  7. Huawei Cloud CCE (Cloud Container Engine), CCI (Cloud Container Instance), ModelArts, and other cloud services use Volcano as their infrastructure base in commercial production. These services cover AI, big data, genomics computing, batch processing, and other scenarios, and are deeply integrated with Huawei Kunpeng and Ascend processors. Volcano can schedule and dispatch up to 1000 containers per second, making it a high-performance and cost-effective batch computing solution.

Learn more about Volcano

To learn more about Volcano, check out the following resources:

Volcano official website:

volcano.sh/

GitHub:

github.com/volcano-sh

Volcano introduction:

Github.com/volcano-sh/…

Volcano design:

Github.com/volcano-sh/…

Volcano Roadmap:

Github.com/volcano-sh/…

Volcano community WeChat group:

Volcano CN

Looking ahead

With the release of Volcano V1.0, the Volcano community will integrate more closely with the upstream and downstream ecosystem. Commercial applications built on Volcano will also help AI, big data, scientific computing, rendering, and other fields take full advantage of cloud computing, and help enterprises reach a new level in their digital transformation.

Looking to the future, Huawei Cloud will continue to invest in the cloud native field, drive innovation, grow the ecosystem, and help all industries move toward rapid, intelligent development.