Apache SINGA
Apache SINGA is an Apache top-level project for developing an open source machine learning library. It provides a flexible architecture for scalable distributed training, is extensible to run over a wide range of hardware, and has a focus on health-care applications.
Developer(s) | Apache Software Foundation |
---|---|
Initial release | October 8, 2015 |
Stable release | 3.1.0 / October 30, 2020 |
Written in | C++, Python, Java |
Operating system | Linux, macOS, Windows |
License | Apache License 2.0 |
Website | singa |
History
The SINGA project was initiated by the DB System Group at the National University of Singapore in 2014, in collaboration with the database group of Zhejiang University, to support complex analytics at scale and to make database systems more intelligent and autonomic.[1] It focused on distributed deep learning, partitioning the model and data across the nodes of a cluster and parallelizing the training.[2][3] The prototype was accepted by the Apache Incubator in March 2015 and graduated as a top-level project in October 2019. Nine versions have been released, as shown in the following table. Since V1.0, SINGA has been general enough to also support traditional machine learning models such as logistic regression. Companies including NetEase,[4] yzBigData, Shentilium and others use SINGA for their applications, including healthcare[5] and finance.
Version | Original release date | Latest version | Release date | |
---|---|---|---|---|
3.1.0 | 2020-10-30 | 3.1.0 | 2020-10-30 | |
3.0.0 | 2020-04-20 | 3.0.0 | 2020-04-20 | |
2.0.0 | 2019-04-20 | 2.0.0 | 2019-04-20 | |
1.2.0 | 2018-06-06 | 1.2.0 | 2018-06-06 | |
1.1.0 | 2017-02-12 | 1.1.0 | 2017-02-12 | |
1.0.0 | 2016-09-08 | 1.0.0 | 2016-09-08 | |
0.3.0 | 2016-04-20 | 0.3.0 | 2016-04-20 | |
0.2.0 | 2016-01-14 | 0.2.0 | 2016-01-14 | |
0.1.0 | 2015-10-08 | 0.1.0 | 2015-10-08 | |
Software Stack
SINGA's software stack includes three major components: core, IO and model. The following figure illustrates these components together with the hardware. The core component provides memory management and tensor operations. The IO component has classes for reading data from, and writing data to, disk and network. The model component provides data structures and algorithms for machine learning models, e.g., layers for neural network models, and optimizers, initializers, metrics and loss functions for general machine learning models.
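The division of responsibilities among the three components can be sketched with a toy example. This is only an illustration of the layering in plain Python: every name below (`dot`, `read_records`, `Linear`, `squared_error_grad`, `sgd_step`, `train`) is hypothetical and is not part of SINGA's actual API.

```python
# Toy illustration only: none of these names are SINGA's actual API.
import io


# "core": a minimal tensor operation on plain Python lists.
def dot(xs, ws):
    return sum(x * w for x, w in zip(xs, ws))


# "IO": read (features, label) records from a text stream.
def read_records(stream):
    for line in stream:
        *features, label = (float(v) for v in line.strip().split(","))
        yield features, label


# "model": a linear layer, a squared-error loss gradient, and an SGD update.
class Linear:
    def __init__(self, n_inputs):
        self.weights = [0.0] * n_inputs

    def forward(self, xs):
        return dot(xs, self.weights)


def squared_error_grad(pred, label, xs):
    # Gradient of 0.5 * (pred - label)^2 with respect to each weight.
    return [(pred - label) * x for x in xs]


def sgd_step(layer, grads, lr):
    layer.weights = [w - lr * g for w, g in zip(layer.weights, grads)]


def train(stream_text, n_inputs, epochs=200, lr=0.1):
    # The model code only sees records from IO and operations from core.
    layer = Linear(n_inputs)
    records = list(read_records(io.StringIO(stream_text)))
    for _ in range(epochs):
        for xs, label in records:
            grads = squared_error_grad(layer.forward(xs), label, xs)
            sgd_step(layer, grads, lr)
    return layer
```

Calling `train("1,0,2\n0,1,3\n1,1,5\n", n_inputs=2)` fits a two-weight linear model to three comma-separated records; in a real library the tensor operations would run on the selected hardware device rather than on Python lists.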
Benchmark for Distributed training
Workload: we use a deep convolutional neural network, ResNet-50, as the application. ResNet-50 is a 50-layer network for image classification. It requires 3.8 GFLOPs to pass a single image of size 224x224 through the network.
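The per-image cost scales linearly with the number of images, which gives a rough sense of the work in one forward pass over a batch. The sketch below is a back-of-the-envelope calculation only; the function names and the `device_gflops_per_second` parameter are assumptions introduced for illustration, and the time bound ignores the backward pass, memory traffic and communication.

```python
GFLOPS_PER_IMAGE = 3.8  # forward pass for one 224x224 image, from the text


def forward_gflops(num_images, gflops_per_image=GFLOPS_PER_IMAGE):
    """Total forward-pass cost in GFLOPs for a batch of images."""
    return num_images * gflops_per_image


def min_forward_seconds(num_images, device_gflops_per_second):
    """Lower bound on forward time if the device ran at the given rate.

    device_gflops_per_second is a hypothetical sustained rate supplied by
    the caller; it is not a figure taken from the benchmark.
    """
    return forward_gflops(num_images) / device_gflops_per_second
```

For example, `forward_gflops(32)` gives about 121.6 GFLOPs for a batch of 32 images.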
Hardware: we use p2.8xlarge instances from AWS, each of which has 8 Nvidia Tesla K80 GPUs, 96 GB of GPU memory in total, 32 vCPUs, 488 GB of main memory, and 10 Gbit/s of network bandwidth.
Metric: we measure the time per iteration for different numbers of workers to evaluate the scalability of SINGA. The batch size is fixed at 32 per GPU. A synchronous training scheme is applied, so the effective batch size is $32N$, where $N$ is the number of GPUs. We compare with a popular open source system that uses the parameter server topology, in which the first GPU is selected as the server. In the following figure, bars show the throughput and lines show the communication cost.
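The relationship between time per iteration, throughput and effective batch size can be sketched as follows. This is a minimal sketch, not the benchmark's code: the function names are hypothetical, any timings passed in are placeholders supplied by the caller, and the gradient functions use a one-parameter least-squares model only to show why averaging the gradients of $N$ equally sized per-GPU batches in a synchronous step matches one step over a batch of $32N$ samples.

```python
PER_GPU_BATCH = 32  # fixed per-GPU batch size, from the text


def effective_batch_size(num_gpus, per_gpu_batch=PER_GPU_BATCH):
    """Samples consumed by one synchronous iteration across all GPUs."""
    return per_gpu_batch * num_gpus


def throughput(num_gpus, seconds_per_iteration, per_gpu_batch=PER_GPU_BATCH):
    """Images processed per second across all GPUs."""
    return effective_batch_size(num_gpus, per_gpu_batch) / seconds_per_iteration


def scaling_efficiency(num_gpus, seconds_per_iteration, single_gpu_seconds):
    """Measured throughput divided by ideal linear scaling from one GPU."""
    ideal = num_gpus * throughput(1, single_gpu_seconds)
    return throughput(num_gpus, seconds_per_iteration) / ideal


def mean_gradient(samples, weight):
    """Gradient of the mean of 0.5 * (weight * x - y)^2 over the samples."""
    return sum((weight * x - y) * x for x, y in samples) / len(samples)


def synchronous_gradient(shards, weight):
    """Average the per-worker gradients, as one synchronous step would.

    Assumes every shard has the same number of samples, which holds when
    the per-GPU batch size is fixed.
    """
    return sum(mean_gradient(shard, weight) for shard in shards) / len(shards)
```

With 8 GPUs, `effective_batch_size(8)` is 256. A scaling efficiency of 1.0 would mean that the time per iteration did not grow as GPUs were added; communication cost pushes the value below 1.0.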
Rafiki
Rafiki[6] is a submodule of SINGA that provides a machine learning analytics service.
Using SINGA
To get started with SINGA, there are some tutorials available as Jupyter notebooks. The tutorials cover the following:
- Core classes
- Model classes
- Linear Regression
- Multi-layer Perceptron
- Convolutional Neural Network (CNN)
- Recurrent Neural Networks (RNN)
- Restricted Boltzmann Machine (RBM)
There is also an online course about SINGA.
See also
- List of Apache Software Foundation projects
- Comparison of deep learning software
References
- Wei, Wang; Meihui, Zhang; Gang, Chen; H.V., Jagadish; Beng Chin, Ooi; Kian-Lee, Tan; Sheng, Wang (June 2016). "Database Meets Deep Learning: Challenges and Opportunities". SIGMOD Record. 45 (2): 17–22. arXiv:1906.08986. doi:10.1145/3003665.3003669.
- Ooi, Beng Chin; Tan, Kian-Lee; Sheng, Wang; Wang, Wei; Cai, Qingchao; Chen, Gang; Gao, Jinyang; Luo, Zhaojing; Tung, Anthony K. H.; Wang, Yuan; Xie, Zhongle; Zhang, Meihui; Zheng, Kaiping (2015). "SINGA: A distributed deep learning platform" (PDF). ACM Multimedia. doi:10.1145/2733373.2807410. Retrieved 8 September 2016.
- Wei, Wang; Chen, Gang; Anh Dinh, Tien Tuan; Gao, Jinyang; Ooi, Beng Chin; Tan, Kian-Lee; Sheng, Wang (2015). "SINGA: putting deep learning in the hands of multimedia users" (PDF). ACM Multimedia. doi:10.1145/2733373.2806232. Retrieved 8 September 2016.
- 网易. "网易携手Apache SINGA角逐人工智能新战场_网易科技". tech.163.com. Retrieved 2017-06-03.
- "New app allows pre-diabetics to use photos of their meal to check if it is healthy". www.straitstimes.com. Retrieved 6 April 2019.
- Wang, Wei; Gao, Jinyang; Zhang, Meihui; Sheng, Wang; Chen, Gang; Khim Ng, Teck; Ooi, Beng Chin; Shao, Jie; Reyad, Moaz (2018). "Rafiki" (PDF). Proceedings of the VLDB Endowment. 12 (2): 128–140. arXiv:1804.06087. Bibcode:2018arXiv180406087W. doi:10.14778/3282495.3282499. Retrieved 9 January 2019.