Review: The best frameworks for machine learning and deep learning

TensorFlow, Spark MLlib, Scikit-learn, MXNet, Microsoft Cognitive Toolkit, and Caffe do the math

Over the past year I've reviewed half a dozen open source machine learning and/or deep learning frameworks: Caffe, Microsoft Cognitive Toolkit (aka CNTK 2), MXNet, Scikit-learn, Spark MLlib, and TensorFlow. If I had cast my net even wider, I might well have covered a few other popular frameworks, including Theano (a 10-year-old Python deep learning and machine learning framework), Keras (a deep learning front end for Theano and TensorFlow), and DeepLearning4j (deep learning software for Java and Scala on Hadoop and Spark). If you’re interested in working with machine learning and neural networks, you’ve never had a richer array of options.  

There's a difference between a machine learning framework and a deep learning framework. Essentially, a machine learning framework covers a variety of learning methods for classification, regression, clustering, anomaly detection, and data preparation, and it may or may not include neural network methods. A deep learning or deep neural network (DNN) framework covers a variety of neural network topologies with many hidden layers. These layers comprise a multistep process of pattern recognition. The more layers in the network, the more complex the features that can be extracted for clustering and classification.
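
To make the idea of stacked hidden layers concrete, here is a minimal sketch in plain Python rather than in any of the frameworks reviewed. The layer sizes, the tanh activation, and the helper names (`dense`, `make_layer`, `forward`) are my own assumptions for illustration, and the weights are random and untrained, so the outputs mean nothing. The point is only to show how each layer's output becomes the next layer's input.

```python
import math
import random

def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum per neuron, then tanh."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def make_layer(n_in, n_out, rng):
    """Random, untrained parameters for a layer with n_in inputs and n_out neurons."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def forward(x, layers):
    """Pass the input through every layer in turn, keeping each layer's output."""
    activations = [x]
    for weights, biases in layers:
        activations.append(dense(activations[-1], weights, biases))
    return activations

rng = random.Random(0)
# Hypothetical topology: 4 inputs, three hidden layers of 8 neurons, 3 outputs.
sizes = [4, 8, 8, 8, 3]
layers = [make_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]
activations = forward([0.5, -0.2, 0.1, 0.9], layers)

# Print the width of the representation at each depth of the network.
for depth, values in enumerate(activations):
    print(depth, len(values))
```

A real deep learning framework adds what this sketch leaves out: training by backpropagation, specialized layer types such as convolutions, and GPU execution.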

Caffe, CNTK, DeepLearning4j, Keras, MXNet, and TensorFlow are deep learning frameworks. Scikit-learn and Spark MLlib are machine learning frameworks. Theano straddles both categories.

In general, deep neural network computations run an order of magnitude faster on a GPU (specifically an Nvidia CUDA general-purpose GPU, for most frameworks) than on a CPU. Simpler machine learning methods, by contrast, usually don't need the speedup of a GPU.

While you can train DNNs on one or more CPUs, the training tends to be slow, and by slow I'm not talking about seconds or minutes. The more neurons and layers that need to be trained, and the more data available for training, the longer it takes. When the Google Brain team trained its language translation models for the new version of Google Translate in 2016, they ran their training sessions for a week at a time, on multiple GPUs. Without GPUs, each model training experiment would have taken months.
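
A back-of-the-envelope calculation shows why training cost climbs so quickly with network size and data volume. The sketch below assumes plain fully connected layers and counts only forward-pass multiply-adds. The two layer layouts and the 10,000-example training set are hypothetical numbers I picked for illustration, not measurements from any of the reviewed frameworks.

```python
def dense_param_count(sizes):
    """Weights plus biases for a fully connected network with the given layer sizes."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))

def multiply_adds_per_epoch(sizes, n_examples):
    """Rough forward-pass multiply-add count for one pass over the training set."""
    per_example = sum(n_in * n_out for n_in, n_out in zip(sizes, sizes[1:]))
    return per_example * n_examples

# Hypothetical networks: one small, one wider and deeper.
small = [100, 50, 10]
large = [100, 500, 500, 500, 10]
n_examples = 10_000  # assumed training set size

for name, sizes in (("small", small), ("large", large)):
    print(name, dense_param_count(sizes), multiply_adds_per_epoch(sizes, n_examples))
```

Under these assumptions the larger network has about 100 times as many parameters and needs about 100 times as many multiply-adds per pass over the data. Backpropagation, many passes over a much larger data set, and hyperparameter experiments multiply that cost further.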

Each of these packages has at least one distinguishing characteristic. Caffe's strength is convolutional DNNs for image recognition. Cognitive Toolkit has a separate evaluation library for deploying prediction models that works on ASP.NET websites. MXNet has excellent scalability for training on multi-GPU and multimachine configurations. Scikit-learn has a wide selection of robust machine learning methods and is easy to learn and use. Spark MLlib integrates with Hadoop and has excellent scalability for machine learning. TensorFlow has a unique diagnostic facility for its network graphs, TensorBoard.
