Key H2O.ai Solutions

Now that we know a bit about this solution, let’s look at H2O’s ecosystem of products and the main characteristics and applications of each one.

H2O

This platform is the company’s flagship product. The company is betting on it both as a desktop version for Machine Learning applications and as a distributed-processing version for large data volumes.

This version ships with ready-to-use, off-the-shelf algorithms such as boosting, linear and logistic regression, tree-based algorithms, and some algorithms trained with gradient-based optimization. Nothing too complex, but quite functional.

This version is ideal if you want to learn the tool without spending much time on installation or configuration before applying the algorithms, or if you want an initial test of the distributed processing functions on a cluster.
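To give an idea of how little setup this takes, here is a minimal sketch using the `h2o` Python package. It assumes the package and a Java runtime are installed; the toy dataset, the column names (`x1`, `x2`, `y`) and the helper names `make_toy_columns` and `train_gbm` are made up for illustration and are not part of H2O.

```python
import random


def make_toy_columns(n=200, seed=42):
    """Build a small synthetic binary-classification dataset as plain Python lists."""
    rng = random.Random(seed)
    x1 = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    x2 = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    # Label is 1 when x1 + x2 is positive, 0 otherwise.
    y = [1 if a + b > 0 else 0 for a, b in zip(x1, x2)]
    return {"x1": x1, "x2": x2, "y": y}


def train_gbm(columns):
    """Start a local H2O instance, train a small GBM, return (model, test AUC)."""
    # Imported lazily so the data helper above works without h2o installed.
    import h2o
    from h2o.estimators import H2OGradientBoostingEstimator

    h2o.init()  # starts or connects to a local single-node H2O cluster
    frame = h2o.H2OFrame(columns)
    frame["y"] = frame["y"].asfactor()  # treat the label as categorical
    train, test = frame.split_frame(ratios=[0.8], seed=1)

    model = H2OGradientBoostingEstimator(ntrees=20, max_depth=3, seed=1)
    model.train(x=["x1", "x2"], y="y", training_frame=train)
    auc = model.model_performance(test_data=test).auc()
    return model, auc


# Usage (requires the h2o package and a Java runtime):
#   model, auc = train_gbm(make_toy_columns())
```

The hyperparameters here (`ntrees=20`, `max_depth=3`) are arbitrary small values chosen so the sketch runs quickly, not recommendations.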

Below, a glimpse of the solution’s architecture:

[Figure: H2O architecture]

Sparkling Water

The main characteristic of this solution is that it uses H2O’s own algorithms while taking advantage of Spark’s distributed processing and integration features. All computation tasks can also run inside Spark using Scala, and a web interface is available.

This is the most recommended solution for building Machine Learning applications, whether as microservices or by embedding the entire algorithmic and computational part within a platform/system.
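As a rough sketch of what this integration looks like in practice, the snippet below moves data from a Spark DataFrame into H2O, trains an H2O model, and brings the predictions back to Spark. The text above mentions Scala; this sketch uses the Python binding (PySparkling) instead. It assumes a recent Sparkling Water release where `H2OContext.getOrCreate()` takes no arguments and the conversion methods are named `asH2OFrame` and `asSparkFrame` (older releases differ). The helper names, column names and toy data are invented for illustration.

```python
def make_rows(n=100):
    """Plain Python (x, y) rows following y = 2x + 1, to load into Spark."""
    return [(float(i), float(2 * i + 1)) for i in range(n)]


def train_on_spark_data(rows):
    """Train an H2O GLM on a Spark DataFrame and return predictions as a Spark DataFrame."""
    # Imported lazily: these need pyspark, h2o_pysparkling and h2o installed.
    from pyspark.sql import SparkSession
    from pysparkling import H2OContext
    from h2o.estimators import H2OGeneralizedLinearEstimator

    spark = SparkSession.builder.appName("sparkling-water-sketch").getOrCreate()
    hc = H2OContext.getOrCreate()  # starts H2O on top of the Spark session

    spark_df = spark.createDataFrame(rows, ["x", "y"])
    h2o_frame = hc.asH2OFrame(spark_df)  # Spark DataFrame -> H2OFrame

    model = H2OGeneralizedLinearEstimator(family="gaussian")
    model.train(x=["x"], y="y", training_frame=h2o_frame)

    predictions = model.predict(h2o_frame)
    return hc.asSparkFrame(predictions)  # H2OFrame -> Spark DataFrame


# Usage (inside a Spark environment with Sparkling Water available):
#   train_on_spark_data(make_rows()).show(5)
```

The point of the round trip is that data preparation can stay in Spark while model training uses H2O’s algorithms.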

[Figures: Sparkling Water]

Deep Water

Deep Water is the solution focused on Deep Learning with GPU-accelerated computation, leveraging frameworks such as TensorFlow, Theano, and Caffe, among others.

In this case, the H2O platform serves as the interface where all model training parameters (cross-validation, sampling, stopping criteria, hyperparameter tuning, etc.) are set, while the backend (TensorFlow, Theano, etc.) handles the processing on GPUs.

Steam

Steam is a platform that handles the whole link between Machine Learning models built with H2O and the development work of incorporating those models into applications, all collaboratively. It is very similar to Domino. Steam’s main advantage is that it abstracts the engineering behind deploying Machine Learning models to production, such as infrastructure management and auto-scaling according to request load, as well as some Data Science tasks like model retraining. It also greatly reduces IT costs and investments.

[Figure: Steam]

Now that we know H2O’s main products, we’ll soon have some posts with tutorials exploring this tool further.

Useful Links

Deep Water Github

Technical Documentation