Key Solutions from H2O.ai

Now that we know a bit about this solution, let’s understand the H2O solutions ecosystem and explore the main characteristics and applications of each.

H2O

This platform is the company’s flagship product. It serves both desktop Machine Learning applications and distributed processing of high-volume data.

This version includes ready-to-use, off-the-shelf algorithms such as boosting, linear and logistic regression, tree-based algorithms, and some algorithms that rely on gradient-based optimization. Nothing overly complex, but very functional.
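To make "algorithms that use gradient for optimization" concrete, here is a minimal sketch of logistic regression fitted by batch gradient descent, written in plain Python. It is not H2O's implementation and does not call the H2O API; the function names, the learning rate, the epoch count, and the toy data are all assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit a one-feature logistic regression by batch gradient descent on log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # derivative of log loss w.r.t. the logit
            grad_w += err * x
            grad_b += err
        # Step against the average gradient.
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    # Classify as 1 when the predicted probability reaches 0.5.
    return 1 if sigmoid(w * x + b) >= 0.5 else 0


# Toy, linearly separable data (illustrative only).
xs = [0.5, 1.0, 1.5, 2.0, 3.0, 3.5, 4.0, 4.5]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = fit_logistic(xs, ys)
print([predict(w, b, x) for x in xs])
```

A platform like H2O packages this kind of loop, along with distributed data handling, behind a ready-made estimator, so you normally never write it by hand.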

This version is ideal if you want to get to know the tool better without spending too much time on installation or configuration before applying algorithms, or even for an initial test of distributed cluster processing functions.

Below is a glimpse of the solution’s architecture:

[Figure: H2O architecture]

Sparkling Water

The main characteristic of this solution is its use of H2O’s own algorithms, but with the advantage of leveraging all of Spark’s distributed processing and integration features. In this solution, all computation tasks can also be performed within Spark using Scala and accessed via a Web interface.

This solution is best suited to building Machine Learning applications for microservices, or to embedding algorithmic and computational capabilities into a platform or system.

[Figures: h2ospark, h2ospark2]

Deep Water

Deep Water is the solution focused on Deep Learning with GPU-accelerated computation, supporting frameworks such as TensorFlow, Theano, and Caffe, among others.

In this scenario, H2O’s platform acts as the interface where all model training parameters (cross-validation, sampling, stopping criteria, hyperparameter tuning, etc.) are managed, while a backend such as TensorFlow or Theano handles the processing on GPUs.
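The split between an interface that manages training parameters and a backend that does the computation can be sketched in a few lines of plain Python. This is only an illustration of the idea, not Deep Water's code: `StoppingCriteria`, `train_with_early_stopping`, and `toy_backend` are hypothetical names, and the toy backend stands in for a real GPU framework.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class StoppingCriteria:
    # Hypothetical settings owned by the "frontend".
    max_epochs: int = 50
    patience: int = 3         # epochs without improvement before stopping
    min_delta: float = 1e-3   # smallest loss decrease that counts as improvement


def train_with_early_stopping(run_epoch: Callable[[int], float],
                              criteria: StoppingCriteria) -> List[float]:
    """Frontend loop: call the backend once per epoch and decide when to stop."""
    history: List[float] = []
    best = float("inf")
    stale = 0
    for epoch in range(criteria.max_epochs):
        loss = run_epoch(epoch)  # the backend does the heavy computation
        history.append(loss)
        if best - loss > criteria.min_delta:
            best, stale = loss, 0
        else:
            stale += 1
            if stale >= criteria.patience:
                break
    return history


def toy_backend(epoch: int) -> float:
    # Stand-in for a GPU framework: the loss falls, then plateaus at 0.2.
    return max(0.2, 1.0 - 0.2 * epoch)


history = train_with_early_stopping(toy_backend, StoppingCriteria())
print(len(history), history[-1])
```

Because the stopping logic only sees a function that returns a loss, the backend can be swapped without touching the parameter handling, which is the kind of separation described above.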

Steam

Steam is a platform that connects machine learning models built with H2O to development features for incorporating those models into applications. It operates collaboratively, similar to Domino.

Steam’s primary advantage is that it abstracts the engineering complexities of deploying machine learning models into production, such as infrastructure management and auto-scaling based on request load, as well as Data Science tasks like model retraining, while significantly reducing IT costs and investments.

[Figure: Steam]

Now that we know the main H2O products, we will soon have some posts with tutorials exploring this tool further.

Useful Links

Deep Water Github

Technical Documentation