Model Training Productionization with H2O REST API and Requests in Python

Building a model scoring application in production with H2O benefits greatly from the ability to export models as MOJOs, which allows large datasets to be scored in Spark without a native H2O dependency. The model training process, however, is still hard to productionize, given the gap between how data scientists work and the engineering requirements of a reliable system. H2O's RESTful API offers a way to standardize the model training process: the training application can run from any environment without installing H2O. The demo script (see https://github.com/jeffreyliu3230/h2o-api-demo) shows how to make simple API calls to an H2O cluster (which could run on a separate server) to import a sample CSV file, parse the file, create an H2OFrame, train a GBM model, and export the model to a MOJO file.
