Machine Learning in Stata Using H2O Reference Manual

Publisher: Stata Press
Copyright: 2025
ISBN-13: 978-1-59718-451-9
Pages: 361

Suggested citation:
StataCorp. 2025. Stata 19 Machine Learning in Stata Using H2O Reference Manual. College Station, TX: Stata Press.


Table of contents
Intro                     Introduction to machine learning and ensemble decision trees
h2oml                     Introduction to commands for Stata integration with H2O machine learning

H2O setup                 Prepare data for H2O analysis in Stata

h2oml gbm                 Gradient boosting machine for regression and classification
h2oml gbbinclass          Gradient boosting binary classification
h2oml gbmulticlass        Gradient boosting multiclass classification
h2oml gbregress           Gradient boosting regression

h2oml rf                  Random forest for regression and classification
h2oml rfbinclass          Random forest binary classification
h2oml rfmulticlass        Random forest multiclass classification
h2oml rfregress           Random forest regression

h2oml postestimation      Postestimation tools for h2oml gbm and h2oml rf

h2omlest                  Store and restore H2OML estimation results

h2omlestat aucmulticlass  Display AUC and AUCPR after multiclass classification
h2omlestat confmatrix     Display confusion matrix
h2omlestat cvsummary      Display cross-validation summary
h2omlestat gridsummary    Display grid-search summary
h2omlestat hitratio       Display hit-ratio table
h2omlestat metrics        Display performance metrics
h2omlestat threshmetric   Display threshold-based metrics for binary classification

h2omlexplore              Explore models after grid search

h2omlgof                  Compare goodness of fit for machine learning models

h2omlgraph ice            Produce individual conditional expectation plot
h2omlgraph pdp            Produce partial dependence plot
h2omlgraph prcurve        Produce precision–recall curve plot
h2omlgraph roc            Produce ROC curve plot
h2omlgraph scorehistory   Produce score history plot
h2omlgraph shapsummary    Produce SHAP beeswarm plot
h2omlgraph shapvalues     Produce SHAP values plot for individual observations
h2omlgraph varimp         Produce variable importance plot

h2omlpostestframe         Specify frame for postestimation analysis

h2omlselect               Select model after grid search

h2omltree                 Save decision tree DOT file and display rule set
DOT extension             Handling DOT files

encode_option             Encoding schemes for categorical predictors
metric_option             Classification and regression metrics

H2O option mapping        Mapping of H2OML estimation options to H2O
H2O reproducibility       Reproducibility in H2O


Glossary


Subject and author index