XGBOOST stands for eXtreme Gradient Boosting. A big brother of the earlier AdaBoost, XGB is a supervised learning algorithm that uses an ensemble of adaptively boosted decision trees. For those unfamiliar with adaptive boosting algorithms, here’s a 2-minute explanation video and a written tutorial. Although XGBOOST often performs well in predictive tasks, the training process can be quite time-consuming (similar to other bagging/boosting algorithms (e.g., random forest)).

In a recent blog, Analytics Vidhya compares the inner workings as well as the predictive accuracy of the XGBOOST algorithm to an upcoming boosting algorithm: Light GBM. The blog demonstrates a stepwise implementation of both algorithms in Python. The table below reflects the main conclusion of the comparison: Although the algorithms are comparable in terms of their predictive performance, light GBM is much faster to train. With continuously increasing data volumes, light GBM, therefore, seems the way forward.

Light GBM is also available in R:

devtools::install_github("Microsoft/LightGBM", subdir = "R-package")

 

 

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s