Boosting is an ensemble modeling technique that attempts to build a strong model from a number of weak models. In other words, boosting refers to a family of ensemble methods that convert weak learners into strong learners.
Initially a model is built from the training data; then a second model is built that tries to correct the errors of the first. Models are added in this way until the training dataset is predicted correctly or the maximum number of models has been added.
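To see this sequential improvement empirically, here is a small sketch (the dataset and all parameters are illustrative assumptions) using scikit-learn's AdaBoostClassifier, whose staged_score method reports the training accuracy after each boosting round:

```python
# Training accuracy tends to improve as more weak models are added.
# Dataset and parameters below are assumptions chosen for demonstration.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

X, y = make_classification(n_samples=200, n_features=5, random_state=0)
clf = AdaBoostClassifier(n_estimators=50, random_state=0).fit(X, y)

# staged_score yields the training accuracy after each boosting round
scores = list(clf.staged_score(X, y))
print(f"after 1 model: {scores[0]:.2f}, after 50 models: {scores[-1]:.2f}")
```

With enough rounds, the ensemble typically fits the training data much better than any single weak learner does on its own.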
TYPES OF BOOSTING ALGORITHMS
Some of the most common boosting algorithms are:
- AdaBoost (Adaptive Boosting)
- Gradient Boosting
- XGBoost (Extreme Gradient Boosting)
AdaBoost, short for Adaptive Boosting, is a very popular boosting algorithm developed for classification problems. It combines multiple weak models into a single strong model.
- Box1 consists of 10 data points of two classes, plus(+) and minus(-): 5 points are plus(+) and the other 5 are minus(-). Initially, each data point is assigned an equal weight. The first model tries to classify the data points in Box1 and generates a vertical separator line, but it wrongly classifies 3 plus(+) points.
- Box2 consists of the same 10 data points, where the 3 plus(+) points misclassified by the previous model are given higher weights so that the current model tries to classify them correctly. This model generates a vertical separator line that correctly classifies the previously misclassified plus(+) points, but in this attempt it wrongly classifies three minus(-) points.
- Box3 consists of the same 10 data points, where the 3 minus(-) points misclassified by the previous model are given higher weights so that the current model tries to classify them correctly. This model generates a horizontal separator line that correctly classifies the previously misclassified minus(-) points.
- Box4 combines the models from Box1, Box2, and Box3 to build a strong model whose predictions are better than those of any individual weak model.
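The reweighting idea in the boxes above can be sketched with decision stumps (depth-1 trees) as the weak models. This is a simplified illustration, not the exact AdaBoost weight update: the dataset, the number of rounds, and the doubling factor for misclassified points are all assumptions.

```python
# Sketch of the box-by-box reweighting idea: misclassified points get
# heavier weights, so the next weak model focuses on them.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=10, n_features=2, n_informative=2,
                           n_redundant=0, random_state=1)
w = np.full(len(y), 1.0 / len(y))   # Box1: equal weights to start

for rnd in range(3):                # three rounds, like Box1-Box3
    stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
    miss = stump.predict(X) != y
    print(f"round {rnd}: misclassified {miss.sum()} points")
    w[miss] *= 2.0                  # upweight the mistakes (assumed factor)
    w /= w.sum()                    # renormalize the weights
```

Real AdaBoost chooses the upweighting factor from each model's weighted error rate and combines the stumps with weighted votes, but the loop structure is the same.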
from sklearn.ensemble import AdaBoostClassifier
from sklearn.datasets import make_classification

X, Y = make_classification(n_samples=100, n_features=2, n_informative=2,
                           n_redundant=0, n_repeated=0, random_state=102)
clf = AdaBoostClassifier(n_estimators=4, random_state=0, algorithm='SAMME')
clf.fit(X, Y)
The gradient boosting algorithm trains many models sequentially. Each new model gradually minimizes the loss function of the whole ensemble using the gradient descent method: in every round, the new model is fit to the negative gradient of the loss. Gradient boosting requires a differentiable loss function and works for both classification and regression.
from sklearn.ensemble import GradientBoostingClassifier
model = GradientBoostingClassifier()

from sklearn.ensemble import GradientBoostingRegressor
model = GradientBoostingRegressor(n_estimators=3, learning_rate=1)
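Under squared-error loss, the residuals are exactly the negative gradient, so a regressor like the one above with n_estimators=3 and learning_rate=1 behaves roughly like the following simplified sketch (the synthetic data and the tree depth are assumptions):

```python
# Minimal sketch of gradient boosting for regression with squared error:
# each new tree is fit to the residuals (the negative gradient of the loss).
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel()               # assumed synthetic target

pred = np.zeros_like(y)             # start from a constant (zero) model
for _ in range(3):                  # n_estimators=3
    residual = y - pred             # negative gradient of squared error
    tree = DecisionTreeRegressor(max_depth=2).fit(X, residual)
    pred += 1.0 * tree.predict(X)   # learning_rate=1

mse = np.mean((y - pred) ** 2)
print(mse)
```

Each round shrinks the remaining error, which is why the ensemble's loss decreases as trees are added; smaller learning rates simply scale down each tree's contribution.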