Ensemble learning is the process in which multiple models, such as classifiers, are generated and combined to solve a particular computational problem. The ensemble learning techniques combine the decisions from multiple models to improve the overall performance.

Ensemble learning is one of the most effective ways to improve the predictive performance of an ML model. Ensemble techniques are used mainly for supervised tasks such as classification and regression.

Benefits of ensemble learning include:

  • Predictions are often more accurate than those of a single model.
  • Combining multiple simple models can produce a strong model that performs better than its individual members.

Different Ensemble techniques

Simple ensemble methods:

  • Max-Voting
  • Averaging
  • Weighted Averaging

Advanced ensemble methods:

  • Bagging
  • Boosting
  • Stacking
  • Blending


Max-voting is generally used for classification problems. It is one of the simplest ways of combining the predictions from multiple ML models.

In max voting, every base model makes a prediction for each sample, and each prediction counts as a vote. The class that receives the most votes becomes the final prediction for that sample.

For example, consider rating a movie (out of 5). Assume more than 50% of the people gave it a rating of 3, while the rest gave it a 4. Since the majority gave a rating of 3, the final rating is taken as 3.
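The movie-rating vote can be sketched in a few lines of Python. The function name `max_vote` and the five ratings are made up for illustration; ties are broken in favour of the label seen first, which is just one possible convention.

```python
from collections import Counter

def max_vote(predictions):
    """Return the label that received the most votes for one sample."""
    # most_common(1) gives [(label, count)] for the top label; on a tie,
    # Counter keeps the label that was encountered first.
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical ratings: three people vote 3, two people vote 4.
ratings = [3, 4, 3, 4, 3]
final_rating = max_vote(ratings)
print(final_rating)  # 3
```

In a real classifier ensemble, `predictions` would hold one predicted class per base model for the same sample.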


Averaging is used for regression problems and can also be used to combine predicted probabilities in classification problems. Predictions are collected from multiple models, and their average is used as the final prediction.

For example, if five people rate a movie 3, 4, 5, 4 and 3, the averaged rating is:

(3+4+5+4+3)/5 = 3.8
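Here is a minimal sketch of averaging in Python, covering both the regression case and the class-probability case mentioned above. The helper names and the probability values are assumptions made for the example.

```python
from statistics import mean

def average_predictions(predictions):
    """Average the numeric predictions of several models for one sample."""
    return mean(predictions)

def average_probabilities(prob_rows):
    """Average class probabilities column by column (one row per model)."""
    return [mean(column) for column in zip(*prob_rows)]

# The five ratings from the example above.
ratings = [3, 4, 5, 4, 3]
avg_rating = average_predictions(ratings)
print(avg_rating)  # 3.8

# Hypothetical probabilities for classes [A, B] from three models.
probs = [[0.6, 0.4], [0.8, 0.2], [0.4, 0.6]]
avg_probs = average_probabilities(probs)
# The class with the highest averaged probability is the final prediction.
best_class = avg_probs.index(max(avg_probs))
```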

Weighted Averaging

Weighted averaging is an extension of the averaging method. Each model is assigned a weight that defines its importance for the prediction.

For example, consider five movie ratings where some of the people are professional reviewers and the others have no experience rating movies. The professional reviewers' ratings are given a larger weight (0.23) than the others' (0.18):

3*0.18 + 4*0.23 + 5*0.23 + 4*0.23 + 3*0.18 = 4.07

Here 0.18 and 0.23 are the weights. They sum to 1.05 rather than 1, so dividing by the total weight gives the weighted average: 4.07/1.05 ≈ 3.88.
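The same calculation as a small Python sketch. Because the weights in this example add up to 1.05 rather than 1, the hypothetical helper `weighted_average` divides by the total weight so the result stays on the original 1 to 5 scale.

```python
def weighted_average(values, weights):
    """Weighted mean of `values`, normalised by the sum of `weights`."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight

ratings = [3, 4, 5, 4, 3]
weights = [0.18, 0.23, 0.23, 0.23, 0.18]
result = weighted_average(ratings, weights)
print(round(result, 2))  # 3.88
```

With equal weights this reduces to plain averaging and returns 3.8 for the same ratings.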


Bagging, also known as Bootstrap Aggregating, is a technique that uses bootstrapped subsets of the data to get a fair idea of the distribution of the complete set. A model is trained on each subset, and the models' predictions are combined. The subsets created for bagging may be smaller than the original set.

Bootstrapping is a sampling technique in which subsets of observations are drawn from the original dataset with replacement. In standard bootstrapping, each subset is the same size as the original set.
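Sampling with replacement is easy to try out with Python's standard library. This is only a sketch: the function name `bootstrap_sample`, the toy data and the fixed seed are assumptions for illustration.

```python
import random

def bootstrap_sample(data, size=None, rng=random):
    """Draw `size` observations from `data` at random, with replacement."""
    if size is None:
        size = len(data)  # standard bootstrap: same size as the original set
    return rng.choices(data, k=size)

rng = random.Random(0)  # fixed seed so the run is repeatable
data = [10, 20, 30, 40, 50]
sample = bootstrap_sample(data, rng=rng)           # same size as data
smaller = bootstrap_sample(data, size=3, rng=rng)  # smaller subset, as bagging allows
print(sample, smaller)
```

Because observations are drawn with replacement, a sample will usually contain some values more than once and miss others entirely.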


Algorithm: Bagging


Input:

  • Training data D with correct labels w_i ∈ Ω = {w_1, ..., w_C} representing C classes.
  • Weak learning algorithm WeakLearn.
  • Integer T specifying the number of iterations.
  • Percent (or fraction) P to create bootstrapped training data.

Do t = 1, ..., T:

  • Take a bootstrapped replica Dt by randomly drawing P percent of D.
  • Call WeakLearn with Dt and receive the hypothesis (classifier) ht.
  • Add ht to the ensemble E.
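The loop above can be sketched in plain Python. Several pieces here are assumptions rather than part of the algorithm as listed: the weak learner is a simple one-feature threshold classifier (`train_stump`), the data is a toy 1-D set with labels 0 and 1, and the ensemble is combined by simple majority voting at prediction time.

```python
import random
from collections import Counter

def train_stump(sample):
    """Hypothetical WeakLearn: a threshold classifier on (x, label) pairs."""
    best = None
    for threshold, _ in sample:
        for left_label in (0, 1):
            # Count mistakes when predicting left_label for x <= threshold.
            errors = sum(
                (left_label if x <= threshold else 1 - left_label) != y
                for x, y in sample
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, left_label)
    _, threshold, left_label = best
    return lambda x: left_label if x <= threshold else 1 - left_label

def bagging(data, weak_learn, T, P, rng):
    """Build an ensemble E of T hypotheses from bootstrapped replicas of D."""
    n = max(1, round(P * len(data)))
    ensemble = []
    for _ in range(T):
        replica = rng.choices(data, k=n)      # D_t, drawn with replacement
        ensemble.append(weak_learn(replica))  # add h_t to E
    return ensemble

def predict(ensemble, x):
    """Combine the hypotheses by majority vote (assumed combination rule)."""
    return Counter(h(x) for h in ensemble).most_common(1)[0][0]

rng = random.Random(42)
# Toy data: class 0 for x in 0..9, class 1 for x in 10..19.
data = [(x, 0) for x in range(10)] + [(x, 1) for x in range(10, 20)]
ensemble = bagging(data, train_stump, T=11, P=0.8, rng=rng)
print(predict(ensemble, 2), predict(ensemble, 17))
```

An odd T is used so that a two-class majority vote cannot end in a tie.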



Boosting is a sequential process, where each subsequent model attempts to correct the errors of the previous model. The succeeding models are dependent on the previous model in the process.

Similar to bagging, boosting creates an ensemble of classifiers by resampling the data, and the classifiers are then combined by majority voting (a weighted vote in variants such as AdaBoost).

The boosting algorithm combines a number of weak models (learners) to form a strong model (learner). An individual model may not perform well on the entire dataset but works well on some part of it. Thus, each model boosts the performance of the ensemble.
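To make the sequential idea concrete, here is a minimal AdaBoost-style sketch. Note that it swaps in a specific variant: instead of resampling, it reweights the training samples after each round, and it combines the weak learners with a weighted vote. The threshold "stumps", the toy data and the function names are assumptions for illustration.

```python
import math

def stump_predict(x, threshold, polarity):
    """Predict `polarity` for x <= threshold, otherwise the opposite label."""
    return polarity if x <= threshold else -polarity

def fit_stump(xs, ys, weights):
    """Pick the threshold and polarity with the lowest weighted error."""
    best = None
    for threshold in xs:
        for polarity in (1, -1):
            error = sum(
                w for x, y, w in zip(xs, ys, weights)
                if stump_predict(x, threshold, polarity) != y
            )
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best

def adaboost(xs, ys, rounds):
    n = len(xs)
    weights = [1.0 / n] * n  # start with equal sample weights
    ensemble = []            # list of (alpha, threshold, polarity)
    for _ in range(rounds):
        error, threshold, polarity = fit_stump(xs, ys, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - error) / error)
        ensemble.append((alpha, threshold, polarity))
        # Misclassified samples gain weight, so the next stump focuses on them.
        weights = [
            w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
            for x, y, w in zip(xs, ys, weights)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    """Weighted vote of all stumps; labels are +1 and -1."""
    score = sum(a * stump_predict(x, t, p) for a, t, p in ensemble)
    return 1 if score >= 0 else -1

# Toy data that no single threshold can separate: +1, then -1, then +1.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1]
ensemble = adaboost(xs, ys, rounds=3)
print([predict(ensemble, x) for x in xs] == ys)  # True
```

A single stump gets at least three of these nine points wrong, while the three stumps together classify all of them correctly, which is the "weak learners form a strong learner" idea in miniature.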


Stacking, also known as Stacked Generalization, is an ensemble learning technique that uses predictions from multiple models to build a new model.

It uses a meta-learning algorithm to learn how to best combine the predictions from multiple base machine learning algorithms.

The architecture of a stacking model involves two or more base models, and a meta-model that combines the predictions of the base models.

  • Base models: models that are fit on the training data and whose predictions are compiled.
  • Meta-model: a model that learns how best to combine the predictions of the base models.
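A rough sketch of this architecture for a regression problem is shown below. Everything specific in it is an assumption chosen to keep the example small: the two base models (a mean predictor and a straight-line fit), the meta-model (a least-squares weighting of the two base predictions with a tiny ridge term for numerical safety), the 3-fold split and the toy data. The meta-model is trained on out-of-fold predictions, so it only sees predictions for samples the base models were not fit on.

```python
from statistics import mean

def fit_mean(xs, ys):
    """Base model 1: always predict the mean of the training targets."""
    m = mean(ys)
    return lambda x: m

def fit_line(xs, ys):
    """Base model 2: least-squares line (assumes xs are not all equal)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)

def fit_meta(features, ys, ridge=1e-9):
    """Meta-model: weights for two base predictions via 2x2 normal equations."""
    a = sum(f[0] * f[0] for f in features) + ridge
    b = sum(f[0] * f[1] for f in features)
    d = sum(f[1] * f[1] for f in features) + ridge
    p = sum(f[0] * y for f, y in zip(features, ys))
    q = sum(f[1] * y for f, y in zip(features, ys))
    det = a * d - b * b
    return ((d * p - b * q) / det, (a * q - b * p) / det)

def stack(xs, ys, base_fitters, k=3):
    n = len(xs)
    out_of_fold = [None] * n
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        held_out = [i for i in range(n) if i % k == fold]
        models = [fit([xs[i] for i in train], [ys[i] for i in train])
                  for fit in base_fitters]
        for i in held_out:
            # Predictions for samples the base models did not train on.
            out_of_fold[i] = tuple(m(xs[i]) for m in models)
    weights = fit_meta(out_of_fold, ys)
    final_models = [fit(xs, ys) for fit in base_fitters]  # refit on all data
    return lambda x: sum(w * m(x) for w, m in zip(weights, final_models))

# Toy data on the line y = 2x + 1.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8]
ys = [2 * x + 1 for x in xs]
model = stack(xs, ys, [fit_mean, fit_line])
print(round(model(10), 2))  # 21.0
```

On this noise-free data the meta-model learns to rely almost entirely on the line fit, which is why the stacked prediction at x = 10 matches 2*10 + 1.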


Blending is an ensemble ML technique that uses a machine learning model to learn how to best combine the predictions from multiple contributing ensemble member models.

It differs slightly from stacking: instead of fitting the meta-model on out-of-fold predictions made by the base models, blending fits it on predictions made on a holdout dataset.

Blending follows the same approach as stacking but uses only a holdout set taken from the training set to make predictions. The holdout set and the predictions on it are used to build a model, which is then run on the test set.
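The holdout idea can be sketched as follows. As assumptions for the example, the base models are a mean predictor and a straight-line fit, and the meta-model is a single blend weight `w` chosen by least squares on the holdout predictions; the 30% holdout fraction, the seed and the toy data are also made up.

```python
import random
from statistics import mean

def fit_mean(xs, ys):
    """Base model 1: always predict the mean of the training targets."""
    m = mean(ys)
    return lambda x: m

def fit_line(xs, ys):
    """Base model 2: least-squares line (assumes xs are not all equal)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)

def blend(xs, ys, holdout_fraction=0.3, seed=0):
    # Shuffle indices, then keep the last part aside as the holdout set.
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    split = int(len(indices) * (1 - holdout_fraction))
    fit_idx, hold_idx = indices[:split], indices[split:]

    # Base models see only the non-holdout part of the training data.
    fit_x = [xs[i] for i in fit_idx]
    fit_y = [ys[i] for i in fit_idx]
    h1, h2 = fit_mean(fit_x, fit_y), fit_line(fit_x, fit_y)

    # Meta-model: the w minimising sum((y - (w*h1 + (1-w)*h2))^2) on the holdout.
    diffs = [h1(xs[i]) - h2(xs[i]) for i in hold_idx]
    denom = sum(d * d for d in diffs)
    numer = sum((ys[i] - h2(xs[i])) * d for i, d in zip(hold_idx, diffs))
    w = numer / denom if denom else 0.0
    return lambda x: w * h1(x) + (1 - w) * h2(x)

# Toy data on the line y = 2x + 1.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8]
ys = [2 * x + 1 for x in xs]
model = blend(xs, ys)
print(round(model(10), 2))  # 21.0
```

Compared with stacking, this needs only one fit per base model, at the cost of training the meta-model on fewer samples.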
