# Random Forest

The random forest is one of the most widely used machine learning algorithms. It works by building multiple decision trees and merging them into a 'forest' that ultimately outputs a single result. The random forest owes much of its popularity to its flexibility and ease of use: it works well on both classification and regression problems.
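As a minimal sketch of the idea, the example below uses scikit-learn's `RandomForestClassifier` (an assumption about the library; the article does not name one) on an invented synthetic dataset. Each tree in the forest votes, and the forest's single output is the majority class.

```python
# Minimal random forest sketch using scikit-learn (assumed library).
# The synthetic dataset is invented purely for illustration.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Generate a small synthetic classification dataset.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Each of the 100 trees votes; the forest returns the majority class.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)
accuracy = forest.score(X_test, y_test)
```

The same interface supports regression via `RandomForestRegressor`, which averages the trees' predictions instead of taking a majority vote.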

As mentioned earlier, random forests are made up of multiple decision trees. A decision tree is built around one initial question. From this initial question, follow-on questions are asked until a single result is reached. These follow-on questions are the decision nodes of the tree, and their purpose is to split the data into progressively more specific subsets. The path taken at each node depends on whether the data satisfies the node's condition or follows the alternative branch. The final decision at the end of each path is called a leaf node. Each tree seeks the splits that form the most accurate subsets, which is done by training the tree with the Classification and Regression Tree (CART) algorithm.
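The decision-node and leaf-node structure described above can be made concrete with a single tree. The sketch below, assuming scikit-learn (whose `DecisionTreeClassifier` implements an optimised version of CART), prints the tree's rules: each indented condition is a decision node, and each line marked `class:` is a leaf node holding a final decision. The Iris dataset is used only as a convenient illustration.

```python
# Sketch of a single CART-style decision tree using scikit-learn (assumed).
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(iris.data, iris.target)

# Each indented condition is a decision node (a yes/no question about a
# feature); lines marked "class:" are leaf nodes holding the final decision.
rules = export_text(tree, feature_names=iris.feature_names)
print(rules)
```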

A random forest has three main hyperparameters that must be determined and set before the model can be trained or tested: node size, the number of trees, and the number of features sampled. Node size controls how far a single node can be split; more splitting means more subsets into which the data can be classified. The number of trees refers to how many independent decision trees make up the forest: the more trees present, the larger the classifier. Finally, the number of features sampled plays an integral role because it helps determine the depth of each tree. Beyond these three main hyperparameters, others may need to be tuned to maximise the output and accuracy of the model. These are case-by-case and require a sound understanding of the dataset, and potentially running the random forest algorithm over the dataset several times.
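The three hyperparameters above can be sketched in scikit-learn terms (an assumption about the library; other implementations use different names): `min_samples_leaf` relates to node size, `n_estimators` is the number of trees, and `max_features` is the number of features sampled at each split. The values below are illustrative, not recommendations.

```python
# Sketch of the three main hyperparameters, mapped onto scikit-learn's
# (assumed) parameter names. The dataset and values are for illustration only.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=300, n_features=10, random_state=1)

forest = RandomForestClassifier(
    n_estimators=50,     # number of trees in the forest
    max_features=3,      # number of features sampled at each split
    min_samples_leaf=5,  # node size: minimum samples allowed in a leaf node
    random_state=1,
)
forest.fit(X, y)
```

In practice these values are usually chosen by searching over candidates (for example with cross-validation) rather than set by hand, which is what the case-by-case tuning above amounts to.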