**INTRODUCTION**

- Cost functions and loss functions are both also known as error functions.
- A loss function is a method of evaluating “how well your algorithm models your dataset”. If your predictions are far off, the loss function outputs a higher number; if they are close, it outputs a lower number. As you tune your algorithm to improve your model, the loss tells you whether you are improving or not. In short, the loss tells us how much the predicted value differs from the actual value.
- A cost function is a measure of the error between the value your model predicts and the actual value. For example, say we wish to predict the value (yi) for data point (xi): the error is the gap between the model's prediction for xi and yi.

**Loss Function**

In regression models, our neural network has _one output node_ for every continuous value we are trying to predict. Regression losses are calculated by directly comparing the output value with the true value. The most popular loss function for regression models is the _mean squared error_ (MSE) loss function. Here we simply calculate the square of the difference between Y and Y-pred and average this over all the data. Suppose there are n data points:

**MSE = 1/n (Sum of (Y − Y-pred)² for ‘n’ data points)**
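The mean squared error described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a library implementation: the function name `mean_squared_error` and the sample values are assumptions made for the example.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between Y and Y-pred."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one data point is required")
    n = len(y_true)
    # Square each difference, sum the squares, then divide by n.
    return sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred)) / n


# Hypothetical data: four true values and four predictions.
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.0, 5.0, 4.0, 8.0]
mse = mean_squared_error(y_true, y_pred)  # squared differences: 1, 0, 4, 1
```

A perfect set of predictions gives an MSE of 0, and the value grows as predictions move away from the true values.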

The cost function, by contrast, is computed over the entire training set (or over a mini-batch, for mini-batch gradient descent).

The loss function (or error) is defined for a single training example. The cost function can also contain regularization terms in addition to the loss function, though not always.
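To make the distinction concrete, here is a small sketch of a cost that adds a regularization term to the averaged loss. The text does not say which regularizer is meant, so the L2 penalty (sum of squared weights scaled by `lam`) is an assumption chosen for illustration, as are the names `squared_loss` and `regularized_cost`.

```python
def squared_loss(y, y_hat):
    """Loss for a single training example."""
    return (y - y_hat) ** 2


def regularized_cost(y_true, y_pred, weights, lam=0.0):
    """Average loss over all examples plus an optional L2 penalty.

    The L2 penalty is an assumed choice of regularizer; with lam=0.0
    the cost reduces to the plain average of the losses.
    """
    m = len(y_true)
    data_term = sum(squared_loss(y, y_hat) for y, y_hat in zip(y_true, y_pred)) / m
    penalty = lam * sum(w ** 2 for w in weights)
    return data_term + penalty


# Hypothetical values: two examples and two model weights.
y_true = [1.0, 2.0]
y_pred = [1.0, 4.0]
weights = [1.0, -2.0]

plain_cost = regularized_cost(y_true, y_pred, weights)              # no penalty
penalized_cost = regularized_cost(y_true, y_pred, weights, lam=0.5)  # with penalty
```

The per-example losses are the same in both calls. Only the cost changes when the regularization term is switched on.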

**Cost function (J) = 1/m (Sum of the loss for ‘m’ examples)**

The cost function is the average of the losses. You first calculate the loss for each data point, based on your prediction and your ground-truth label. Then you average these losses, which gives you the cost. For one training cycle or epoch over the whole training set, you calculate m losses for m training instances but only one cost, and that cost is what drives your parameter update.
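The sequence above (one loss per example, one averaged cost, one parameter update) can be sketched as a single gradient-descent step. The one-parameter model `y_hat = w * x`, the squared loss, the data, and the learning rate are all assumptions made for illustration.

```python
def per_example_losses(w, xs, ys):
    """One squared loss per training example for the model y_hat = w * x."""
    return [(y - w * x) ** 2 for x, y in zip(xs, ys)]


def cost(w, xs, ys):
    """Single number: the average of the per-example losses."""
    losses = per_example_losses(w, xs, ys)
    return sum(losses) / len(losses)


def cost_gradient(w, xs, ys):
    """Derivative of the cost with respect to w: mean of -2 * x * (y - w * x)."""
    return sum(-2 * x * (y - w * x) for x, y in zip(xs, ys)) / len(xs)


# Hypothetical training set with three examples (true relationship: y = 2x).
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
w = 0.0
learning_rate = 0.05

losses = per_example_losses(w, xs, ys)   # three losses, one per example
cost_before = cost(w, xs, ys)            # one cost for the whole set
w = w - learning_rate * cost_gradient(w, xs, ys)  # one parameter update
cost_after = cost(w, xs, ys)
```

With these values the update moves `w` toward 2, so the cost after the step is lower than the cost before it.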