The tanh (hyperbolic tangent) function is non-linear: for any input from –infinity to +infinity, it always produces a value between –1 and +1. As x approaches –infinity, tanh(x) approaches –1, and as x approaches +infinity, it approaches +1. The tanh function is mainly used for classification between two classes.


The function is defined as the ratio of two exponential expressions, with numerator e^x – e^(-x) and denominator e^x + e^(-x):

tanh(x) = (e^x – e^(-x)) / (e^x + e^(-x))

Tanh is often described as a scaled and shifted version of the sigmoid function. Like sigmoid, tanh also suffers from the vanishing gradient problem.

f(x) = tanh(x) = 2/(1 + e^(-2x)) – 1

tanh(x) = 2 * sigmoid(2x) – 1
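The identity above can be checked numerically. This is a minimal sketch using only the standard library; the function names `sigmoid` and `tanh_via_sigmoid` are illustrative, not from the article:

```python
import math

def sigmoid(x):
    """Standard logistic sigmoid: 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def tanh_via_sigmoid(x):
    """tanh expressed as a scaled, shifted sigmoid: 2*sigmoid(2x) - 1."""
    return 2.0 * sigmoid(2.0 * x) - 1.0

# The two formulations agree to floating-point precision.
for x in [-3.0, -1.0, 0.0, 0.5, 2.0]:
    assert abs(math.tanh(x) - tanh_via_sigmoid(x)) < 1e-12
```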


Tanh is commonly used in the hidden layers of a neural network. Because its values lie between –1 and 1, the mean of a hidden layer's outputs comes out at or very close to 0. This centers the data around 0, which makes learning for the next layer much easier.
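The zero-centering effect can be illustrated with a quick simulation. The pre-activation values below are hypothetical (drawn uniformly at random, symmetric around 0), purely to show that tanh outputs average out near zero:

```python
import math
import random

random.seed(0)

# Hypothetical hidden-layer pre-activations, symmetric around 0
# (illustrative data, not from the article).
pre_activations = [random.uniform(-2.0, 2.0) for _ in range(10000)]

# Pass them through tanh and check the mean of the outputs.
outputs = [math.tanh(z) for z in pre_activations]
mean = sum(outputs) / len(outputs)
print(round(mean, 3))  # close to 0, since tanh is odd and inputs are symmetric
```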

The derivatives of tanh are larger than the derivatives of the sigmoid. In other words, gradient descent can minimize the cost function faster when tanh is used as the activation function.
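The derivative comparison follows from the closed forms tanh'(x) = 1 – tanh(x)^2 (peak value 1 at x = 0) and sigmoid'(x) = sigmoid(x)(1 – sigmoid(x)) (peak value 0.25 at x = 0). A small sketch to confirm the peak values:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    """Derivative of sigmoid: s(x) * (1 - s(x)); peaks at 0.25 when x = 0."""
    s = sigmoid(x)
    return s * (1.0 - s)

def tanh_grad(x):
    """Derivative of tanh: 1 - tanh(x)^2; peaks at 1.0 when x = 0."""
    return 1.0 - math.tanh(x) ** 2

print(tanh_grad(0.0), sigmoid_grad(0.0))  # 1.0 0.25
```

At every x, tanh's gradient is exactly four times the sigmoid's gradient evaluated at 2x, which is why the tanh gradient is larger near the origin.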

The range of -1 to 1, compared with sigmoid's 0 to 1, makes the function more convenient for neural networks.


Tanh also has some drawbacks:

  • The error surface can be very flat near the origin, so initializing with very small weights should be avoided. 
  • It suffers from the vanishing gradient problem, and sometimes the exploding gradient problem. 
  • Convergence behavior may vary depending on variation in the data. 
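The vanishing gradient problem mentioned above can be made concrete with a toy chain-rule calculation: when a tanh unit is saturated (large |x|), its local gradient 1 – tanh(x)^2 is tiny, and multiplying such gradients across many layers drives the overall gradient toward zero. This sketch uses a hypothetical 10-layer chain with every unit saturated at x = 3:

```python
import math

def tanh_grad(x):
    """Derivative of tanh: 1 - tanh(x)^2."""
    return 1.0 - math.tanh(x) ** 2

# Toy backpropagation through 10 saturated tanh units (hypothetical setup).
# Each local gradient is tanh_grad(3.0) ~ 0.0099, so their product vanishes.
grad = 1.0
for _ in range(10):
    grad *= tanh_grad(3.0)

print(grad)  # vanishingly small after only 10 layers
```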
