What is an activation function?
An activation function is a mathematical function applied to the output of a neuron in a neural network. It introduces non-linearity, enabling the network to learn and model complex patterns. Common activation functions include:
- Sigmoid: Maps inputs to a range between 0 and 1. $\sigma(x) = \frac{1}{1 + e^{-x}}$.
- Tanh (Hyperbolic Tangent): Maps inputs to a range between -1 and 1. Formula: $\tanh(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}}$.
- ReLU (Rectified Linear Unit): Outputs the input directly if positive; otherwise, outputs 0. Formula: $\text{ReLU}(x) = \max(0, x)$.
- Leaky ReLU: Similar to ReLU but allows a small, non-zero gradient for negative inputs. Formula: $\text{Leaky ReLU}(x) = \max(0.01x, x)$.
- Softmax: Used in the output layer for multi-class classification, converting inputs into probabilities. Formula: $\text{Softmax}(x_i) = \frac{e^{x_i}}{\sum_j e^{x_j}}$.
These functions help neural networks learn and generalize from data.
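The formulas above translate directly into code. Below is a minimal NumPy sketch of each activation function (the function names and the `alpha=0.01` slope for Leaky ReLU are illustrative choices, not a reference to any particular library's API):

```python
import numpy as np

def sigmoid(x):
    # Maps each input to (0, 1): 1 / (1 + e^{-x})
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # Maps each input to (-1, 1)
    return np.tanh(x)

def relu(x):
    # Passes positive inputs through unchanged, zeroes out negatives
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # Like ReLU, but keeps a small slope (alpha) for negative inputs
    return np.maximum(alpha * x, x)

def softmax(x):
    # Converts a vector of scores into probabilities that sum to 1.
    # Subtracting the max improves numerical stability without changing the result.
    e = np.exp(x - np.max(x))
    return e / e.sum()

if __name__ == "__main__":
    x = np.array([-2.0, -0.5, 0.0, 1.0, 3.0])
    print("sigmoid:   ", sigmoid(x))
    print("tanh:      ", tanh(x))
    print("ReLU:      ", relu(x))
    print("Leaky ReLU:", leaky_relu(x))
    print("softmax:   ", softmax(x))
```

Running the script applies each function element-wise to the same input vector, which makes the differences easy to see: sigmoid and tanh squash values into bounded ranges, ReLU and Leaky ReLU clip or dampen negatives, and softmax returns a probability distribution over the whole vector.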