What is an activation function?
An activation function is a mathematical function applied to the output of a neuron in a neural network. It introduces non-linearity, enabling the network to learn and model complex patterns. Common activation functions include:
- Sigmoid: Maps inputs to a range between 0 and 1. $\sigma(x) = \frac{1}{1 + e^{-x}}$.
- Tanh (Hyperbolic Tangent): Maps inputs to a range between -1 and 1. Formula: $\tanh(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}}$.
- ReLU (Rectified Linear Unit): Outputs the input directly if positive; otherwise, outputs 0. Formula: $\text{ReLU}(x) = \max(0, x)$.
- Leaky ReLU: Similar to ReLU but allows a small, non-zero gradient for negative inputs. Formula: $\text{Leaky ReLU}(x) = \max(0.01x, x)$.
- Softmax: Used in the output layer for multi-class classification, converting inputs into probabilities. Formula: $\text{Softmax}(x_i) = \frac{e^{x_i}}{\sum_j e^{x_j}}$.
These functions help neural networks learn and generalize from data.
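The formulas above translate directly into code. Below is a minimal NumPy sketch of each activation function (the function names and the `alpha=0.01` slope for Leaky ReLU are illustrative choices, not a reference to any particular library's API):

```python
import numpy as np

def sigmoid(x):
    # Maps each input to (0, 1): 1 / (1 + e^{-x})
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # Maps each input to (-1, 1)
    return np.tanh(x)

def relu(x):
    # Passes positive inputs through unchanged, zeroes out negatives
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # Like ReLU, but keeps a small slope (alpha) for negative inputs
    return np.maximum(alpha * x, x)

def softmax(x):
    # Converts a vector of scores into probabilities that sum to 1.
    # Subtracting the max improves numerical stability without changing the result.
    e = np.exp(x - np.max(x))
    return e / e.sum()

if __name__ == "__main__":
    x = np.array([-2.0, -0.5, 0.0, 1.0, 3.0])
    print("sigmoid:   ", sigmoid(x))
    print("tanh:      ", tanh(x))
    print("ReLU:      ", relu(x))
    print("Leaky ReLU:", leaky_relu(x))
    print("softmax:   ", softmax(x))
```

Running the script applies each function element-wise to the same input vector, which makes the differences easy to see: sigmoid and tanh squash values into bounded ranges, ReLU and Leaky ReLU clip or dampen negatives, and softmax returns a probability distribution over the whole vector.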