10 - Logistic Regression
10.1 - Introduction
Logistic regression is a discriminative classification model \(p(y|\boldsymbol{x};\boldsymbol{\theta})\), where \(\boldsymbol{x}\) is a fixed-dimensional input vector, \(y \in \{ 1, \dots, C \}\) is the class label, and \(\boldsymbol{\theta}\) are the parameters. If \(C = 2\), this is known as binary logistic regression. If \(C > 2\), it is known as multinomial logistic regression or multiclass logistic regression.
10.2 - Binary logistic regression
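Binary logistic regression, with \(y \in \{0, 1\}\), corresponds to the following model:
\[ p(y|\boldsymbol{x};\boldsymbol{\theta}) = \mathrm{Ber}(y \mid \sigma(\boldsymbol{\omega}^\top \boldsymbol{x} + b)) \]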
where \(\sigma\) is the sigmoid function, \(\boldsymbol{\omega}\) are the weights, \(b\) is the bias, and \(\boldsymbol{\theta} = (\boldsymbol{\omega}, b)\) are all the parameters. In other words,
\[ p(y=1|\boldsymbol{x};\boldsymbol{\theta}) = \sigma(a) = \frac{1}{1 + e^{-a}} \]
where \(a = \boldsymbol{\omega}^\top \boldsymbol{x} + b = \log\left(\frac{p}{1-p}\right)\) is the log-odds, and \(p = p(y=1|\boldsymbol{x};\boldsymbol{\theta})\). In ML, the quantity \(a\) is usually called the logit or the pre-activation.
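The relationship between the logit \(a\), the sigmoid, and the log-odds can be checked numerically. The following is a minimal sketch in plain Python, not a reference implementation: the function names `sigmoid` and `predict_proba` and the values of `w`, `b`, and `x` are assumptions made up for illustration.

```python
import math


def sigmoid(a):
    """Logistic function 1 / (1 + exp(-a)), written to avoid overflow."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    z = math.exp(a)  # safe: a < 0, so exp(a) cannot overflow
    return z / (1.0 + z)


def predict_proba(x, w, b):
    """Return p(y=1 | x; theta) with theta = (w, b)."""
    a = sum(wi * xi for wi, xi in zip(w, x)) + b  # logit / pre-activation
    return sigmoid(a)


# Hypothetical parameters and input, chosen only for illustration.
w = [0.5, -1.0]
b = 0.25
x = [2.0, 1.0]

a = sum(wi * xi for wi, xi in zip(w, x)) + b  # logit
p = predict_proba(x, w, b)                    # p(y=1 | x; theta)
log_odds = math.log(p / (1.0 - p))            # should recover the logit a

print(f"a = {a}, p = {p}, log-odds = {log_odds}")
```

Up to floating-point error, `log_odds` equals `a`, which is the identity \(a = \log\left(\frac{p}{1-p}\right)\) stated above. The branch on the sign of `a` is a common design choice that keeps `math.exp` from overflowing for large negative logits.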