13 - Neural Networks for Tabular Data
13.2 - Multilayer perceptrons (MLPs)
13.2.2 - Differentiable MLPs
13.2.3 - Activation functions
13.2.5 - The importance of depth
13.2.6 - The “deep learning revolution”
13.2.7 - Connections with biology
13.3 - Backpropagation
13.3.1 - Forward vs reverse mode differentiation
13.3.2 - Reverse mode differentiation for multilayer perceptrons
13.3.3 - Vector-Jacobian product for common layers
13.3.4 - Computation graphs
13.4 - Training neural networks
13.4.1 - Tuning the learning rate
13.4.2 - Vanishing and exploding gradients
13.4.3 - Non-saturating activation functions
13.4.4 - Residual connections
13.4.5 - Parameter initialization
13.4.6 - Parallel training
13.5 - Regularization
13.5.5 - Bayesian neural networks
13.5.6 - Regularization effects of (stochastic) gradient descent *
13.5.7 - Over-parameterized models
13.6 - Other kinds of feedforward networks *
13.6.1 - Radial basis function networks
13.6.2 - Mixtures of experts