Parameters and Hyperparameters
A network involves two kinds of quantities: model parameters and model hyperparameters. Model parameters are internal to the model, and their values are learned from data during the training stage. Model hyperparameters are external to the model, and their values are normally set before training begins. More recently, trainable hyperparameters have been introduced, such as those of BatchNorm layers [46]. Typical parameters and hyperparameters are summarized by layer in Table 2-1.
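To make the distinction concrete, the following minimal sketch (framework-independent; the function name is illustrative) shows how a Conv layer's hyperparameters, fixed before training, determine how many parameters the layer has, while the parameter values themselves are what training learns:

```python
def conv2d_param_count(in_channels, num_filters, filter_size):
    """Number of learnable parameters (weights + biases) of a square 2-D
    convolutional layer, determined entirely by its hyperparameters."""
    weights = num_filters * in_channels * filter_size * filter_size
    biases = num_filters  # one bias per filter
    return weights + biases

# 64 filters of size 3x3 over a 3-channel (RGB) input:
print(conv2d_param_count(3, 64, 3))  # 1792 values to be learned from data
```

Changing any hyperparameter (say, the number of filters) changes the parameter count, which is why hyperparameters must be fixed before training starts.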
Table 2-1 Parameters and Hyperparameters by Layer

Layer | Parameters | Hyperparameters
Conv | Weights and biases | Number of filters (i.e., the depth of the layer); filter size; filter stride; amount of zero padding
Activation (e.g., ReLU) | None | Choice of the activation function (e.g., ReLU, sigmoid, tanh)
Normalization (e.g., BatchNorm) | None | BatchNorm parameters γ, β [46]
Pooling | None | Pooling size and stride
FC | Weights and biases | Number of neurons
Dropout* | None | Probability of retaining (or dropping) a unit
*Although generally referred to as a "layer" in practice, dropout is an effective regularization technique for combating overfitting.
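The dropout row above can be sketched in a few lines. This is a minimal, illustrative implementation of "inverted" dropout (the variant commonly used in practice); note that the drop probability is a hyperparameter fixed in advance, and the layer has no learnable parameters:

```python
import random

def dropout(activations, p_drop):
    """Inverted dropout: zero each activation with probability p_drop and
    scale the survivors by 1/(1 - p_drop) so the expected activation is
    unchanged. p_drop is a hyperparameter; nothing here is learned."""
    keep = 1.0 - p_drop
    return [a / keep if random.random() < keep else 0.0 for a in activations]

print(dropout([1.0, 2.0, 3.0, 4.0], p_drop=0.5))
```

Because of the 1/(1 - p_drop) rescaling, the layer can simply be skipped at test time.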