Parameters and Hyperparameters

There are two types of parameters: model parameters and model hyperparameters. Model parameters are internal to the model, and their values are learned from the data during training. Model hyperparameters are generally external to the model, and their values are normally set before training begins. More recently, trainable hyperparameters have been introduced, such as those of BatchNorm layers [46]. Typical parameters and hyperparameters by layer are summarized in Table 2-1.
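The distinction can be seen in a toy example (assumed for illustration, not from the text): fitting y = w·x by gradient descent. The learning rate and epoch count are hyperparameters fixed before training; the weight w is a model parameter learned from the data.

```python
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]  # generated by the true weight w = 2

lr = 0.01     # hyperparameter: set prior to training
epochs = 200  # hyperparameter: set prior to training
w = 0.0       # parameter: learned during training

for _ in range(epochs):
    # gradient of the mean squared error with respect to w
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    w -= lr * grad

print(round(w, 3))  # converges toward 2.0
```

Changing lr or epochs changes how (and whether) w converges, but w itself is never set by hand; it is determined by the data.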

Table 2-1 Parameters and Hyperparameters by Layer

| Layer | Parameters | Hyperparameters |
| --- | --- | --- |
| Conv | Weights and biases | Number of filters (i.e., the depth of the layer's output volume); filter size; filter stride; amount of zero padding |
| Activation (e.g., ReLU) | None | Choice of the activation function (e.g., ReLU, sigmoid, tanh) |
| Normalization (e.g., BatchNorm) | None | BatchNorm parameters γ, β [46] |
| Pooling | None | Pooling size and stride |
| FC | Weights and biases | Number of neurons |
| Dropout* | None | Probability of retaining or dropping a unit |

*Although generally referred to as a "layer" in practice, dropout is an effective regularization technique for combating overfitting.
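The table's hyperparameters determine how many trainable parameters each layer holds. The helper functions below are a sketch (assumed names, not from the text) of those counts for the parameterized layers in Table 2-1.

```python
def conv_params(in_channels, num_filters, filter_size):
    # weights: num_filters * (filter_size^2 * in_channels), plus one bias per filter
    return num_filters * (filter_size * filter_size * in_channels) + num_filters

def batchnorm_params(channels):
    # one gamma and one beta per channel [46]
    return 2 * channels

def fc_params(in_features, num_neurons):
    # weight matrix plus one bias per neuron
    return in_features * num_neurons + num_neurons

# Example: a 3x3 conv with 16 filters over an RGB (3-channel) input
print(conv_params(3, 16, 3))   # 16*(3*3*3) + 16 = 448
print(batchnorm_params(16))    # 32
print(fc_params(256, 10))      # 256*10 + 10 = 2570
# Pooling, activation, and dropout layers contribute no trainable parameters.
```

Note how each count is a function of hyperparameters alone: choosing the number of filters, filter size, or neuron count fixes the size of the parameter set before any training occurs.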
