
Hyperparameters

In machine learning, hyperparameters are values set before training to control the learning process, whereas the values of other parameters (such as node weights) are obtained through training. In other words, hyperparameters are configuration choices that determine how a machine learning algorithm learns from data; they are set by the user and are not learned during training. Examples include the learning rate, the number of hidden layers in a neural network, the number of decision trees in a random forest, and the regularization parameter in linear regression.
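
As a minimal sketch of this distinction, the toy gradient-descent regression below fixes the learning rate, the number of epochs, and the regularization strength before training, while the weight vector is learned from the data. All names, values, and the synthetic data are illustrative assumptions, not taken from any particular library or the original text.

import numpy as np

# Hyperparameters: chosen by the user before training begins.
learning_rate = 0.01   # step size for gradient descent
n_epochs = 200         # number of passes over the training data
l2_strength = 0.1      # regularization parameter (ridge penalty)

# Toy training data (purely illustrative).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=100)

# Parameters: learned from the data during training.
w = np.zeros(3)
for _ in range(n_epochs):
    grad = 2 * X.T @ (X @ w - y) / len(y) + 2 * l2_strength * w
    w -= learning_rate * grad

print("learned weights:", w)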

Hyperparameters can be divided into model hyperparameters and algorithm hyperparameters. Model hyperparameters relate to model selection and cannot be inferred from the training set itself, whereas algorithm hyperparameters in principle have no effect on the model's final performance but do affect the speed and quality of learning. Typical model hyperparameters are the topology and size of a neural network; typical algorithm hyperparameters are the learning rate, the batch size, and the mini-batch size.
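
A hypothetical configuration split might group the two kinds as follows; the dictionary keys, values, and helper function are assumptions made up for illustration.

model_hparams = {
    "hidden_layers": [128, 64],  # topology and size of the network
    "activation": "relu",
}
algorithm_hparams = {
    "learning_rate": 1e-3,  # affects speed and quality of learning
    "batch_size": 32,       # examples used per gradient update
}

def layer_shapes(n_inputs, n_outputs, hidden_layers):
    """Model hyperparameters fix the architecture before training starts."""
    sizes = [n_inputs, *hidden_layers, n_outputs]
    return list(zip(sizes[:-1], sizes[1:]))

print(layer_shapes(20, 3, model_hparams["hidden_layers"]))
# [(20, 128), (128, 64), (64, 3)]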

Different model training algorithms require different hyperparameters, and some simple algorithms (such as ordinary least squares regression) require none. Choosing appropriate hyperparameters is crucial because they directly affect the performance and behavior of the machine learning model. Settings that give the model too little capacity or too much regularization lead to underfitting, where the model fails to capture the underlying patterns in the data; settings that give it too much capacity or too little regularization lead to overfitting, where the model memorizes the training data instead of generalizing to unseen data. Hyperparameter tuning is the process of finding the best hyperparameter combination for a given machine learning task. It is usually done with grid search, random search, or more advanced techniques such as Bayesian optimization. By systematically exploring different combinations of hyperparameters, practitioners can identify the configuration that maximizes model performance on a validation set.
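
The snippet below sketches a plain grid search: an illustrative grid of candidate values is evaluated on a held-out validation split, and the combination with the lowest validation error is kept. The data, grid values, and training helper are invented for the example and are not part of the original text.

import itertools
import numpy as np

# Candidate hyperparameter values to explore (illustrative grid).
grid = {
    "learning_rate": [0.001, 0.01, 0.1],
    "l2_strength": [0.0, 0.1, 1.0],
}

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)
X_train, y_train = X[:150], y[:150]  # training split
X_val, y_val = X[150:], y[150:]      # validation split

def train(X, y, learning_rate, l2_strength, n_epochs=200):
    """Gradient-descent ridge regression; returns the learned weights."""
    w = np.zeros(X.shape[1])
    for _ in range(n_epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y) + 2 * l2_strength * w
        w -= learning_rate * grad
    return w

best, best_mse = None, float("inf")
for lr, l2 in itertools.product(grid["learning_rate"], grid["l2_strength"]):
    w = train(X_train, y_train, lr, l2)
    mse = np.mean((X_val @ w - y_val) ** 2)  # score on the validation set
    if mse < best_mse:
        best, best_mse = {"learning_rate": lr, "l2_strength": l2}, mse

print("best hyperparameters:", best, "validation MSE:", best_mse)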
