Hyperparameters are the knobs that programmers tweak in machine learning algorithms. Most machine learning programmers spend a fair amount of time tuning these hyperparameters.
In this tutorial, you will learn about the following hyperparameters: learning rate, epochs, and batch size.
Gradient descent takes small steps to reduce the loss of a model. Gradient descent algorithms multiply the gradient by a scalar known as the learning rate (also sometimes called step size) to determine the next point. For example, if the gradient magnitude is 2.5 and the learning rate is 0.01, then the gradient descent algorithm will pick the next point 0.025 away from the previous point.
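The arithmetic in that example can be checked with a few lines of Python. This is only a minimal sketch: the function name `gradient_step` and the starting point of 1.0 are made up for illustration, and a real model would apply the same update to every weight.

```python
def gradient_step(point, gradient, learning_rate):
    """Return the next point after one gradient descent step."""
    # Move against the gradient, scaled by the learning rate.
    return point - learning_rate * gradient


current_point = 1.0   # hypothetical starting value, not from the tutorial
gradient = 2.5        # gradient magnitude from the example above
learning_rate = 0.01  # learning rate from the example above

next_point = gradient_step(current_point, gradient, learning_rate)
step_size = abs(next_point - current_point)
print(round(step_size, 6))  # prints 0.025
```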
If you pick a learning rate that is too small, learning will take too long:
Conversely, if you specify a learning rate that is too large, the next point will perpetually bounce haphazardly across the bottom of the well like a quantum mechanics experiment gone horribly wrong:
When the learning rate is just right, gradient descent reaches the bottom of the well in relatively few steps:
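The three cases can be reproduced on a toy problem. The sketch below assumes a loss of f(x) = x², whose gradient is 2x, and the specific learning rates (0.001, 1.0, 0.3), starting point, and step count are illustrative choices rather than recommended values. On a real model, the "right" value depends on the data and the loss surface.

```python
def run_gradient_descent(learning_rate, start=5.0, steps=50):
    """Minimize the toy loss f(x) = x**2 and return the final x."""
    x = start
    for _ in range(steps):
        gradient = 2 * x                  # derivative of x**2
        x = x - learning_rate * gradient  # gradient descent update
    return x


# Illustrative learning rates for this toy loss only.
for name, lr in [("too small", 0.001), ("too large", 1.0), ("about right", 0.3)]:
    final_x = run_gradient_descent(lr)
    print(f"{name:12s} lr={lr}: x={final_x:.6f}, loss={final_x ** 2:.6f}")
```

With the small rate, x has barely moved toward 0 after 50 steps. With the large rate, x jumps from one side of the well to the other on every step and the loss never drops. With the middle rate, x ends up very close to the minimum at 0.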
The number of epochs (training cycles over the data) is another hyperparameter that you can tweak while training a model. Play around with the number of epochs to find the point where gradient descent reaches the minimal loss.
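To see how the number of epochs affects the loss, here is a small sketch that fits a single weight w so that w * x approximates y. The dataset, the `train` function, and the learning rate of 0.01 are all hypothetical, and each epoch is assumed to be one gradient descent update computed over the whole toy dataset.

```python
def train(epochs, learning_rate=0.01):
    """Fit y = w * x by gradient descent; return (w, mean squared error)."""
    xs = [1.0, 2.0, 3.0, 4.0]  # toy inputs (made up for illustration)
    ys = [2.0, 4.0, 6.0, 8.0]  # toy targets, generated by y = 2 * x
    w = 0.0
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to w.
        gradient = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= learning_rate * gradient
    loss = sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
    return w, loss


for epochs in (1, 10, 100):
    w, loss = train(epochs)
    print(f"epochs={epochs:3d}  w={w:.4f}  loss={loss:.6f}")
```

On this toy problem the loss keeps shrinking as the epoch count grows, and w approaches 2. Past a certain point, extra epochs buy very little, which is why the number of epochs is worth tuning rather than simply maximizing.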
Batch size is the number of training examples used in each training step, and it is another hyperparameter you can play around with. For example, if you have 1000 training examples, you can pick 100 random samples (batch size = 100) for each step. If every example is used exactly once, one full pass over all 1000 examples (one epoch) then takes 10 steps.
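A rough sketch of how 1000 examples could be split into random batches of 100 is shown below. The helper `iterate_minibatches` is a hypothetical name, not a library function, and the fixed `seed` is only there to make the shuffle repeatable.

```python
import random


def iterate_minibatches(examples, batch_size, seed=0):
    """Shuffle the examples, then yield them in chunks of batch_size."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # random order, reproducible via seed
    for start in range(0, len(shuffled), batch_size):
        yield shuffled[start:start + batch_size]


training_examples = list(range(1000))  # stand-in for 1000 real examples
batches = list(iterate_minibatches(training_examples, batch_size=100))
print(len(batches), len(batches[0]))  # prints 10 100
```

Each batch would feed one gradient descent update, so a smaller batch size means more (and noisier) updates per pass over the data.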