
Model Training and Evaluation

Overfitting and Underfitting

By Pircalabu Stefan · Published about a year ago · 6 min read
Photo by Pietro Jeng on Unsplash

Training your network the right way can make the difference between a high-performing model and a failed one. There are several issues that can arise before and after training that impair a network's overall performance, but by far the most well-known and most prevalent are overfitting and underfitting. In this article, I will describe both of them briefly and then give you some useful ways to fix them or prevent them from happening in your neural network model. Let's get into it!

Overfitting

Overfitting occurs in machine learning when a model is too complex for the underlying data and learns patterns in the training data that do not generalize to new, unseen data. This can result in poor performance on previously unseen data because the model is unable to accurately predict or classify new examples.

Overfitting can also happen if the learning rate of your network is too small and you let the model train for too long. In addition, a learning rate that is too small can cause the network to get stuck in local minima, which will keep it from performing at its best.

There are a variety of ways in which you can prevent overfitting, or fix it if your model is already overfitted, which I will describe below:

Reduce model complexity: Using a model with a complexity suited to the amount of data available is one way to avoid overfitting. If your model is too big for the data you have, simplify it by reducing its depth or width, for example by removing hidden layers or shrinking the number of units per layer.
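
As a rough illustration, here is a minimal Keras sketch of replacing an over-sized network with a simpler one; the input size of 20 features and the layer widths are made-up values for this example:

```python
from tensorflow import keras
from tensorflow.keras import layers

# An over-sized network that can easily memorize a small dataset.
big_model = keras.Sequential([
    keras.Input(shape=(20,)),           # 20 input features (illustrative)
    layers.Dense(512, activation="relu"),
    layers.Dense(512, activation="relu"),
    layers.Dense(512, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])

# A simpler alternative: fewer layers and fewer units per layer.
small_model = keras.Sequential([
    keras.Input(shape=(20,)),
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])
small_model.compile(optimizer="adam", loss="binary_crossentropy",
                    metrics=["accuracy"])
```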

Regularization is a technique for constraining the model so that it does not overfit. L1 and L2 regularization techniques, for example, can be applied to a variety of models.
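
As a sketch, an L2 penalty can be attached to a Keras layer through a kernel regularizer; the penalty strength of 1e-4 and the input size are arbitrary example values:

```python
from tensorflow import keras
from tensorflow.keras import layers, regularizers

model = keras.Sequential([
    keras.Input(shape=(20,)),   # illustrative input size
    # The L2 penalty discourages large weights; regularizers.l1 would
    # instead encourage sparse weights.
    layers.Dense(64, activation="relu",
                 kernel_regularizer=regularizers.l2(1e-4)),
    layers.Dense(1, activation="sigmoid"),
])
```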

Dropout is a regularization technique in which a percentage of the input units are set to zero at random during training. By preventing the network from relying too heavily on any single unit, it forces information to be spread across many units, which helps prevent overfitting.
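
A minimal sketch of inserting dropout between layers in Keras; the rate of 0.5 is a common default, not a recommendation for every model:

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(20,)),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.5),   # randomly zeroes 50% of activations, training only
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(1, activation="sigmoid"),
])
```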

The technique of early stopping involves interrupting the training process when the model no longer improves on a validation dataset. Limiting the number of training iterations can help prevent the model from becoming overfitted.
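
In Keras this is typically done with the EarlyStopping callback. The sketch below assumes a compiled model and training arrays X_train and y_train like those in the earlier snippets, and uses an arbitrary patience of 5 epochs:

```python
from tensorflow import keras

early_stop = keras.callbacks.EarlyStopping(
    monitor="val_loss",          # watch the validation loss
    patience=5,                  # stop after 5 epochs with no improvement
    restore_best_weights=True,   # roll back to the best epoch seen
)

model.fit(
    X_train, y_train,
    validation_split=0.2,        # hold out 20% of the training data for validation
    epochs=200,
    callbacks=[early_stop],
)
```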

Use more training data to reduce overfitting: Increasing the amount of data the model learns from makes it harder for the model to simply memorize the training set and helps it generalize better.

Using a hold-out test set during model training and evaluation is another way to guard against overfitting. This entails reserving a subset of the data as a test set, which is used only to assess the model's performance after training is complete. This helps you verify that the model has not merely memorized the training data and can generalize to new examples.
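
One simple way to carve out a hold-out set is scikit-learn's train_test_split; the toy dataset and the 80/20 split below are just illustrative choices:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Toy data standing in for your real features X and labels y.
X, y = make_classification(n_samples=1_000, n_features=20, random_state=0)

# 20% of the data is held back and only touched once, after training,
# to estimate how well the model generalizes.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
```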

Data augmentation is a technique that generates additional training data by applying random transformations to existing training data. Providing the model with a larger and more diverse training dataset can help it generalize better. In addition, training on adversarial samples can help harden your model against adversarial attacks.
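
For image models, Keras ships preprocessing layers that apply random transformations on the fly during training. A minimal sketch, with an illustrative 224x224 RGB input size:

```python
from tensorflow import keras
from tensorflow.keras import layers

# Random flips, rotations, and zooms are applied only during training,
# so the model effectively sees a slightly different image every epoch.
data_augmentation = keras.Sequential([
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.1),
])

model = keras.Sequential([
    keras.Input(shape=(224, 224, 3)),
    data_augmentation,
    layers.Conv2D(32, 3, activation="relu"),
    layers.GlobalAveragePooling2D(),
    layers.Dense(1, activation="sigmoid"),
])
```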

Underfitting

When a model is too simple to capture the underlying structure of the data, it is said to be underfitting. To put it another way, the model cannot learn the relationships between the input features and the output targets. As a result, the model performs poorly on training data and cannot generalize to new data.

A model may underfit the training data for a variety of reasons. One reason is that the model lacks capacity, which means it lacks enough parameters to adequately capture the data's complexity. Another reason is that the model was not trained for a long enough period of time or with an appropriate optimization algorithm. Finally, the model may be suffering from overregularization, which means that it has been overly constrained to avoid overfitting.

There are several ways to fix underfitting:

Increase the model complexity: Adding more parameters to the model can help it capture the complexity of the data. This can be achieved by increasing the depth or width of the model, for example by adding more hidden layers or more units per layer.

Train the model for longer: Training the model for longer can help it learn the relationships between the input features and the output targets more accurately. This can be achieved by increasing the number of training epochs or iterations, giving the optimizer more opportunities to reduce the training loss.

Use a different optimization algorithm: Different optimization algorithms have different properties and may be better suited to certain types of models. For example, some optimizers converge faster or escape plateaus more easily than others, which matters when the model is struggling to fit the training data.
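
Swapping optimizers in Keras is a one-line change. A sketch, reusing the model variable from the earlier snippets; the learning rates shown are just common defaults:

```python
from tensorflow import keras

# Plain SGD with momentum ...
sgd = keras.optimizers.SGD(learning_rate=0.01, momentum=0.9)

# ... versus Adam, which adapts the step size per parameter and often
# makes faster progress when a model is underfitting.
adam = keras.optimizers.Adam(learning_rate=1e-3)

model.compile(optimizer=adam, loss="binary_crossentropy", metrics=["accuracy"])
```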

Reduce regularization: Regularization is a technique used to prevent overfitting by constraining the model. However, excessive regularization can lead to underfitting. Reducing the amount of regularization applied to the model may help it better capture the complexity of the data.

Try different model architectures: Different model architectures may be more or less effective at capturing the complexity of the data. Experimenting with different architectures can help identify a model that performs better.

Tune the hyperparameters: Hyperparameters are the settings that control the model and its training, such as the learning rate, batch size, and layer sizes; they are chosen by you rather than learned from data. Tuning them can have a significant impact on the model's performance.
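
A small sketch of a hyperparameter grid search with scikit-learn; the toy data, the choice of a random forest, and the grid values are arbitrary examples:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=1_000, n_features=20, random_state=0)  # toy data

# Every combination of these values is trained and scored with 5-fold
# cross-validation; the best combination is reported at the end.
param_grid = {
    "n_estimators": [50, 100, 200],
    "max_depth": [None, 5, 10],
}
search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```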

Use more data: Increasing the amount of training data can help the model better capture the underlying structure of the data. This is especially useful if the current dataset is small or if it is not representative of the problem at hand.

Beyond overfitting and underfitting, a few other issues can undermine model training and evaluation:

Unbalanced datasets: If a dataset is unbalanced, meaning it has significantly more examples of one class than another, this can affect the model's ability to learn and generalize. Resampling the data or weighting the loss by class frequency can help.
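
One common remedy is to weight the loss by class frequency. A sketch using scikit-learn's balanced class weights, on made-up labels where one class heavily outnumbers the other:

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

y = np.array([0] * 950 + [1] * 50)   # toy imbalanced labels

weights = compute_class_weight(class_weight="balanced",
                               classes=np.unique(y), y=y)
class_weight = dict(zip(np.unique(y), weights))
print(class_weight)   # the rare class receives a proportionally larger weight
# Many libraries accept this directly, e.g. model.fit(..., class_weight=class_weight) in Keras.
```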

Evaluation metrics: Choosing the right evaluation metric for a given task is important, as different metrics can produce different results. It is important to carefully consider which metric is most appropriate for the task at hand.

Data leakage: This occurs when information from the test set is leaked into the training process, causing the model to perform artificially well on the test set. This can be caused by improper data preprocessing or by using an evaluation method that is susceptible to data leakage.
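
A frequent source of leakage is fitting preprocessing (such as a scaler) on the full dataset before splitting. A scikit-learn Pipeline keeps the preprocessing fitted on the training split only; the toy data below is illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=1_000, n_features=20, random_state=0)  # toy data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# The scaler is fitted on the training split only; the test split is
# transformed with those statistics, so no test information leaks in.
pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])
pipeline.fit(X_train, y_train)
print(pipeline.score(X_test, y_test))
```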

Poor quality data: If the data used to train and evaluate a model is of poor quality, this can lead to inaccurate and unreliable results. Fix your dataset!

Conclusion

Those are the most important factors that contribute to overfitting or underfitting in your network. Take them into account before each model you train, since doing so can save you a lot of headaches and training costs. Hopefully, you are now better armed to prevent these issues from making their way into your network. But we all know we can never fully get rid of our two friends, overfitting and underfitting.

As always, keep learning and stay safe!


About the Creator

Pircalabu Stefan

I love writing about life and technology. Really passionate about all technological advances and Artificial Intelligence!
