Posted: 2/24/2020
While reading a 2017 paper by Mikhail Belkin and co-authors, I stumbled over sentences like “The ability to achieve near-zero loss is provided by overparametrization. The number of parameters for most deep architectures is very large and often exceeds by far the size of the datasets used for training” (S1; emphasis added).
Overparameterization in deep learning is a much-debated subject in the artificial intelligence literature.
However, in my opinion, it is not the size of the dataset (i.e., the number of items such as images or text samples it contains) that really matters, but the number of relevant features each item may expose to a deep learning model. From this perspective, so-called overparameterization is probably a misspecified issue, or perhaps not really an issue at all, for machine learning.
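To make that comparison concrete, here is a back-of-the-envelope sketch in Python. The figures are only illustrative assumptions on my part (CIFAR-10-scale data and a model with roughly 11 million parameters, in the range of a ResNet-18): measured against the number of items, such a model looks heavily overparameterized, but measured against the raw feature values those items expose, it does not.

# Back-of-the-envelope comparison: parameters vs. items vs. raw feature values.
# All figures are approximate and chosen only for illustration.

n_items = 50_000                 # e.g. CIFAR-10-scale training set
features_per_item = 32 * 32 * 3  # raw pixel values per 32x32 RGB image
n_params = 11_000_000            # roughly a ResNet-18-sized model

total_feature_values = n_items * features_per_item

# Parameters per training item: the usual "overparameterized" framing.
print(f"parameters / items          : {n_params / n_items:10.1f}")
# Parameters per raw feature value: the ratio flips well below 1.
print(f"parameters / feature values : {n_params / total_feature_values:10.3f}")

Whether raw pixel counts are the right proxy for "relevant features" is of course debatable, but the sketch shows how strongly the verdict depends on what you divide the parameter count by.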
Sources (S):