Just finished reading this well-written, well-executed research paper by Facebook researchers. It has a good chance of having some impact!
"... We gradually “modernize” a standard ResNet toward the design of a vision Transformer, and discover several key components that contribute to the performance difference along the way. ...
By itself, this enhanced training recipe increased the performance of the ResNet-50 model from 76.1% [1] to 78.8% (+2.7%), implying that a significant portion of the performance difference between traditional ConvNets and vision Transformers may be due to the training techniques ..."