Friday, May 29, 2020

Facebook AI Research applies Transformer architecture to streamline object detection models

This looks like interesting new computer-vision research out of Facebook. I have not yet reviewed it in depth, but it sounds very promising.

"DETR matches the performance of state-of-the-art methods, such as the well-established and highly optimized Faster R-CNN baseline on the challenging COCO object detection data set, while also greatly simplifying and streamlining the architecture. DETR offers a simpler, more flexible pipeline architecture that requires fewer heuristics. Inference can be boiled down to 50 lines of simple Python code using elementary architectural blocks. Moreover, because Transformers have proven to be a powerful tool for dramatically improving performance of models in other domains, we believe additional performance gains and improved training efficiency will be possible with additional tuning."

Facebook AI Research applies Transformer architecture to streamline object detection models | VentureBeat

"Six members of Facebook AI Research (FAIR) tapped the popular Transformer neural network architecture to create end-to-end object detection AI, an approach they claim streamlines the creation of object detection models and reduces the need for handcrafted components. Named Detection Transformer (DETR), the model can recognize objects in an image in a single pass, all at once."

Here is the corresponding blog post by Facebook:
End-to-end object detection with Transformers

"To help bridge this gap, we are releasing Detection Transformers (DETR), an important new approach to object detection and panoptic segmentation. DETR completely changes the architecture compared with previous object detection systems. It is the first object detection framework to successfully integrate Transformers as a central building block in the detection pipeline."
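The release includes code and pretrained checkpoints. If you just want to try the model rather than build it, the released repo appears to expose pretrained models through torch.hub; a minimal sketch of that route is below (the 'detr_resnet50' entry point and the output keys are my reading of the released repo, so verify against its README before relying on them):

```python
import torch

# Pretrained DETR (ResNet-50 backbone) via torch.hub; 'detr_resnet50'
# is assumed to be an entry point in the repo's hubconf.py.
model = torch.hub.load('facebookresearch/detr', 'detr_resnet50',
                       pretrained=True)
model.eval()

# Dummy input; real images should be resized and normalized with the
# ImageNet mean/std the backbone was trained with.
img = torch.randn(1, 3, 800, 1200)
with torch.no_grad():
    outputs = model(img)

# 'pred_logits': (1, 100, num_classes + 1) class scores per object query
# 'pred_boxes':  (1, 100, 4) normalized (center_x, center_y, w, h) boxes
print(outputs['pred_logits'].shape, outputs['pred_boxes'].shape)
```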

Here is the corresponding preprint paper:
End-to-End Object Detection with Transformers
