Very interesting and impressive research paper from Microsoft!
The sticking point is the hardware and compute required to render the training data:
"We render a single training dataset for both landmark localization and face parsing, comprising 100,000 images at 512×512 resolution. It took 48 hours to render using 150 NVIDIA M60 GPUs. ...
our GPU cluster used approximately 3,000kWh of electricity ...
Assuming $1 per hour for an M60 GPU (average price across cloud providers), it would cost $7,200 to render 100,000 images. Though this seems expensive, real data collection costs can run much higher ..."
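The quoted figures are easy to sanity-check. Below is a minimal sketch that reproduces the $7,200 total and derives a few per-image numbers from it. The function name `render_budget` and the dictionary keys are my own, not from the paper, and the per-image values are simple averages that assume all 150 GPUs were busy for the full 48 hours.

```python
# Figures quoted from the paper.
NUM_IMAGES = 100_000
NUM_GPUS = 150
RENDER_HOURS = 48
ENERGY_KWH = 3_000  # "approximately", per the paper

# The paper's own assumption: average cloud price for one M60 GPU.
PRICE_PER_GPU_HOUR_USD = 1.0


def render_budget(num_images, num_gpus, hours, energy_kwh, price_per_gpu_hour):
    """Return total and per-image averages for a rendering job.

    Hypothetical helper for illustration; assumes every GPU runs
    for the whole duration.
    """
    gpu_hours = num_gpus * hours
    total_cost = gpu_hours * price_per_gpu_hour
    return {
        "gpu_hours": gpu_hours,
        "total_cost_usd": total_cost,
        "cost_per_image_usd": total_cost / num_images,
        "gpu_seconds_per_image": gpu_hours * 3600 / num_images,
        "energy_wh_per_image": energy_kwh * 1000 / num_images,
    }


budget = render_budget(
    NUM_IMAGES, NUM_GPUS, RENDER_HOURS, ENERGY_KWH, PRICE_PER_GPU_HOUR_USD
)

for name, value in budget.items():
    print(f"{name}: {value:,.3f}")
```

So 150 GPUs for 48 hours is 7,200 GPU-hours, which at $1 per GPU-hour gives the quoted $7,200. Averaged out, that is roughly 7 cents, about 260 GPU-seconds, and about 30 Wh of electricity per 512×512 image.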