Tuesday, October 24, 2023

New IBM chip architecture speeds up AI

IBM has come a long way from the mainframe computers of the 1960s and 1970s!

This could represent a major departure from conventional computer architecture, and possibly a breakthrough!

Nvidia gets more competition!

"... Over the last eight years, [IBM] has been working on a new type of digital AI chip for neural inference, which [it] calls NorthPole. It’s an extension of TrueNorth, the last brain-inspired chip that [IBM] worked on prior to 2014. In tests on the popular ResNet-50 image recognition and YOLOv4 object detection models, the new prototype device has demonstrated higher energy efficiency, higher space efficiency, and lower latency than any other chip currently on the market, and is roughly 4,000 times faster than TrueNorth. ...
One of the biggest differences with NorthPole is that all of the memory for the device is on the chip itself, rather than connected separately. ...
But the biggest advantage of NorthPole is also a constraint: it can only easily pull from the memory it has onboard. ... Via an approach called scale-out, NorthPole can actually support larger neural networks by breaking them down into smaller sub-networks that fit within NorthPole’s model memory, and connecting these sub-networks together on multiple NorthPole chips. ... “We can’t run GPT-4 on this, but we could serve many of the models enterprises need,”... “And, of course, NorthPole is only for inferencing.” ..."
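The scale-out idea in the excerpt above can be pictured with a rough sketch. This is not IBM's method or toolchain: the function name `partition_layers`, the layer sizes, and the 100 MB per-chip budget are all made-up assumptions for illustration. It simply groups consecutive layers greedily so that each group fits within one chip's on-chip memory.

```python
from typing import List


def partition_layers(layer_sizes_mb: List[float], chip_memory_mb: float) -> List[List[int]]:
    """Greedily group consecutive layers so each group fits in one chip's memory.

    Hypothetical helper for illustration only; real scale-out tooling would
    also have to account for activations, interconnect, and scheduling.
    """
    groups: List[List[int]] = []
    current: List[int] = []
    used = 0.0
    for index, size in enumerate(layer_sizes_mb):
        if size > chip_memory_mb:
            # A single layer larger than one chip cannot be placed by this sketch.
            raise ValueError(f"layer {index} ({size} MB) exceeds chip memory")
        if current and used + size > chip_memory_mb:
            # The current chip is full: close this sub-network and start a new one.
            groups.append(current)
            current, used = [], 0.0
        current.append(index)
        used += size
    if current:
        groups.append(current)
    return groups


# Made-up layer sizes and a made-up per-chip budget (not NorthPole specifications).
layer_sizes_mb = [40.0, 30.0, 50.0, 20.0, 60.0, 10.0]
chip_memory_mb = 100.0

for chip, layers in enumerate(partition_layers(layer_sizes_mb, chip_memory_mb)):
    print(f"chip {chip}: layers {layers}")
```

Each printed group stands for one sub-network that would live on its own chip, with the chips chained together in order.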

From the editor's note and abstract:
"Editor’s summary
The amount of data humans process and send around the globe on a daily basis is astonishing. However, the energy cost involved is high, and there is a strong need for designing energy-efficient devices. Modha et al. describe a chip with a neural inspired architecture, called NorthPole, that achieves substantially higher performance, energy efficiency, and area efficiency compared with other comparable architectures ... A key feature of this chip is the recognition that for almost all kinds of computing, access to memory plays as important a role as logic processing. Unlike analog in-memory computing, this purely digital system has the option of tailoring the bit precision as needed, which allows for optimization of the power usage. ...
Abstract
Computing, since its inception, has been processor-centric, with memory separated from compute. Inspired by the organic brain and optimized for inorganic silicon, NorthPole is a neural inference architecture that blurs this boundary by eliminating off-chip memory, intertwining compute with memory on-chip, and appearing externally as an active memory chip. NorthPole is a low-precision, massively parallel, densely interconnected, energy-efficient, and spatial computing architecture with a co-optimized, high-utilization programming model. On the ResNet50 benchmark image classification network, relative to a graphics processing unit (GPU) that uses a comparable 12-nanometer technology process, NorthPole achieves a 25 times higher energy metric of frames per second (FPS) per watt, a 5 times higher space metric of FPS per transistor, and a 22 times lower time metric of latency. Similar results are reported for the Yolo-v4 detection network. NorthPole outperforms all prevalent architectures, even those that use more-advanced technology processes."
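To make the three metrics in the abstract concrete, here is a small sketch of how such ratios could be computed. The `Measurement` class, the `compare` function, and every number below are placeholders I made up; they are not the measurements from the paper.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Measurement:
    """One chip's benchmark result (all values here are made-up placeholders)."""
    frames_per_second: float
    power_watts: float
    transistors: float
    latency_seconds: float

    @property
    def fps_per_watt(self) -> float:
        # Energy metric: throughput per unit of power.
        return self.frames_per_second / self.power_watts

    @property
    def fps_per_transistor(self) -> float:
        # Space metric: throughput per transistor.
        return self.frames_per_second / self.transistors


def compare(candidate: Measurement, baseline: Measurement) -> Dict[str, float]:
    """Return how many times better the candidate is on each metric."""
    return {
        "energy": candidate.fps_per_watt / baseline.fps_per_watt,
        "space": candidate.fps_per_transistor / baseline.fps_per_transistor,
        # Lower latency is better, so the ratio is inverted.
        "time": baseline.latency_seconds / candidate.latency_seconds,
    }


# Placeholder numbers only, chosen to keep the arithmetic easy to follow.
baseline = Measurement(frames_per_second=1000.0, power_watts=200.0,
                       transistors=2e10, latency_seconds=0.010)
candidate = Measurement(frames_per_second=2000.0, power_watts=100.0,
                        transistors=2e10, latency_seconds=0.005)

for metric, ratio in compare(candidate, baseline).items():
    print(f"{metric}: {ratio:.1f}x")
```

Read this way, the abstract's "25 times higher energy metric" would correspond to the "energy" ratio, and "22 times lower latency" to the "time" ratio.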

‘Mind-blowing’ IBM chip speeds up AI: IBM’s NorthPole processor sidesteps need to access external memory, boosting computing power and saving energy.

A new chip architecture points to faster, more energy-efficient AI: A new chip prototype from IBM Research’s lab in California, long in the making, has the potential to upend how and where AI is used efficiently.


Credits: Last Week in AI

The NorthPole chip on a PCIe card.

