Machine learning and artificial intelligence needs better computer hardware! I bet, we will see more progress in this respect in 2023!
I also blogged here yesterday about this subject.
"In a recent study, researchers in Hong Kong report a new reconfigurable processor, dubbed ReAAP, that outperforms several computing platforms commonly used to support deep neural networks (DNNs) ...
“Reconfigurable processors combine the advantages of software flexibility and hardware parallelism,” ...
“As an end-to-end system, ReAAP can be deployed to accelerate various deep-learning applications just by customizing a Python script in [the] software for each application,” ..."
“Reconfigurable processors combine the advantages of software flexibility and hardware parallelism,” ...
“As an end-to-end system, ReAAP can be deployed to accelerate various deep-learning applications just by customizing a Python script in [the] software for each application,” ..."
From the abstract:
"Parallelism and data reuse are the most critical issues for the design of hardware acceleration in a deep learning processor. Besides, abundant on-chip memories and precise data management are intrinsic design requirements because most of deep learning algorithms are data-driven and memory-bound. In this paper, we propose a compiler-architecture co-design scheme targeting a reconfigurable and algorithm-oriented array processor, named ReAAP. Given specific deep neural networks, the proposed co-design scheme is effective to perform parallelism and data reuse optimization on compute-intensive layers for guiding reconfigurable computing in hardware. Especially, the systemic optimization is performed in our proposed domain-specific compiler to deal with the intrinsic tensions between parallelism and data locality, for the purpose of automatically mapping diverse layer-level workloads onto our proposed reconfigurable array architecture. In this architecture, abundant on-chip memories are software-controlled and its massive data access is precisely handled by compiler-generated instructions. In our experiments, the ReAAP is implemented on an embedded FPGA platform. Experimental results demonstrate that our proposed co-design scheme is effective to integrate software flexibility with hardware parallelism for accelerating diverse deep learning workloads. As a whole system, ReAAP achieves a consistently high utilization of hardware resource for accelerating all the diverse compute-intensive layers in ResNet, MobileNet, and BERT."
ReAAP: A Reconfigurable and Algorithm-Oriented Array Processor With Compiler-Architecture Co-Design (open access)
Fig. 3. Proposed reconfigurable array architecture.
No comments:
Post a Comment