Open access
Author
Date
2023-02
Type
- Bachelor Thesis
ETH Bibliography
yes
Abstract
The "Adaptive Compute Acceleration Platform" (ACAP) from AMD/Xilinx combines a classical FPGA part with a CGRA part, the so-called AI Engines, opening up new ways to increase the performance of powerful computers.
While ACAP-based devices are now available to end users, it is still unclear what the actual capabilities of these devices are and how to use them most effectively. As the first step of this thesis, we examined the provided API and toolchain. We identified two mechanisms that are crucial for the performance of parallel applications: first, the communication between the AI Engines and the off-chip memory, and second, the direct inter-AI Engine communication. We benchmarked both mechanisms on the VCK190 Evaluation Kit from AMD/Xilinx. One part of the benchmarks focussed on the throughput of the communication between the AI Engines and the off-chip memory; the other part measured the throughput of the inter-AI Engine communication.
The results show that the behaviour of the benchmarked communication mostly matches the manufacturer's specifications. Nevertheless, some behaviour patterns remain unexplained and need further investigation.
As an outlook, it would be interesting to benchmark the latencies of these communication methods in addition to their throughputs. Furthermore, implementing additional communication APIs could improve the programmability of the AI Engines.
Permanent link
https://doi.org/10.3929/ethz-b-000600880
Publication status
published
Publisher
ETH Zurich
Organisational unit
03950 - Hoefler, Torsten / Hoefler, Torsten
Related publications and datasets
Is cited by: https://doi.org/10.3929/ethz-b-000635928