TY - JOUR
T1 - An OpenGL Compliant Hardware Implementation of a Graphic Processing Unit Using Field Programmable Gate Array–System on Chip Technology
AU - Beasley, Alexander E.
AU - Clarke, Christopher T.
AU - Watson, Robert J.
PY - 2020/9/1
Y1 - 2020/9/1
N2 - FPGA-SoC technology provides a heterogeneous platform for advanced, high-performance systems. The System on Chip (SoC) architecture combines traditional single and multiple core processor topologies with flexible FPGA fabric. Dynamic reconfiguration allows the hardware accelerators to be changed at run-time. This article presents a novel OpenGL compliant GPU design implemented on an FPGA. The design uses an FPGA-SoC environment allowing the embedded processor to offload graphics operation onto a more suitable architecture. To the authors’ knowledge, this is a first. The graphics processor consists of GLSL compliant shaders, an efficient Barycentric Rasterizer, and a draw mode manager. Performance analysis shows the throughput of the shaders to be hundreds of millions of vertices per second. The design uses both pipelining and resource reuse to optimise throughput and resource use, allowing implementation on a low-cost, FPGA device. Pixel processing rates from this implementation are almost 80% higher than other FPGA implementations. Power consumption compared with comparative embedded devices shows the FPGA consuming as little as 2% of the power of a Mali device, and an up to 11.9-fold increase in efficiency compared to an Nvidia RTX 2060 - Turing architecture device.
AB - FPGA-SoC technology provides a heterogeneous platform for advanced, high-performance systems. The System on Chip (SoC) architecture combines traditional single and multiple core processor topologies with flexible FPGA fabric. Dynamic reconfiguration allows the hardware accelerators to be changed at run-time. This article presents a novel OpenGL compliant GPU design implemented on an FPGA. The design uses an FPGA-SoC environment allowing the embedded processor to offload graphics operation onto a more suitable architecture. To the authors’ knowledge, this is a first. The graphics processor consists of GLSL compliant shaders, an efficient Barycentric Rasterizer, and a draw mode manager. Performance analysis shows the throughput of the shaders to be hundreds of millions of vertices per second. The design uses both pipelining and resource reuse to optimise throughput and resource use, allowing implementation on a low-cost, FPGA device. Pixel processing rates from this implementation are almost 80% higher than other FPGA implementations. Power consumption compared with comparative embedded devices shows the FPGA consuming as little as 2% of the power of a Mali device, and an up to 11.9-fold increase in efficiency compared to an Nvidia RTX 2060 - Turing architecture device.
U2 - 10.1145/3410357
DO - 10.1145/3410357
M3 - Article
SN - 1936-7406
VL - 14
JO - ACM Transactions on Reconfigurable Technology and Systems
JF - ACM Transactions on Reconfigurable Technology and Systems
IS - 1
M1 - 2
ER -