Architecture-aware technique for mapping area-time efficient custom instructions onto FPGAs

S K Lam, T Srikanthan, Christopher T Clarke

Research output: Contribution to journal (article)

11 Citations (Scopus)

Abstract

Area-time efficient custom instructions are desirable for maximizing the performance of reconfigurable processors. Existing data path merging techniques based on resource sharing can be deployed to improve the area efficiency of custom instructions. However, these techniques lead to a large increase in the critical path delay. In this paper, we propose a novel strategy that takes into account the architectural constraints of the FPGA device in order to realize custom instructions with a low area-delay product. The proposed strategy is based on partitioning the custom instruction data paths into a set of basic clusters such that they can be combined using a heuristic-based cluster merging process to maximize the utilization of FPGA logic blocks. Unlike the resource sharing method, the proposed cluster merging process does not maximize sharing of common resources, and this leads to less reliance on multiplexers for implementing custom instructions. Resource sharing is applied only sparingly at the final stage to increase utilization of logic blocks. We show that the proposed technique leads to more than 34 percent, 34 percent, and 42 percent average reduction in area costs for Spartan-3, Virtex-4, and Virtex-5 architectures, respectively, when compared to optimizations achieved through a commercial synthesis tool. We have also shown that the proposed technique leads to more than 18 percent, 17 percent, and 13 percent average reduction in area costs for Spartan-3, Virtex-4, and Virtex-5, respectively, when compared to results obtained using one of the most efficient resource sharing-based methods reported in the literature. In addition, the proposed technique outperforms the resource sharing-based method in terms of area-delay product, with average reductions of more than 27 percent, 34 percent, and 19 percent for Spartan-3, Virtex-4, and Virtex-5, respectively.
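
The abstract reports its results as percentage reductions in area cost and in area-delay product. As a minimal sketch of that arithmetic, and not the authors' evaluation code, the Python below computes an area-delay product and the percentage reduction relative to a baseline. The function names, the choice of LUTs and nanoseconds as units, and all input figures are assumptions made for illustration; none of the numbers are taken from the paper.

```python
def area_delay_product(area_luts: float, delay_ns: float) -> float:
    """Return the area-delay product (assumed units: LUTs x nanoseconds)."""
    return area_luts * delay_ns


def percent_reduction(baseline: float, proposed: float) -> float:
    """Return how much smaller `proposed` is than `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - proposed) / baseline


# Hypothetical figures for illustration only; they are NOT results from the paper.
baseline_adp = area_delay_product(area_luts=1000, delay_ns=10.0)
proposed_adp = area_delay_product(area_luts=800, delay_ns=9.0)

reduction = percent_reduction(baseline_adp, proposed_adp)
print(f"Area-delay product reduction: {reduction:.1f} percent")
```

Because the product multiplies the two costs, a design that is 20 percent smaller and 10 percent faster than the baseline has an area-delay product that is 28 percent lower (1 - 0.8 x 0.9 = 0.28), which is why this metric rewards area savings that do not lengthen the critical path.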
Original language: English
Pages (from-to): 680-692
Number of pages: 13
Journal: IEEE Transactions on Computers
Volume: 60
Issue number: 5
Early online date: 17 Dec 2010
DOI: 10.1109/tc.2010.237
Publication status: Published - May 2011

Fingerprint

Merging
Percent
Field Programmable Gate Array
Field programmable gate arrays (FPGA)
Resource Sharing
Costs
Maximise
Logic
Architecture
Path
Critical Path
Partitioning
Sharing
Heuristics
Synthesis
Resources
Optimization

Keywords

  • data-path design
  • reconfigurable hardware
  • automatic synthesis
  • real-time and embedded systems

Cite this

Architecture-aware technique for mapping area-time efficient custom instructions onto FPGAs. / Lam, S K; Srikanthan, T; Clarke, Christopher T.

In: IEEE Transactions on Computers, Vol. 60, No. 5, 05.2011, p. 680-692.

Lam, S K ; Srikanthan, T ; Clarke, Christopher T. / Architecture-aware technique for mapping area-time efficient custom instructions onto FPGAs. In: IEEE Transactions on Computers. 2011 ; Vol. 60, No. 5. pp. 680-692.
@article{e717cde5119147849d22f3b2985b0e83,
title = "Architecture-aware technique for mapping area-time efficient custom instructions onto FPGAs",
abstract = "Area-time efficient custom instructions are desirable for maximizing the performance of reconfigurable processors. Existing data path merging techniques based on resource sharing can be deployed to improve the area efficiency of custom instructions. However, these techniques lead to a large increase in the critical path delay. In this paper, we propose a novel strategy that takes into account the architectural constraints of the FPGA device in order to realize custom instructions with a low area-delay product. The proposed strategy is based on partitioning the custom instruction data paths into a set of basic clusters such that they can be combined using a heuristic-based cluster merging process to maximize the utilization of FPGA logic blocks. Unlike the resource sharing method, the proposed cluster merging process does not maximize sharing of common resources, and this leads to less reliance on multiplexers for implementing custom instructions. Resource sharing is applied only sparingly at the final stage to increase utilization of logic blocks. We show that the proposed technique leads to more than 34 percent, 34 percent, and 42 percent average reduction in area costs for Spartan-3, Virtex-4, and Virtex-5 architectures, respectively, when compared to optimizations achieved through a commercial synthesis tool. We have also shown that the proposed technique leads to more than 18 percent, 17 percent, and 13 percent average reduction in area costs for Spartan-3, Virtex-4, and Virtex-5, respectively, when compared to results obtained using one of the most efficient resource sharing-based methods reported in the literature. In addition, the proposed technique outperforms the resource sharing-based method in terms of area-delay product, with average reductions of more than 27 percent, 34 percent, and 19 percent for Spartan-3, Virtex-4, and Virtex-5, respectively.",
keywords = "data-path design, reconfigurable hardware, automatic synthesis, real-time and embedded systems",
author = "Lam, {S K} and Srikanthan, T and Clarke, {Christopher T}",
year = "2011",
month = "5",
doi = "10.1109/tc.2010.237",
language = "English",
volume = "60",
pages = "680--692",
journal = "IEEE Transactions on Computers",
issn = "0018-9340",
publisher = "IEEE",
number = "5",

}

TY - JOUR

T1 - Architecture-aware technique for mapping area-time efficient custom instructions onto FPGAs

AU - Lam, S K

AU - Srikanthan, T

AU - Clarke, Christopher T

PY - 2011/5

Y1 - 2011/5

N2 - Area-time efficient custom instructions are desirable for maximizing the performance of reconfigurable processors. Existing data path merging techniques based on resource sharing can be deployed to improve the area efficiency of custom instructions. However, these techniques lead to a large increase in the critical path delay. In this paper, we propose a novel strategy that takes into account the architectural constraints of the FPGA device in order to realize custom instructions with a low area-delay product. The proposed strategy is based on partitioning the custom instruction data paths into a set of basic clusters such that they can be combined using a heuristic-based cluster merging process to maximize the utilization of FPGA logic blocks. Unlike the resource sharing method, the proposed cluster merging process does not maximize sharing of common resources, and this leads to less reliance on multiplexers for implementing custom instructions. Resource sharing is applied only sparingly at the final stage to increase utilization of logic blocks. We show that the proposed technique leads to more than 34 percent, 34 percent, and 42 percent average reduction in area costs for Spartan-3, Virtex-4, and Virtex-5 architectures, respectively, when compared to optimizations achieved through a commercial synthesis tool. We have also shown that the proposed technique leads to more than 18 percent, 17 percent, and 13 percent average reduction in area costs for Spartan-3, Virtex-4, and Virtex-5, respectively, when compared to results obtained using one of the most efficient resource sharing-based methods reported in the literature. In addition, the proposed technique outperforms the resource sharing-based method in terms of area-delay product, with average reductions of more than 27 percent, 34 percent, and 19 percent for Spartan-3, Virtex-4, and Virtex-5, respectively.

KW - data-path design

KW - reconfigurable hardware

KW - automatic synthesis

KW - real-time and embedded systems

UR - http://www.scopus.com/inward/record.url?scp=79953220481&partnerID=8YFLogxK

UR - http://dx.doi.org/10.1109/tc.2010.237

U2 - 10.1109/tc.2010.237

DO - 10.1109/tc.2010.237

M3 - Article

VL - 60

SP - 680

EP - 692

JO - IEEE Transactions on Computers

JF - IEEE Transactions on Computers

SN - 0018-9340

IS - 5

ER -