Area-time efficient custom instructions are desirable for maximizing the performance of reconfigurable processors. Existing data path merging techniques based on resource sharing can be deployed to improve the area efficiency of custom instructions. However, these techniques lead to a large increase in the critical path delay. In this paper, we propose a novel strategy that takes into account the architectural constraints of the FPGA device in order to realize custom instructions with a low area-delay product. The proposed strategy partitions the custom instruction data paths into a set of basic clusters, which are then combined using a heuristic-based cluster merging process to maximize the utilization of FPGA logic blocks. Unlike the resource sharing method, the proposed cluster merging process does not maximize the sharing of common resources, and this leads to less reliance on multiplexers for implementing custom instructions. Resource sharing is applied only sparingly, at the final stage, to increase the utilization of logic blocks. We show that the proposed technique leads to average reductions in area cost of more than 34 percent, 34 percent, and 42 percent for the Spartan-3, Virtex-4, and Virtex-5 architectures, respectively, when compared to the optimizations achieved by a commercial synthesis tool. We also show that the proposed technique leads to average reductions in area cost of more than 18 percent, 17 percent, and 13 percent for Spartan-3, Virtex-4, and Virtex-5, respectively, when compared to the results obtained using one of the most efficient resource sharing-based methods reported in the literature. In addition, the proposed technique outperforms the resource sharing-based method in terms of area-delay product, with average reductions of more than 27 percent, 34 percent, and 19 percent for Spartan-3, Virtex-4, and Virtex-5, respectively.
- data-path design
- reconfigurable hardware
- automatic synthesis
- real-time and embedded systems
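
The cluster merging idea summarized in the abstract can be illustrated with a minimal sketch. This is not the paper's heuristic: it assumes each basic cluster is represented only by its set of input signals, that any two clusters passed in are adjacent in the data path (so merging them is legal), and that a merged cluster fits one logic block whenever its distinct inputs do not exceed the LUT input count `k` (4 for Spartan-3 and Virtex-4, 6 for Virtex-5). The function name `merge_clusters` and the data representation are hypothetical.

```python
from itertools import combinations


def merge_clusters(clusters, k):
    """Greedily merge clusters so each merged cluster fits a k-input LUT.

    clusters: iterable of sets of input signal names (hypothetical model).
    k: number of LUT inputs of the target device (assumed constraint).
    Connectivity and output count are ignored in this sketch.
    """
    merged = [frozenset(c) for c in clusters]
    while True:
        best = None
        # Look for the legal pair whose union uses the most LUT inputs,
        # i.e. the merge that leaves the fewest inputs unused.
        for a, b in combinations(range(len(merged)), 2):
            union = merged[a] | merged[b]
            if len(union) <= k and (best is None or len(union) > len(best[2])):
                best = (a, b, union)
        if best is None:
            return merged  # no further merge fits in one LUT
        a, b, union = best
        merged = [c for i, c in enumerate(merged) if i not in (a, b)]
        merged.append(union)


if __name__ == "__main__":
    example = [{"a", "b"}, {"b", "c"}, {"d", "e", "f"}, {"g"}]
    for cluster in merge_clusters(example, k=4):
        print(sorted(cluster))
```

Because the merge criterion is logic block utilization rather than the amount of shared hardware, a sketch like this introduces no multiplexers, which reflects the contrast with resource sharing drawn in the abstract.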