Behavioral synthesis of synchronous systems is a well-established and well-researched area. The transformation of a behavioral description into a datapath and control graph, and hence to a structural realization, usually requires three fundamental steps: 1) scheduling (the mapping of behavioral operations onto time slots); 2) allocation (the mapping of the behavioral operations onto abstract functional units); and 3) binding (the mapping of the functional units onto physical cells). Optimization is usually achieved by intelligent manipulation of these three steps. Key to the operation of such a system is the (automatically generated) control graph, which is effectively a complex sequence generator controlling the passage of data through the system in time to some synchronizing clock. The maximum clock speed is dictated by the slowest time slot, that is, the time slot containing the longest combinational logic delay. Time slots containing quicker (less) logic effectively waste time: the output of the combinational logic in the state will have settled long before the registers reading the data are enabled. If we allow the state to change as soon as the data is ready, by introducing the concepts of "ready" and "acknowledge," the control graph becomes a disjoint set of single-state machines. It effectively disappears, with the consequence that the transitions between time slots become self-controlling. Having removed the necessity for the time slots to be of equal duration, the system becomes self-timed: asynchronous. This paper describes a behavioral asynchronous synthesis system based on this concept that takes as input an algorithmic description of a design and produces an asynchronous structural implementation. Several example systems are synthesized both synchronously and asynchronously (with no modification to the high-level description). In keeping with the well-established observation that asynchronous systems operate at average-case rather than worst-case time complexity, the asynchronous structures usually operate some 30% faster than their synchronous counterparts, although interesting counterexamples are observed.
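The timing argument in the abstract can be illustrated with a minimal sketch. This is a toy model, not the synthesis system the paper describes: the function names, the slot delays, and the per-slot `handshake_overhead` parameter are all assumptions made for illustration. A synchronous schedule is charged one clock period per time slot, with the period set by the slowest slot; a self-timed schedule is charged each slot's own delay plus an assumed ready/acknowledge cost.

```python
def synchronous_time(slot_delays):
    """Total time when every slot lasts one clock period.

    The clock period is set by the slowest time slot, so faster
    slots are padded out to the same duration.
    """
    return len(slot_delays) * max(slot_delays)


def asynchronous_time(slot_delays, handshake_overhead=0.0):
    """Total time when each slot ends as soon as its logic settles.

    handshake_overhead is a hypothetical per-slot cost for the
    ready/acknowledge exchange; it is an assumption of this sketch.
    """
    return sum(delay + handshake_overhead for delay in slot_delays)


# Hypothetical combinational delays (ns) for five time slots; not from the paper.
delays = [4.0, 10.0, 6.0, 3.0, 7.0]
sync_ns = synchronous_time(delays)          # 5 slots * 10.0 ns = 50.0
async_ns = asynchronous_time(delays, 1.0)   # 30.0 ns of logic + 5 * 1.0 ns = 35.0

# With uniform delays the handshake cost is pure overhead in this model.
uniform = [10.0, 10.0, 10.0]
uniform_sync_ns = synchronous_time(uniform)          # 30.0
uniform_async_ns = asynchronous_time(uniform, 1.0)   # 33.0
```

In this toy model the self-timed schedule wins when slot delays vary widely and loses when they are nearly uniform, since the handshake cost is then not offset by any saved padding. That is one way a counterexample could arise; whether it explains the counterexamples reported in the paper is not something the abstract states.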
|Number of pages||17|
|Journal||IEEE Transactions on Very Large Scale Integration (VLSI) Systems|
|Publication status||Published - 1 Sept 2004|