# Low Power Bus Architectures in Nano-Technology, Study and Analysis

Ghashmi Bin Talib<sup>1</sup> <u>ghbintalib06@cit.just.edu.jo</u> Abdoul Rjoub<sup>1</sup> <u>abdoul@just.edu.jo</u> Odysseas Koufopavlou<sup>2</sup> Odysseas@ece.upatras.gr

<sup>1</sup>Jordan University of Science and Technology Faculty of Computer and Information Technology Department of Computer Engineering Irbid 22110, P. O. Box 3030, Jordan <sup>2</sup>Patras University Department of Electrical & Computer Engineering VLSI Design Laboratory Rion 26610, Greece

**Abstract:** The total power consumed by a computer system depends strongly on the efficiency of its bus architecture, so designers have sought low-power bus architectures. This paper describes and compares the features of several low-power bus techniques. It focuses on the current state of the art, comparing the solutions by the amount of power they dissipate, and also reports the propagation delay of each architecture. All bus circuits were simulated using three different Predictive Technology Models (PTM) from Berkeley.

**Keywords:** Bus Architecture, Low Swing Voltage, Leakage Current, Nano-Technology, Power Dissipation.

## I. INTRODUCTION

THE revolution in portable devices and the growing complexity of VLSI circuits make power reduction one of the most important issues in computer system design. Different approaches and techniques have been proposed for power reduction, at either the circuit level or the architecture level. Buses and long interconnect lines consume more than 45% of the total power dissipated in a system [1]. For portable devices and handheld equipment it is therefore important to reduce the power dissipation of every component of the design in order to extend battery life. This problem will grow exponentially in the next decade, when the number of transistors on a single chip will exceed hundreds of millions in nano-scale technologies [2]. In that case, chip size and bus width will increase and the total bus wiring capacitance will become considerably larger, so the power dissipated in the bus architecture will be the most significant portion of the chip's power [3]. Various approaches and techniques to reduce power dissipation in bus designs have been proposed. Some of the most efficient are the low-swing voltage technique, multiple supply voltages, multiple threshold voltages, and bus coding [2-4].

In this paper, the low-swing voltage technique is adopted because of its efficiency compared with other techniques serving the same purpose under nano-scale SPICE parameters.
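As a first-order illustration of why reducing the swing saves energy, the standard switching-energy expressions for a capacitive line can be compared. The symbols $C_{bus}$ (total line capacitance), $V_{dd}$ (supply voltage) and $V_{swing}$ (reduced line swing) are notation assumed here rather than taken from the cited works, and leakage and short-circuit currents are ignored:

$$E_{full} = C_{bus} V_{dd}^{2}, \qquad E_{low} = C_{bus} V_{dd} V_{swing}, \qquad \frac{E_{low}}{E_{full}} = \frac{V_{swing}}{V_{dd}}$$

Here $E_{low}$ assumes that the line is still charged from the full $V_{dd}$ rail. If the driver instead draws its charge from a dedicated low supply equal to $V_{swing}$, the energy per low-to-high transition becomes $C_{bus} V_{swing}^{2}$ and the saving is quadratic in the swing.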

## II. BUS ARCHITECTURE

As mentioned above, buses can account for more than 45% of the total power dissipated in a VLSI chip [4]. It is therefore important to select efficient, low-power bus drivers and receivers in order to keep the power dissipation as low as possible, which requires a comprehensive knowledge of the available bus drivers and receivers.

The importance of bus design will grow as nano-technology SPICE parameters come into frequent use in the next decade: leakage current in particular, and static power in general, will increase, and any proposed design should take this into account. In the following sections, five bus designs are compared under the same conditions of SPICE parameters, simulation time, and load capacitance.

#### AN EFFICIENT LOW-POWER BUS ARCHITECTURE (BUS1)

In 1997, A. Rjoub et al. proposed reduced-voltage-swing bus driver and receiver circuits in a paper entitled "An Efficient Low-Power Bus Architecture" [1]. They inserted an nMOS transistor between the pMOS and nMOS transistors of a simple inverter to reduce the output voltage swing of the driver (Fig. 1). The receiver circuit is built around a voltage-sense transistor (Fig. 2). Bus1 uses repeaters to reduce the propagation delay caused by long interconnect lines; each repeater is a combination of the driver and receiver circuits (Fig. 3).

#### BUS ARCHITECTURE FOR LOW-POWER VLSI DIGITAL CIRCUITS (BUS2)

In 1996, G. Cardarilli et al. proposed "Bus Architecture for Low-Power VLSI Digital Circuits" [2], which also reduces the dissipated power by reducing the voltage swing. The driver circuit is shown in Fig. 4. The receiver circuit, shown in Fig. 5, is based on a sense-amplifying flip-flop [8], which produces a full-swing signal from a differential input.

#### EFFICIENT CMOS DRIVER-RECEIVER PAIR WITH LOW-SWING SIGNALING FOR ON-CHIP INTERCONNECTS (BUS3)

In 2007, S. Nooshabadi et al. proposed a new bus driver and receiver pair called mj-sib in "Efficient CMOS Driver-Receiver Pair with Low-Swing Signaling for On-Chip Interconnects" [3], and compared it with two earlier designs, ddc-db [5] and asf-lc [6], [7]. Fig. 6 shows the driver-receiver scheme as given in [3].

#### LOW SWING SIGNALING USING A DYNAMIC DIODE-CONNECTED DRIVER (BUS4)

The ddc-db scheme was proposed in September 2001 by M. Ferretti et al. [5]. The authors introduced a new driver circuit and used a simple inverter as the receiver. We used the same driver-receiver scheme as in [3] (Fig. 7).

#### LOW-SWING ON-CHIP SIGNALING TECHNIQUES: EFFECTIVENESS AND ROBUSTNESS (BUS5)

The asf-lc scheme draws on [6] and [7]; the authors of [6] reviewed a number of low-swing interconnect schemes. In this paper we use the same scheme as in [3], which combines the high-performance source-follower driver from [6] at the transmitter end with the matching level-restorer circuit from [7] at the receiver end (Fig. 8).



Fig. 1: Schematic of the driver circuit [1].



Fig. 4: Schematic of the driver circuit [2].



Fig. 5: Schematic of the receiver circuit [2].

## III. TEST ARCHITECTURE

The test platform used in [3], [5] and [6] was applied in this paper, as shown in Fig. 9. All bus schemes were examined with three different sets of SPICE parameters. At the 130nm node, Vdd = 1.3V was applied to Bus1 and Bus2, while Vdd = 1.0V, Vddh = 1.2V and Vddl = 0.85V were applied to Bus3, Bus4 and Bus5, as in [3]. At the 45nm node, Vdd = 1.1V, Vddh = 1.28V and Vddl = 0.92V were applied to all buses. At the 22nm node, Vdd = 0.8V, Vddh = 1.0V and Vddl = 0.65V were applied to all buses.

The tests were divided into two parts. In the first part, all circuits were simulated with a receiver load capacitance ranging from 10fF to 100fF, using a metal-3 interconnect line with a typical length of 1mm, modeled by a $\pi$3 distributed RC model (R<sub>w</sub> = 300 $\Omega$ and C<sub>w</sub> = 0.23pF) with an extra capacitive load C<sub>L</sub> = 1770fF distributed along the wire. In the second part, all circuits were simulated with a receiver load capacitance of 20fF and a wire length varying from 1mm to 10mm, modeled by a $\pi$3 distributed RC model (R<sub>w</sub> = 300 $\Omega$/mm and C<sub>w</sub> = 0.23pF/mm) with an extra capacitive load C<sub>L</sub> = 1.77pF/mm. In the case of Bus1, the wire is divided by two repeaters into three segments, as shown in Fig. 9.
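To make the wire model concrete, the following minimal sketch builds a three-section $\pi$ ladder from the per-millimetre values quoted above and evaluates its Elmore delay. The Elmore delay is only a first-order estimate and is not the SPICE simulation used in this paper. The function names, the ideal zero-resistance driver, and the choice to lump the extra distributed load into the wire capacitance are assumptions made for illustration.

```python
def pi_ladder(length_mm, sections=3, r_per_mm=300.0, c_wire_per_mm=0.23e-12,
              c_extra_per_mm=1.77e-12, c_receiver=20e-15):
    """Build an N-section pi ladder for a wire of the given length.

    Returns (series_res, node_caps): one resistance per section and one
    capacitance per node, where node 0 is the driver end and the last node
    is the receiver end. Assumption: the extra distributed load is lumped
    together with the wire capacitance.
    """
    r_total = r_per_mm * length_mm
    c_total = (c_wire_per_mm + c_extra_per_mm) * length_mm
    r_section = r_total / sections
    c_half = c_total / (2 * sections)  # each pi section puts half its C at each end
    # Interior nodes collect one half-capacitor from each neighbouring section.
    node_caps = [c_half] + [2 * c_half] * (sections - 1) + [c_half + c_receiver]
    series_res = [r_section] * sections
    return series_res, node_caps


def elmore_delay(series_res, node_caps):
    """Elmore delay at the receiver end, assuming an ideal (0 ohm) driver."""
    delay = 0.0
    upstream_r = 0.0
    # Node 0 is driven directly by the ideal source, so it adds no delay.
    for r, c in zip(series_res, node_caps[1:]):
        upstream_r += r
        delay += upstream_r * c
    return delay


if __name__ == "__main__":
    for length in (1, 5, 10):
        res, caps = pi_ladder(length)
        print(f"{length} mm: {elmore_delay(res, caps) * 1e12:.1f} ps")
```

For a uniform ladder this estimate reduces to half the product of total resistance and total capacitance plus the total resistance times the receiver load, which is about 306 ps for the 1mm wire with a 20fF receiver load. A real driver adds its own output resistance, so simulated delays will differ.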



## IV. SIMULATION RESULTS AND COMPARISON

Fig. 10 shows energy dissipation versus $C_L$ for the 130nm technology. Bus1 performs very well and dissipates the least energy: at $C_L = 50$ fF it gives reductions of 73.46%, 68.82%, 95.59%, and 51.53% compared with Bus2 [2], Bus3 [3], Bus4 [5], and Bus5 [6], [7], respectively. Bus5 in turn gives reductions of 59.79%, 52.75%, and 93.32% at $C_L = 50$ fF compared with Bus2, Bus3, and Bus4, respectively.
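The percentages quoted in this section are relative comparisons between two buses at the same load. A minimal sketch of one common definition follows. The paper does not state its formula explicitly, so the definition, the function name `percent_reduction`, and the numeric values in the example are assumptions for illustration only.

```python
def percent_reduction(value, reference):
    """Percentage by which `value` is lower than `reference`.

    Assumed definition: 100 * (reference - value) / reference.
    """
    if reference <= 0:
        raise ValueError("reference must be positive")
    return 100.0 * (reference - value) / reference


if __name__ == "__main__":
    # Hypothetical energies in pJ, not taken from the paper's figures.
    print(percent_reduction(0.5, 2.0))
```

The same helper applies to delay comparisons if the two arguments are propagation delays instead of energies.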

Fig. 11 shows propagation delay versus $C_L$ for the 130nm technology. Bus2 has the weakest performance, whereas Bus4 is the fastest, followed by Bus3. At $C_{LOAD} = 50$ fF, Bus4 performs 48.59%, 92.19%, 21.92%, and 35.29% better than Bus1, Bus2, Bus3, and Bus5, respectively.

Fig. 12 shows energy dissipation versus $C_L$ for the 45nm technology. Bus4 dissipates the most energy, followed by Bus3. Bus5 dissipates by far the least, so much less than the other buses that its curve is not visible in Fig. 12: at $C_L = 50$ fF it dissipates 99.63%, 99.82%, 99.98%, and 99.99% less energy than Bus1, Bus2, Bus3, and Bus4, respectively.

Fig. 13 shows propagation delay versus $C_L$ for the 45nm technology. Bus2 has the worst performance, whereas Bus4 is the best, performing 79.83%, 90.99%, 38.95%, and 44.90% better than Bus1, Bus2, Bus3, and Bus5, respectively, at $C_{LOAD} = 50$ fF.

Fig. 14 shows that Bus1 works very well at 22nm: at $C_{LOAD} = 50$ fF it consumes 45.07%, 97.74%, and 97.57% less energy than Bus2, Bus3, and Bus4, respectively. Bus5 is nevertheless the best, and because of the large spread in energy dissipation among the buses, its curve does not appear in the figure. At $C_L = 50$ fF, Bus5 dissipates 99.856% less energy than Bus1. Bus2 also gives reductions of 95.89% and 95.57% in energy dissipation at $C_{LOAD}$ = 50 fF compared with Bus3 and Bus4, respectively.



Fig. 8: Schematic of the driver-receiver (Bus5), [3], [6], [7]



Fig. 9: Interconnect Scheme (a) Test Architecture and (b) the  $\pi$  Wire Model.

Fig. 15 shows propagation delay versus $C_L$ for the 22nm technology. Bus4 has the best performance: it performs 75.17%, 73.51%, 30.90%, and 54.93% better than Bus1, Bus2, Bus3, and Bus5, respectively. However, as shown in Fig. 14, Bus4 consumes a relatively large amount of energy.

The energy-delay products in Figs. 16, 17, and 18 show that Bus5 and Bus1 are better suited to nano-scale systems than Bus4 and Bus3.
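The energy-delay product (EDP) combines the two metrics by multiplying the energy per operation by the propagation delay, so a lower value is better. The sketch below ranks candidate buses by EDP. The function names and all numeric values are hypothetical placeholders, not measurements from Figs. 16 to 18.

```python
def energy_delay_product(energy_j, delay_s):
    """Energy-delay product in joule-seconds (lower is better)."""
    return energy_j * delay_s


def rank_by_edp(measurements):
    """Sort bus names by ascending EDP.

    `measurements` maps a bus name to an (energy_j, delay_s) tuple.
    """
    return sorted(measurements,
                  key=lambda name: energy_delay_product(*measurements[name]))


if __name__ == "__main__":
    # Hypothetical (energy, delay) pairs chosen only to exercise the code.
    example = {
        "A": (1.0e-12, 200e-12),  # low energy, moderate delay
        "B": (4.0e-12, 100e-12),  # fast but energy hungry
        "C": (0.5e-12, 300e-12),  # slow but very frugal
    }
    print(rank_by_edp(example))
```

In this made-up example the fastest design ranks last because its energy cost outweighs its speed advantage, which is the kind of trade-off the EDP figures are meant to expose.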

Figs. 19, 20, and 21 show the energy-delay product versus wire length for the 130nm, 45nm, and 22nm SPICE parameters, respectively. In Fig. 19 (130nm), Bus5 performs 84.07% better than Bus4 at a wire length of 10mm. Bus3 failed the test for wire lengths exceeding 3mm, Bus1 failed for wire lengths exceeding 6mm, and Bus2 failed for most wire lengths. In Fig. 20 (45nm), Bus1 is the best, performing 56.59% better than Bus4, while the other buses failed the test. In Fig. 21 (22nm), Bus2 and Bus4 work well, and the other buses failed the test. Across Figs. 19, 20, and 21, Bus4 continues to work in every case, which indicates that Bus4 may be able to drive a load with a large fanout.



Fig. 10: Total energy dissipation versus the receiver output load capacitance using 130nm technology.



Fig. 11: Propagation delay time versus the receiver output load capacitance using 130nm technology.

Fig. 12: Total energy dissipation versus the receiver output load capacitance using 45nm technology.







Fig. 13: Propagation delay time versus the receiver output load capacitance using 45nm technology.



Fig. 14: Total energy dissipation versus the receiver output load capacitance using 22nm technology.



Fig. 15: Propagation Delay versus the receiver output load capacitance using 22nm technology.



Fig. 16: Energy Delay Product versus the receiver output load capacitance using 130nm technology.



Fig. 17: Energy Delay Product versus the receiver output load capacitance using 45nm technology.



Fig. 18: Energy Delay Product versus the receiver output load capacitance using 22nm technology.



Fig. 19: Energy Delay Product versus the wire-length using 130nm technology.



Fig. 20: Energy Delay Product versus the wire-length using 45nm technology.



Fig. 21: Energy Delay Product versus the wire-length using 22nm technology.

## V. CONCLUSION

We have reviewed a number of existing low-swing interconnect interface circuits and compared their efficiency and performance. Some of them, such as Bus3, perform well in 130nm technology but not in 45nm and 22nm technology. Others, such as Bus1, remain efficient at the nano-scale. We note that energy dissipation increases as the technology is scaled down, owing to increasing static power dissipation, whereas performance improves, owing to the reduced supply voltage and the shorter distance between source and drain. The results show that Bus1 and Bus2 are well suited to low-power systems and that Bus3 and Bus4 are well suited to high-performance (high-speed) systems, while Bus5 offers a compromise between efficiency and performance and is very suitable for ultra-low-power systems. The energy-delay products in Figs. 16, 17, and 18 show that Bus5 and Bus1 are suitable for nano-scale systems, whereas Bus4 and Bus3 are not. Reducing the voltage swing on interconnects is a powerful tool for minimizing energy dissipation, but it requires further optimization, especially when nano-technology is used.

## REFERENCES

- [1] A. Rjoub, S. Nikolaidis, O. Koufopavlou, and T. Stouraitis, "An Efficient Low-Power Bus Architecture," 1997 IEEE International Symposium on Circuits and Systems, June 9-12, 1997, Hong Kong.
- [2] Gian Carlo Cardarilli, Marcello Salmeri, Adelio Salsano, and Osvaldo Simonelli, "Bus Architecture for Low-Power VLSI Digital Circuits," 1996 IEEE 7803-3073.
- [3] J. C. García, J. A. Montiel-Nelson, and Saeid Nooshabadi, "Efficient CMOS Driver-Receiver Pair with Low-Swing Signaling for On-Chip Interconnects," IEEE 18th European Conference on Circuit Theory and Design, August 2007, Sevilla, Spain.
- [4] Dake Liu and Christer Svensson, "Power Consumption Estimation in CMOS VLSI Chips," IEEE Journal of Solid-State Circuits, vol. 29, no. 6, June 1994.
- [5] M. Ferretti and P. A. Beerel, "Low swing signaling using a dynamic diode-connected driver," Solid-State Circuits Conference, Sep. 2001, Villach, Austria, pp. 369-372.
- [6] H. Zhang, V. George, and J. M. Rabaey, "Low-swing on-chip signaling techniques: effectiveness and robustness," IEEE Tran. on VLSI Syst., vol. 8, no. 3, pp. 264-272, Jun. 2000.
- [7] S. H. Kulkarni and D. Sylvester, "High Performance Level Conversion for Dual VDD Design," IEEE Tran. on VLSI Syst., vol. 12, no. 9, pp. 926-936, Sep. 2004.
- [8] Masataka Matsui, Hiroyuki Hara, Yoshiharu Uetani, Lee-Sup Kim, Tetsu Nagamatsu, Yoshinori Watanabe, Akihiko Chiba, Kouji Matsuda, and Takayasu Sakurai, "A 200 MHz 13 mm² 2-D DCT Macrocell Using Sense-Amplifying Pipeline Flip-Flop Scheme," IEEE Journal of Solid-State Circuits, vol. 29, no. 12, December 1994.