Report Date: May 1, 1996
MOSIS Account:
From: Joseph Cavallaro, cavallar@rice.edu
Designer: Frank Alejano, John Cusey, Chris Pickett
Design Description: 16-bit Encryption Chip
Design Name: Rowboat
MOSIS Fab ID: ISIN5CRFZ2
MOSIS Design ID:
Technology: 2 micron tinychip
Design Size: 2000 x 2000 microns

DESCRIPTION:
16-bit data with 14-bit key encryption chip operating on an algorithm roughly equivalent to a scaled-down version of DES encryption.

BACKGROUND:
The Rowboat Encryption Chip has three basic functions: the input, which takes the data or key input serially by byte and routes it to the correct data or key latches; the output, which takes the encrypted data and outputs it serially by byte; and the encryption unit, which performs the actual encryption.

The input and the output sections operate in essentially the same way. For the input, 16 bits must be entered eight bits at a time, so two clock cycles are all that is required for all of the data or key to be resident in the appropriate latches. Changes in the inputs to the chip have no effect on operation provided that they occur outside of the first two clock cycles of any encryption cycle. In other words, the inputs are sampled during the first two clock cycles and nowhere else; the user need not guarantee anything in particular about the inputs outside of this period to ensure proper chip operation. Nothing can be entered unless the enable pin is high; once all necessary information has been entered, the state of the enable pin is irrelevant, and the user need not hold it either high or low to ensure correct chip operation.

The output is much the same: two clock cycles are required to output the 16 bits of encrypted data eight bits at a time.
There will be data on the output pins at all times, but it is valid for only two clock cycles: the clock cycle in which the output valid pin goes high and the clock cycle following it. Data should be entered least significant byte first; similarly, the data is output least significant byte first.

The encryption section of the chip consists of two subsections: the key section, which stores the key and selects the appropriate bits of it to use in each round of encryption; and the functional section, which takes the data and the key and performs the rounds of encryption using them.

The key resides in a 14-bit latch. Bits 8 and 15 of the input are discarded to fit the 16-bit input into the 14-bit key. All logic and routing necessary to perform the key shifting and selection can be done in one clock cycle.

The functional section performs the rounds of encryption. Each round consists of several discrete steps: the original right half of the data is stored in a temporary latch, and the right half of the data is XORed with the shifted and selected key, the result being latched in another temporary register; the output from this temporary register is latched in a second temporary register; the output from the second temporary register is fed back and XORed with the left half of the data, with the result being latched in the first temporary register; the data from the first temporary register is latched into the second temporary register; and finally the data from the second temporary register is latched into the right half register, while the original right half, by now residing in yet another temporary register, is latched into the left half register.
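
The key packing and the round data flow described above can be sketched in software as follows. This is only an illustrative model under stated assumptions, not the chip's actual logic: it assumes bits are numbered from 0 with bit 0 as the least significant bit, that the "left half" is the high byte of the data word, and it uses a purely hypothetical rotate-and-take-low-8-bits rule for the subkey, since the report does not specify how the key is shifted and selected. All function names are made up for this sketch.

```python
def pack_key(lo_byte, hi_byte):
    """Combine two key bytes (least significant byte first) into a 14-bit key.

    Assumption: bits are numbered 0..15 with bit 0 the LSB, so dropping
    "bits 8 and 15" removes the lowest and highest bit of the high byte.
    """
    word = ((hi_byte & 0xFF) << 8) | (lo_byte & 0xFF)
    low = word & 0xFF           # bits 0..7 are kept as-is
    mid = (word >> 9) & 0x3F    # bits 9..14 are kept; bits 8 and 15 are dropped
    return (mid << 8) | low


def select_subkey(key14, round_index):
    """HYPOTHETICAL subkey selection: the report does not give the real rule.

    Rotates the 14-bit key left by (round_index + 1) and keeps the low 8 bits.
    """
    shift = round_index + 1
    rotated = ((key14 << shift) | (key14 >> (14 - shift))) & 0x3FFF
    return rotated & 0xFF


def encrypt_round(left, right, subkey):
    """One round as described: R is XORed with the subkey, then with L.

    The original right half becomes the new left half.
    """
    saved_right = right          # original right half kept in a temporary
    mixed = right ^ subkey       # right half XOR selected key bits
    new_right = mixed ^ left     # fed back and XORed with the left half
    return saved_right, new_right


def encrypt_block(data16, key14, rounds=4):
    """Run four rounds over a 16-bit word (assumes left half = high byte)."""
    left = (data16 >> 8) & 0xFF
    right = data16 & 0xFF
    for r in range(rounds):
        left, right = encrypt_round(left, right, select_subkey(key14, r))
    return (left << 8) | right
```

Under these assumptions, `pack_key(0xFF, 0xFF)` yields `0x3FFF`, and a key word with only bits 8 and 15 set (`pack_key(0x00, 0x81)`) packs to zero, which shows that those two bits never reach the key latch.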
Each round takes three clock cycles, so all four rounds of encryption take a total of 12 clock cycles.

The normal mode of operation for the chip would be for the user to assert the restart input, assert the enable input, assert the DorK input (which tells the chip that the key is about to be entered), enter the key one byte at a time, deassert the DorK input, enter the data one byte at a time, wait 12 clock cycles, read the encrypted output in the next 2 clock cycles (when the output valid output is asserted), and then enter new data for another run. Therefore, assuming that the key is already valid, the latency for encryption is 16 clock cycles: 2 for data input, 12 for encryption, and 2 for data output.

It should be noted that nothing in the circuitry requires the user to input the key upon restart. In other words, the user could power up the chip and immediately start feeding data into it for encryption. The key with which the data was encrypted would then be unknown, and it would be impossible to decrypt the cipher text back into clear text.

TESTING:
The procedure used for testing the chips was not particularly sophisticated. Since our design was such that no particular data or key should have been any more likely to cause problems than any other, we essentially tested random inputs and keys on the different chips. Two of the chips were completely functional and two were completely nonfunctional. The functional chips correctly encrypted all of the test vectors that they were presented with, and there is no reason to suspect that either one would not have worked correctly for all possible inputs, even though an exhaustive test of all inputs is not feasible and was not performed.

The maximum clock frequency at which the two functional chips worked was 17 MHz. Since a two-phase non-overlapping clock was used, this translates into an effective maximum clock frequency of around 4.25 MHz.
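
As a rough cross-check of the figures quoted so far, the cycle budget and the effective clock rate can be combined into a per-block latency. The sketch below only restates the report's numbers (2 input cycles, 4 rounds of 3 cycles, 2 output cycles, 4.25 MHz effective clock); the constant and function names are illustrative, and the key is assumed to be already loaded.

```python
# Cycle counts taken from the report; names are illustrative only.
INPUT_CYCLES = 2        # two bytes of data, one byte per clock cycle
ROUNDS = 4              # rounds of encryption
CYCLES_PER_ROUND = 3    # clock cycles per round
OUTPUT_CYCLES = 2       # two bytes of result, one byte per clock cycle


def block_latency_cycles():
    """Clock cycles from first data byte in to last result byte out."""
    return INPUT_CYCLES + ROUNDS * CYCLES_PER_ROUND + OUTPUT_CYCLES


def block_latency_us(effective_clock_mhz):
    """Latency in microseconds at the given effective clock rate in MHz."""
    return block_latency_cycles() / effective_clock_mhz


if __name__ == "__main__":
    print(block_latency_cycles())                # 16
    print(round(block_latency_us(4.25), 2))      # 3.76
```

At the quoted effective rate of about 4.25 MHz, this works out to roughly 3.8 microseconds per 16-bit block.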
This clock rate is lower than we had hoped for but still acceptable.

The failure mode of the two non-functional chips was identical: seven of the eight output bits remained stuck at either Vdd or GND regardless of any changes in the input. The reason for this failure is unknown to us. Examination of the dies themselves revealed no obvious fabrication errors, and the fact that the other two chips work as expected indicates that the error is probably not a gross design problem. The most likely scenario that we can propose for the failure is as follows: the size of our structures did not vary with the number of inputs an output was driving (i.e., we did not use larger buffers to drive many inputs than we did to drive only a few). It seems plausible that at various points in the output or other circuits we were attempting to drive too much load with too small a structure, resulting in the stuck-at faults that we observed. That any chips were functional at all is most likely a matter of luck.

CONCLUSION:
Overall, the design goals of the project were generally met.

Submitted by: Frank Alejano, John Cusey, and Chris Pickett.
Further information can be obtained at the URL http://www.owlnet.rice.edu/~pickettc/elec422/index.html