A Scalable and Systolic Architectures of Montgomery Modular Multiplication for Public Key Cryptosystems Based on DSPs
The arithmetic in a finite field constitutes the core of public key cryptography like RSA, ECC or pairing-based cryptography. This paper discusses an efficient hardware implementation of the Coarsely Integrated Operand Scanning (CIOS) method of Montgomery modular multiplication combined with an effe...
Saved in:
| Published in | Journal of hardware and systems security Vol. 1; no. 3; pp. 219 - 236 |
|---|---|
| Main Authors | , , , , , , |
| Format | Journal Article |
| Language | English |
| Published |
Cham
Springer International Publishing
01.09.2017
Springer Nature B.V Springer |
| Subjects | |
| Online Access | Get full text |
| ISSN | 2509-3428 2509-3436 |
| DOI | 10.1007/s41635-017-0018-x |
Cover
| Summary: | The arithmetic in a finite field constitutes the core of public key cryptography like RSA, ECC or pairing-based cryptography. This paper discusses an efficient hardware implementation of the Coarsely Integrated Operand Scanning (CIOS) method of Montgomery modular multiplication combined with an effective systolic architecture designed with a two-dimensional array of processing elements. The systolic architecture increases the speed of calculation by combining the concepts of pipelining and the parallel processing into a single concept. We propose the CIOS method for the Montgomery multiplication using a systolic architecture. As far as we know, this is the first implementation of such design. The proposed architectures are designed for field programmable gate array platforms. They targeted to reduce the number of clock cycles of the modular multiplication. The presented implementation results of the CIOS algorithms focus on different security levels useful in cryptography. This architecture has been designed in order to use the flexible DSP48 on Xilinx Field-Programmable Gate Array’s. Our architecture is scalable and depends only on the number and size of words. For instance, we provide results of implementation for 8-, 16-, 32- and 64-bit-long words in 33, 66, 132 and 264 clock cycles. We highlight the fact that for a given number of word, the number of clock cycles is constant. We propose a general version of our systolic architecture presented in SPACE2016. |
|---|---|
| Bibliography: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 |
| ISSN: | 2509-3428 2509-3436 |
| DOI: | 10.1007/s41635-017-0018-x |