A Scalable and Systolic Architectures of Montgomery Modular Multiplication for Public Key Cryptosystems Based on DSPs

Bibliographic Details
Published in Journal of hardware and systems security Vol. 1; no. 3; pp. 219-236
Main Authors Mrabet, Amine; El-Mrabet, Nadia; Lashermes, Ronan; Rigaud, Jean-Baptiste; Bouallegue, Belgacem; Mesnager, Sihem; Machhout, Mohsen
Format Journal Article
Language English
Published Cham Springer International Publishing 01.09.2017
Springer Nature B.V
Springer
ISSN 2509-3428
2509-3436
DOI 10.1007/s41635-017-0018-x

Summary: The arithmetic in a finite field constitutes the core of public key cryptography such as RSA, ECC or pairing-based cryptography. This paper discusses an efficient hardware implementation of the Coarsely Integrated Operand Scanning (CIOS) method of Montgomery modular multiplication combined with an effective systolic architecture designed with a two-dimensional array of processing elements. The systolic architecture increases the speed of calculation by combining pipelining and parallel processing into a single concept. We propose the CIOS method for the Montgomery multiplication using a systolic architecture. As far as we know, this is the first implementation of such a design. The proposed architectures are designed for field-programmable gate array (FPGA) platforms and aim to reduce the number of clock cycles of the modular multiplication. The presented implementation results of the CIOS algorithms focus on different security levels useful in cryptography. This architecture has been designed to use the flexible DSP48 slices of Xilinx FPGAs. Our architecture is scalable and depends only on the number and size of words. For instance, we provide implementation results for 8-, 16-, 32- and 64-bit-long words in 33, 66, 132 and 264 clock cycles. We highlight the fact that for a given number of words, the number of clock cycles is constant. We propose a general version of our systolic architecture presented in SPACE2016.
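For readers unfamiliar with the CIOS method named in the summary, the following is a minimal software sketch of the textbook word-serial CIOS loop. It is not the paper's systolic hardware architecture and says nothing about its clock-cycle counts. The function names (`to_words`, `from_words`, `cios_montgomery`) and the parameters `s` (number of words) and `w` (bits per word) are assumptions introduced here for illustration only.

```python
def to_words(x, s, w):
    """Split integer x into s little-endian words of w bits each."""
    mask = (1 << w) - 1
    return [(x >> (w * i)) & mask for i in range(s)]


def from_words(words, w):
    """Recombine little-endian w-bit words into one integer."""
    return sum(word << (w * i) for i, word in enumerate(words))


def cios_montgomery(a, b, n, s, w):
    """Return a * b * R^-1 mod n, with R = 2^(s*w).

    Assumes n is odd and a, b < n < R. Requires Python 3.8+ for the
    modular inverse computed by pow(n, -1, modulus).
    """
    base = 1 << w
    mask = base - 1
    n_prime = (-pow(n, -1, base)) % base  # n' = -n^-1 mod 2^w
    A, B, N = to_words(a, s, w), to_words(b, s, w), to_words(n, s, w)
    t = [0] * (s + 2)  # running accumulator, two extra words for carries

    for i in range(s):
        # Multiplication step: t += A * B[i]
        carry = 0
        for j in range(s):
            acc = t[j] + A[j] * B[i] + carry
            t[j] = acc & mask
            carry = acc >> w
        acc = t[s] + carry
        t[s] = acc & mask
        t[s + 1] = acc >> w

        # Reduction step: add m * N so the low word becomes zero,
        # then shift the accumulator right by one word.
        m = (t[0] * n_prime) & mask
        carry = (t[0] + m * N[0]) >> w
        for j in range(1, s):
            acc = t[j] + m * N[j] + carry
            t[j - 1] = acc & mask
            carry = acc >> w
        acc = t[s] + carry
        t[s - 1] = acc & mask
        t[s] = t[s + 1] + (acc >> w)

    result = from_words(t[: s + 1], w)
    # Final conditional subtraction brings the result below n.
    if result >= n:
        result -= n
    return result
```

In this sketch the multiplication and reduction steps are interleaved inside one outer loop over the words of `b`, which is what the "coarsely integrated" part of the name refers to. The work depends only on `s` and `w`, not on the operand values, apart from the final conditional subtraction.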