An Algorithmic and Architectural Study on Montgomery Exponentiation in RNS pdf

**project girl** · 25-08-2017, 09:32 PM

An Algorithmic and Architectural Study on Montgomery Exponentiation in RNS

.pdf

1An Algorithmic and Architectural.pdf (Size: 2.21 MB / Downloads: 35)

Abstract

The modular exponentiation on large numbers is computationally intensive. An effective way for performing this operation
consists in using Montgomery exponentiation in the Residue Number System (RNS). This paper presents an algorithmic and
architectural study of such exponentiation approach. From the algorithmic point of view, new and state-of-the-art opportunities that
come from the reorganization of operations and precomputations are considered. From the architectural perspective, the design
opportunities offered by well-known computer arithmetic techniques are studied, with the aim of developing an efficient arithmetic cell
architecture. Furthermore, since the use of efficient RNS bases with a low Hamming weight are being considered with ever more
interest, four additional cell architectures specifically tailored to these bases are developed and the tradeoff between benefits and
drawbacks is carefully explored. An overall comparison among all the considered algorithmic approaches and cell architectures is
presented, with the aim of providing the reader with an extensive overview of the Montgomery exponentiation opportunities in RNS.

INTRODUCTION

THE hardware implementation of modular exponentiation
for very large integers plays an important role in
various fields, being security among the most notable ones.
In recent years, many research efforts specifically focused
on the hardware implementation of Montgomery exponentiation
(ME) in the residue number system (RNS). In RNS,
very long integer multiplications and additions can be split
in independent short integer operations. However, other
operations such as division and modular reduction may be
difficult to execute [1]. ME, which is based on Montgomery
multiplication (MM) [2], can be performed by avoiding the
standard modular reduction approach.

ALGORITHMIC STUDY

In this section, a suitable reorganization of the operations
and of precomputations capable of reducing the number of
multiplications in RNS ME algorithms is described. The RNS
ME corresponds to a classic square and multiply algorithm,
where RNS MMs are iterated in a loop. The RNSMMfollows
the normal Montgomery multiplication approach, but it is
adapted to RNS. The main difference is the presence of two
BEs, which are required by RNS MM in order to execute the
modular reduction. The BE corresponds to an algorithm that
calculates a value represented on an RNS base on a different
base. Since the RNS ME uses the RNS MM that, in turn,
requires the BE, after a brief background on RNS the three
algorithms will be discussed separately by following a topdown
presentation approach.

RNS Montgomery Multiplication

In the state-of-the-art algorithms [4], [7], RNS MM is
performed on two RNS bases, A and B. Since B is used as
the Montgomery constant, B1
A must be precomputed on the
base A, where B1
A is the multiplicative inverse on A of B.
In Fig. 2, the state-of-the-art and reorganized RNS MM
algorithms are presented (and, as before, [4] and [7] are
treated together).Bis used as theMontgomery constant; thus,
by executing the multiplication of Step 3 in B, the modular
reduction byBis immediate. Since a division byBis required,
which must be executed in a base composed by elements
relatively prime to B, the BE to A of Step 4 is performed. The
multiplication in Step 5 and the addition in Step 6 are only
executed onA, since the result of Step 6 onB wouldbe equal to
0. The multiplication by B1 in Step 7, which corresponds to
the division, is only performed onA, as previously described.
Afinal BE to B is required, in order to reach a valid result that
could be passed in input to other MMs.

BE Approach by Bajard et al.

The MM algorithm proposed by Bajard et al. in [7] (and
applied in the context of RNS ME in [5]) uses two BE
algorithms; Algorithm 9 in Fig. 5 shows the first one, which
does not perform the BE correction in order to save time.
Algorithm 11 in Fig. 6, originally proposed by Shenoy and
Kumaresan [8], calculates the correct result. The error in the
result of the first BE does not affect the final result of the
MM, but larger bases are required, i.e., A;B > Nðk þ 2Þ2.
Algorithm 10 in Fig. 5 corresponds to the first reorganized
BE based on the approach by Bajard et al. (BBE1) and
includes the operations of Algorithm 9 and the operations
of Algorithm 3, Steps 3, 5, 6, and 7.

Number of Modular Multiplications

In [4] and [5], the metric used to evaluate the performance of
the proposed approach was based on the number of modular
multiplications needed (as shown in Table 3). The reorganized
RNS MM algorithms requires 3k modular multiplications
less than the state-of-the-art algorithms with the
same BE technique. The reorganized RNS ME requires
2k additional modular multiplications. Considering an RSA
scenario with an exponent of size g ¼ 1;024 and k ¼ 33, the
maximum number of modular multiplications would be
reduced from 4,938,450 to 4,735,566 (95.8 percent). Considering
a smaller base where g ¼ 640 and k ¼ 21, the maximum
number of modular multiplications would be reduced from
1,319,178 to 1,238,454 (93.9 percent). The percentage reduction
is slight better with a smaller base size, since the relative
weight of 3k (i.e., the reduction) over k2 (i.e., main contribution
to the total cost) is greater.

ARCHITECTURAL STUDY

In [6], Nozaki et al. proposed a cell architecture suitable to
the algorithm presented in [4]. The cell is matched to a single
element per RNS base; hence, a set of k cells is required to
execute the exponentiation. A cell can be matched to more
than one element per base, in order to reduce the area
(though increasing the delay). The cell is composed by a
three-stage pipelined Modular Multiplier and Accumulator
Unit (MMAU), a Cox unit (that evaluates and corrects the
BE error) and some memory elements. The MMAU used by
Nozaki et al. is also suitable for the approach proposed by
Bajard et al.; nevertheless, in this case, an additional
redundant cell matched to the redundant base element is
used, instead of the Cox unit.

Proposed Four-Stage MMAU Architecture

An efficient cell can be obtained by introducing redundancy,
increasing the level of pipeline, and moving the accumulation
from the MAU to the FMRU. By using a carry-save
representation, the final adder of the MAU and of the FMRU
can be removed. Based on theoretical observations, the
SMRU can be split in two units, thus augmenting the degree
of pipelining. The area and the delay of the four-stage
architecture can be reduced by moving the addition
operation in the FMRU, thus limiting the number of input
lines of the MAU and of the FMRU.

CONCLUSION

In this paper, an algorithmic and architectural study on
RNS ME has been presented. The opportunities that come
from the reorganization of the operations and of precomputations
have been investigated. In the architectural
study, five new cell architectures have been presented, by
exploiting well-known design techniques emerging from
computer arithmetic and by considering efficient RNS
bases. All the algorithms and the cell architectures have
been compared. Results have shown that, by rearranging
operations and preprocessing, the delay can be significantly
reduced. Moreover, the analysis carried out on the cells for
efficient bases has shown that their performance are
particularly interesting only when their constraints are
strict, hence limiting their applicability.

**jaseela123** · 08-09-2017, 11:28 AM

Modular exponentiation in large numbers is computationally intensive. One effective way to perform this operation is to use Montgomery's exponentiation in the Waste Number System (RNS). This paper presents an algorithmic and architectural study of this exponentiation approach. From the algorithmic point of view, the new and avant-garde opportunities that come from the reorganization of operations and precomputations are considered. From the architectural point of view, the design opportunities offered by well-known computer arithmetic techniques are studied, with the aim of developing an efficient arithmetic cell architecture. In addition, since the use of efficient RNS bases with a low Hamming weight is being considered with increasing interest, four additional cell architectures adapted specifically to these bases are developed and the balance between benefits and drawbacks carefully studied. A comprehensive comparison is presented between all algorithmic approaches and cell architectures considered, with the aim of providing the reader with a broad view of the Montgomery exponential opportunities in RNS.

Possibly Related Threads…
Thread		Author	Replies	Views	Last Post
	Software Crisis pdf	study tips	1	2,117	21-09-2017, 04:31 PM Last Post: jaseela123
	Design and Analysis Of Algorithms : Seminar Report and PPT	seminar projects maker	1	1,315	21-09-2017, 12:04 PM Last Post: jaseela123
	HOW EMAIL WORKS pdf	project girl	1	3,067	20-09-2017, 11:39 AM Last Post: jaseela123
	Cyber crime detection, investigation and prosecution pdf	seminar projects maker	1	958	20-09-2017, 11:31 AM Last Post: jaseela123
	Review: Context Aware Tools for Smart Home Development pdf	study tips	1	1,227	20-09-2017, 11:22 AM Last Post: jaseela123
	Getting Started with the MAXQ1103 Evaluation Kit and the CrossWorks Compiler pdf	project girl	1	969	15-09-2017, 03:11 PM Last Post: jaseela123
	Wireless Application Protocol (WAP) pdf	project girl	1	1,531	15-09-2017, 02:42 PM Last Post: jaseela123
	MAC Protocol for Reliable Multicast over Multi-Hop Wireless Ad Hoc Networks pdf	study tips	1	1,029	15-09-2017, 12:39 PM Last Post: jaseela123
	Wireless Automotive Communications pdf	seminar projects maker	1	637	14-09-2017, 01:27 PM Last Post: jaseela123
	Enabling Secure and Efficient Ranked Keyword Search over Outsourced Cloud Data pdf	study tips	1	2,018	13-09-2017, 12:59 PM Last Post: jaseela123

Quick Reply
Message Type your reply to this message here. Disable Smilies	You have selected one or more posts to quote. Quote these posts now or deselect them.