06-11-2012, 05:02 PM
Parallel Random Number Generators in Java
1Parallel Random Number.pdf (Size: 1.11 MB / Downloads: 46)
Abstract
Scienti¯c computing has long been pushing the boundaries of computational re-
quirements in computer science. An important aspect of scienti¯c computing is the
generation of large quantities of random numbers, especially in parallel to take advan-
tage of parallel architectures. Many science and engineering programs require random
numbers for applications like Monte Carlo simulation. Such an environment suitable
for parallel computing is Java, though rarely used for scienti¯c applications due to its
perceived slowness when compared to complied languages like C. Through research and
recommendations, Java is slowly being shaped into a viable language for such computa-
tional intense applications. Java has the potential for such large scale applications, since
it is a modern language with a large programmer base and many well received features
such as built-in support for parallelism using threads. With improved performance from
better compilers, Java is becoming more commonly used for scienti¯c computing but
Java still lacks a number of features like optimised scienti¯c software libraries. This
project looks at the e®ectiveness and e±ciency of implementing a parallel random num-
ber library in Java using threads, and explores the options for creating a high-quality
parallel generator. The parallel random number generator library extends the current
java.util.Random to add features, like generator selection, and has been implemented
as a set of high-quality generators that can be used sequentially or in parallel with-
out requiring synchronisation. The implementation is e±cient with a selection of tests
verifying both e±ciency and e®ectiveness. This project has a viable parallel Random
API implementation that can be used in parallel scienti¯c applications e±ciently and
e®ectively, unlike the current standard Java random generators.
Introduction
Science has always been on the forefront of advancing the state of computers, through
both technology and computer applications. Scienti¯c applications, or more speci¯-
cally scienti¯c computing, have widely been accepted to be both computational and
resourcefully intense throughout the history of computing. The Monte Carlo method is
a highly computationally intensive scienti¯c application that requires random numbers
in large quantities for calculations [31]. Random numbers are a large part of scienti¯c
computing with applications not only in Monte Carlo simulations but also modelling
random processes [35]. It has been realised that random number generation is a con-
cept of fundamental importance in many areas of computer science, especially scienti¯c
computing [28].
There are many important aspects to random numbers in the computer environ-
ment. Random numbers are generated using deterministic functions to give desirable
properties to the programmer [29]. However random numbers need to be generated
carefully to ensure de¯ned random properties still hold given the deterministic nature
of the generating functions [2]. Some time ago computational science moved into par-
allel computing to obtain greater program speedup in scienti¯c applications. Random
numbers generated in parallel must be generated with even greater care. This is due to
relationships that can occur between random numbers on di®erent processes that can
destroy the needed random properties.
Background
Random numbers and their use in scienti¯c computations have evolved together with
modern computers. Pioneered in the early days, from their use in the Manhattan project
[31], random number generators have become an important part of scienti¯c comput-
ing, with applications in statistical experiments, simulation of stochastic systems, and
numerical analysis with Monte Carlo methods.
Originally, truly random numbers were produced from extra hardware such as a
geiger counter or noise diodes [29]. These were soon discovered to be impractical and
unreliable mainly due to the successive numbers produced, not being independent and
uniformly distributed or repeatable [13]. This lead to the use of random numbers
generated from deterministic functions to obtain random numbers that, while not truly
random, are independent and uniformly distributed.
The aim of this project is to develop a parallel random number generator library for
Java, following the standard API, that produces non-overlapping sequences of random
numbers for each thread. The sequences should be generated independently by each
thread so there is no synchronisation or shared memory access overhead required. With
the exception of perhaps in the initialisation of the generators on each thread. The
generators implemented also need to be of a standard that could be used in a scienti¯c
computing application. Therefore it is important to understand the nature of random
numbers and their generation.
Random Numbers
Created from deterministic functions that produce periodic sequences of numbers, ran-
dom number generators must be carefully selected and implemented to ensure the ran-
domness of the sequence created. A random number generator is a structure where an
input state is transformed to an output state by a transition function given a probability
distribution [29]. The input state is known as the seed. The goal of random number
generation is to imitate independent uniform random variables [12]. Since the random
numbers are produced in a ¯nite state machine of size 2n (where n is the number of
bits used to represent the generated number)
Parallel Random Numbers
The current generation of supercomputers and high performance computers are pri-
marily parallel architectures that rely on parallel speedup to obtain peak performance.
With these high performance parallel machines being used in scienti¯c computations,
such as Monte Carlo simulations [35], there is a need for parallel implementations of
random number generators that can provide the quality of their sequential counterparts,
in a parallel environment.
Unlike the parallelisation of sequential applications, the main goals of parallelising
random number generators is not speedup. The most important goal of a parallel
random number generator is the creation of independent sequences on separate processes
with little to no communication [35]. It is paramount to develop a parallel generator that
avoids the current behaviour of the standard Java generator, that is synchronous. Using
a synchronous generator in parallel would limit the speedup of parallel applications using
the generator. As stated in Amdahl's Law, a program's speedup is limited to 1
(1¡¯) ,
where ¯ is the sequential proportion of the program [1]. An important goal of a parallel
random number generator is to have a large period length so that all the sequences
executing in parallel do not exhaust the cycle in an unacceptable time frame.
Central Server
Discussed by Srinivasan [35], the central server parallel implementation creates a cen-
tralised random number generator that serves all processes with random numbers. This
implementation is far from ideal as the communication and synchronisation costs far
out weight any bene¯ts from randomness properties achieved by this method. This is
essentially how the standard Java random number generator behaves.
Parameterised Iteration
Proposed by Srinivasan [35], this method of parallelisation implements a parallel gener-
ator by creating new generation functions on each processes created. Here, process i is
given the generator function Ti. This function Ti is the same operation of the genera-
tor on each process, however each process has di®erent parameters associated with the
function. These parameters must be carefully chosen to ensure the randomness prop-
erties hold for each sequence. A viable method for parameterised iteration is discussed
in section 4.2.1.