12-06-2012, 01:39 PM
Lecture Notes on Programs as Data: The JVM
Lecture Notes on.pdf (Size: 125.36 KB / Downloads: 70)
Introduction
A recurring theme in computer science is to view programs as data. For example,
a compiler has to read a program as a string of characters and translate
it into some internal form, a process called parsing. You learn about
parsing in Assignment 3. Another instance are first-class functions, which
you will study in great depth in 15–150, a class dedicated to functional programming.
When you learn about computer systems in 15–213 you will see
how programs are represented as machine code in binary form.
In this lecture we will take a look at a virtual machine. In general, when
a program is read by a compiler, it will be translated to some lower-level
form that can be executed. For C and C0, this is usually machine code. For
example, the cc0 compiler you have been using in this course translates the
input file to a file in the C language, and then a C compiler (gcc) translates
that into code that can be executed directly by the machine. In contrast, Java
implementations typically translate into some intermediate form called byte
code which is saved in a class file. Byte code is then interpreted by a virtual
machine called the JVM (for Java Virtual Machine). So the program that
actually runs on the machine hardware is the JVM which interprets byte
code and performs the requested computations.
A Stack Machine
The JVM is a stack machine, which is why we are introducing it as this point
in the course, just after we have covered stacks. This means that the evaluation
of expressions uses a stack, called the operand stack. It is written from
left to right, with the rightmost element denoting the top of the stack.
We begin with a simple example, evaluating an expression without
variables:
Local Variables
Next, we add the ability to handle function arguments and local variables
to the machine. For that purpose, a function has an array V containing local
variables. We can push the value of a local variable onto the operand
stack with the iload instruction, and we can pop the value from the top of
the stack and store it in a local variable with the istore instruction. Initially,
when a function is called, its arguments x1, . . . , xn are stored as local
variables V [1], . . . , V [n]. The variable 0 is unused in our version of the machine;
for Java it holds the reference to this, the object on which a method
is invoked.
Implementing JVM00
We now turn to implementing a tiny fragment of the JVM, which we call
JVM00. You may consult the complete code in the file jvm00.c0. Many
additional constructs in the JVM are concerned with different forms of data,
such as floating point numbers, or arrays, or objects. Method calls take
some thought and introduce a second form of stack, the call stack into the
machine.
Conditional Branches and Jumps
We introduce a few more byte code instructions so we can represent more
interesting programs. For example, say, we want to compile the following
(incorrect!) integer square root function from the first lecture: