An XML based intermediate language for a compiler infrastructure

Introduction

In this work an XML-based intermediate language for a compiler frontend has been designed and implemented. It has been integrated into a compiler and is based on an existing form of intermediate language called ICode. This has been done to provide a clean separation between two of the compiler's phases and to make the internal structures of ICode more visible for performance measurements and testing.
The practical result of this work is an interface that is able to export the ICode data structure to XML and to reimport these XML files back into ICode.
XML was chosen because of its ability to represent tree-like structures such as ICode. To detect inconsistencies, XML Schema is used to check the structure of exported ICode against a schema.
I will start by explaining some basics that are used in this work. Among other things, I will give a short introduction to XML and XML Schema, intermediate languages, and the structure of ICode.

XML and XML Schema

The intermediate language designed in this work is an instance of the "eXtensible Markup Language" (XML). Since XML is a very popular data format nowadays and its concepts are well known and easily understood, I will not go into great detail about it.
The structure of the XML documents is described in the XML Schema language. I will explain this language and those of its constructs that I need in my work.

XML Schema

To define the structure of the XML documents imported and exported in this work, I will use the XML Schema language, which is itself an instance of XML. With XML Schema we can determine which elements contain which subelements and in what quantities. Attributes for elements can be defined with different types of values and occurrence constraints. XML Schema has several advantages over the older DTDs: it can define types, it offers more expressive cardinality constraints, and it is itself written in XML.
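
As a minimal sketch of what such a schema can look like, the following Python snippet embeds a small, hypothetical schema fragment and reads it with the standard library. The element names Application and Subapplication are borrowed from the ICode description further down; the attribute `name`, the types, and the occurrence bounds are assumptions made for illustration and are not taken from the actual ICode schema. Because a schema is itself XML, an ordinary XML parser can read it. The snippet only inspects the schema; checking a document against it would require a separate validating tool.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema fragment: names, types and bounds are assumptions.
SCHEMA = f"""
<xs:schema xmlns:xs="{XS}">
  <xs:element name="Application">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Subapplication" type="xs:string"
                    minOccurs="1" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:attribute name="name" type="xs:string" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def subelement_cardinalities(schema_text):
    """List (name, minOccurs, maxOccurs) for elements declared in sequences."""
    root = ET.fromstring(schema_text.strip())
    rows = []
    for seq in root.iter(f"{{{XS}}}sequence"):
        for el in seq.findall(f"{{{XS}}}element"):
            # For local element declarations both bounds default to 1.
            rows.append((el.get("name"),
                         el.get("minOccurs", "1"),
                         el.get("maxOccurs", "1")))
    return rows


def required_attributes(schema_text):
    """List the names of attributes declared with use="required"."""
    root = ET.fromstring(schema_text.strip())
    return [a.get("name")
            for a in root.iter(f"{{{XS}}}attribute")
            if a.get("use") == "required"]


if __name__ == "__main__":
    print(subelement_cardinalities(SCHEMA))
    print(required_attributes(SCHEMA))
```

The subelement quantities and the typed, required attribute in this fragment correspond to the cardinality and attribute constraints described above.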

Purpose of intermediate languages

The process of translating a source language into a target language is divided into several phases. There are two main tasks that must be done in sequence: analysis and synthesis. These tasks are carried out in the so-called frontend (analysis phase) and backend (synthesis phase) of a compiler. The output generated by the frontend forms the input of the backend, and many compilers express this output as an intermediate representation. This representation of the source program is called an intermediate language, and it is typically very close to code that could be executed on an abstract register machine. Although importing and exporting an intermediate language slows down the compilation process, it has several important advantages. First of all, the intermediate representation separates the backend from the frontend.
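
The separation can be illustrated with a toy sketch, which is not ICode. Here a hypothetical `frontend` lowers a nested-tuple expression into instructions for an abstract register machine, and a hypothetical `backend` consumes only that instruction list. The instruction format, the register names, and the fact that this backend interprets the code instead of emitting target code are all assumptions chosen to keep the example short.

```python
import operator

# Binary operations understood by the toy backend.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def frontend(expr):
    """Analysis side: lower an expression to register-machine instructions.

    An expression is either a number or a tuple (op, left, right).
    Returns (instructions, name of the register holding the result).
    """
    code = []

    def lower(e):
        if isinstance(e, tuple):
            op, left, right = e
            a = lower(left)
            b = lower(right)
            dst = f"r{len(code)}"  # one fresh register per instruction
            code.append((op, dst, a, b))
        else:
            dst = f"r{len(code)}"
            code.append(("load", dst, e))
        return dst

    result = lower(expr)
    return code, result


def backend(code, result):
    """Synthesis side (simplified): execute the instructions directly."""
    regs = {}
    for instr in code:
        if instr[0] == "load":
            _, dst, value = instr
            regs[dst] = value
        else:
            op, dst, a, b = instr
            regs[dst] = OPS[op](regs[a], regs[b])
    return regs[result]


if __name__ == "__main__":
    ir, res = frontend(("+", 2, ("*", 3, 4)))
    for instruction in ir:
        print(instruction)
    print(backend(ir, res))
```

The backend never sees the original expression, only the instruction list, so either side could be replaced as long as the intermediate format stays the same.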

Abstract Syntax Trees

Abstract syntax trees or parse trees are another common form of intermediate language. The amount of information in such a tree is reduced to obtain a more efficient representation of the program. In an abstract syntax tree the leaf nodes are variables or constants and the non-leaf nodes are operators. A preorder or postorder traversal of such a tree yields the expression in prefix or postfix notation, respectively. The advantage of an abstract syntax tree is that it can be easily restructured, which makes it a good representation for optimization.
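
A minimal sketch of the two traversals, assuming a binary tree with a hypothetical `Node` class (the class and the example expression are illustrative and not part of ICode):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Operator (inner node) or variable/constant (leaf) of an expression."""
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def preorder(node):
    """Visit the node first, then its subtrees: prefix notation."""
    if node is None:
        return []
    return [node.value] + preorder(node.left) + preorder(node.right)


def postorder(node):
    """Visit the subtrees first, then the node: postfix notation."""
    if node is None:
        return []
    return postorder(node.left) + postorder(node.right) + [node.value]


# Tree for the expression a + b * c
tree = Node("+", Node("a"), Node("*", Node("b"), Node("c")))

if __name__ == "__main__":
    print(" ".join(preorder(tree)))   # + a * b c
    print(" ".join(postorder(tree)))  # a b c * +
```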

The ICode intermediate language

The compiler environment this work is integrated into already works with an intermediate code representation that is simply called Intermediate Code (ICode). In this section I will briefly explain some of the concepts that are used in ICode to represent elements of a programming language. The source language for the compilation process consists of diagrams that are used, for example, to model the behavior of an embedded system. The target language is C, which can then be compiled for a concrete microprocessor architecture. This results in an intermediate representation that has to be able to represent the C programming language, general structures of software development, and some special elements that provide an optimized representation for embedded systems.

General structure

The general structure of the ICode elements can be seen in the figure. It starts off with the root element of the ICode tree, called Application. This Application element is used to represent a project from a software development perspective. The Application can contain several Subapplications. At the moment only one Subapplication is allowed per Application, but the concept of ICode allows for more. Subapplications represent sub-projects and contain so-called Codeunits. ICode is designed to represent C, and therefore a Codeunit is an element that contains the combination of a Header-File and a Source-File, in which the Source-File is optional.
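
The hierarchy described above, together with the export and reimport idea from the introduction, can be sketched as follows. The element names come from the text; the dataclass layout, the `name` attributes, and the choice to store file names as element text are assumptions, since the actual ICode XML format is not shown here.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Codeunit:
    header_file: str
    source_file: Optional[str] = None  # the Source-File is optional


@dataclass
class Subapplication:
    name: str  # hypothetical attribute
    codeunits: List[Codeunit] = field(default_factory=list)


@dataclass
class Application:
    name: str  # hypothetical attribute
    subapplications: List[Subapplication] = field(default_factory=list)


def export_xml(app):
    """Serialize the hierarchy to an XML string."""
    root = ET.Element("Application", name=app.name)
    for sub in app.subapplications:
        sub_el = ET.SubElement(root, "Subapplication", name=sub.name)
        for unit in sub.codeunits:
            unit_el = ET.SubElement(sub_el, "Codeunit")
            ET.SubElement(unit_el, "Header-File").text = unit.header_file
            if unit.source_file is not None:
                ET.SubElement(unit_el, "Source-File").text = unit.source_file
    return ET.tostring(root, encoding="unicode")


def import_xml(text):
    """Rebuild the hierarchy from an XML string produced by export_xml."""
    root = ET.fromstring(text)
    app = Application(root.get("name"))
    for sub_el in root.findall("Subapplication"):
        sub = Subapplication(sub_el.get("name"))
        for unit_el in sub_el.findall("Codeunit"):
            sub.codeunits.append(Codeunit(
                header_file=unit_el.findtext("Header-File"),
                # findtext returns None when the element is absent
                source_file=unit_el.findtext("Source-File"),
            ))
        app.subapplications.append(sub)
    return app


if __name__ == "__main__":
    app = Application("demo", [Subapplication("main", [
        Codeunit("a.h", "a.c"),
        Codeunit("b.h"),
    ])])
    text = export_xml(app)
    print(text)
    print(import_xml(text) == app)
```

Comparing the reimported object with the original gives a simple round-trip check of the kind that the export and import interface makes possible.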