03-01-2013, 12:53 PM
Semantic Analysis
1Semantic Analysis.pdf (Size: 110.63 KB / Downloads: 58)
What Is Semantic Analysis
Parsing only verifies that the program consists of tokens arranged in a syntactically
valid combination. Now we’ll move forward to semantic analysis, where we delve even
deeper to check whether they form a sensible set of instructions in the programming
language. Whereas any old noun phrase followed by some verb phrase makes a
syntactically correct English sentence, a semantically correct one has subject-verb
agreement, proper use of gender, and the components go together to express an idea
that makes sense. For a program to be semantically valid, all variables, functions,
classes, etc. must be properly defined, expressions and variables must be used in ways
that respect the type system, access control must be respected, and so forth. Semantic
analysis is the front end’s penultimate phase and the compiler’s last chance to weed out
incorrect programs. We need to ensure the program is sound enough to carry on to
code generation.
A large part of semantic analysis consists of tracking variable/function/type
declarations and type checking. In many languages, identifiers have to be declared
before they’re used. As the compiler encounters a new declaration, it records the type
information assigned to that identifier. Then, as it continues examining the rest of the
program, it verifies that the type of an identifier is respected in terms of the operations
being performed. For example, the type of the right side expression of an assignment
statement should match the type of the left side, and the left side needs to be a properly
declared and assignable identifier. The parameters of a function should match the
arguments of a function call in both number and type. The language may require that
identifiers be unique, thereby forbidding two global declarations from sharing the same
name. Arithmetic operands will need to be of numeric—perhaps even the exact same
type (no automatic int-to-double conversion, for instance). These are examples of the
things checked in the semantic analysis phase.
Type Checking
Type checking is the process of verifying that each operation executed in a program
respects the type system of the language. This generally means that all operands in any
expression are of appropriate types and number. Much of what we do in the semantic
analysis phase is type checking. Sometimes the rules regarding operations are defined
by other parts of the code (as in function prototypes), and sometimes such rules are a
part of the definition of the language itself (as in "both operands of a binary arithmetic
operation must be of the same type"). If a problem is found, e.g., one tries to add a char
pointer to a double in C, we encounter a type error. A language is considered stronglytyped
if each and every type error is detected during compilation. Type checking can be
done compilation, during execution, or divided across both.
Case Study: ML Data Type
ML, or Meta-Language, is an important functional language developed in Edinburgh in
the 1970’s. It was developed to implement a theorem prover, but recently, it has gained
popularity as a general purpose language. ML deals with data types in a novel way.
Rather than require declarations, ML works very hard to infer the data types of the
arguments to functions.
Implementation
The first step in implementing a type checker for a compiler is to record type
information for each identifier. All a scanner knows is the name of the identifier so that
it what is passed to the parser. Typically, the parser will create some sort of
"declaration" record for each identifier after parsing its declaration to be stored for later.
On encountering uses of that identifier, the semantic analyzer can lookup that name and
find the matching declaration or report when no declaration has been found.
Type Equivalence of Compound Types
The equivalence of base types is usually very easy to establish: an int is equivalent only
to another int, a bool only to another bool. Many languages also require the ability to
determine the equivalence of compound types. A common technique in dealing with
compound types is to store the basic information defining the type in tree structures.