
Seminar Report on Intel Dual Core Processor

Abstract

The Intel® Core™ Duo processor is a new member of the Intel® mobile processor product line. It is the first Intel® mobile microarchitecture that uses CMP (Chip Multi-Processing; i.e., multiple cores on one die) technology. Targeted at the market of general-purpose mobile systems, the Intel® Core™ Duo processor was built to achieve high performance while consuming low power and fitting into different thermal envelopes.
To achieve the required performance, a CMP-based microarchitecture was designed. To keep the architecture power-efficient, each performance improvement was evaluated against its power cost, and only the power-efficient performance features were implemented.
On top of that, special hardware mechanisms were added to better control the static and the dynamic power consumption. As a result, the Intel® Core™ Duo processor provides higher performance in the same form factors without needing to increase the cooling capability.

INTRODUCTION

The Intel® Core™ Duo processor is a new member of the Intel® mobile processor product line. It is the first Intel® mobile microarchitecture that uses CMP (multiple cores on one die) technology. Building a general-purpose mobile core is a challenging task: on the one hand, the system needs to maintain the highest level of performance, while on the other hand, it must fit into different thermal envelopes, as illustrated by Figure 1, and improve power efficiency.
Intel® Core™ Duo is based on the Pentium® M processor 755/745 core microarchitecture, with a few performance improvements at the level of each single core. The major performance boost comes from the integration of dual cores on the die (CMP architecture). This agrees with our assessment that continuing to improve single-thread performance is rather costly in terms of power and may yield diminishing returns in terms of efficiency if major microarchitecture enhancements are not made. The big potential for improved performance lies in exploiting parallelism between threads. However, the CMP architecture presents many challenges for power and thermal control if it is still to fit within the mobility constraints.

The improved Pentium® M processor-based cores

The core of the Intel® Core™ Duo processor-based technology is an enhanced Pentium® M processor 755/745 core converted to 65nm process technology. The main focus of the core enhancements was to do the following:
• Support virtualization (Virtualization Technology) [3].
• Support the new Streaming SIMD Extension (SSE3) [4].
• Address performance inefficiencies, mainly in the handling of SSE/SSE2, FP (x87) and some long-latency integer instructions.

Intel® Core™ Duo processor-based technology core performance improvements

Intel® Core™ Duo processor-based technology introduces performance improvements in the following areas:
• Streaming SIMD Extensions (SSE/2/3)
• Floating Point (x87)
• Integer
The main difficulty with the SSE implementation in the Pentium M is that SSE/2/3 operations work on 128-bit wide vectors while the Pentium M execution core is 64 bits wide (in order to meet power and energy constraints). Making the machine twice as wide may produce more heat, and so would have a significant impact on the Thermal Design Point (TDP) of the system as well as some impact on battery life. Since the Pentium M was primarily designed for mobility, we preferred to keep it relatively narrow and cope with the SSE performance issues.
The by-product of this tradeoff is that each SSE vector operation is "broken" into pairs of 64-bit wide micro-operations (uOps). Such instructions suffer from several performance bottlenecks in the Pentium M pipeline, mainly in the Front End (FE). For example, the Instruction Decoder in the Pentium M processor can potentially handle three instructions per cycle, but only the first of the three decoders is capable of handling complex instructions; the other two are limited to single-uOp instructions. This works well in most cases, since the most frequent instructions are single-uOp. However, this is not the case with SSE instructions: only scalar SSE operations are single uOps, while the vector operations are typically 2-4 uOps. As a result, the Instruction Decoder in the Pentium M can handle only one SSE vector operation per cycle, starving the rest of the machine.
This bottleneck was addressed in the Intel® Core™ Duo core: a new mechanism was introduced that allows lamination of pairs of similar uOps. This mechanism, along with enhanced uOp fusion, allows an SSE/2/3 vector operation to be handled as a single laminated uOp. The instruction decoders were modified to handle three such instructions per cycle, significantly increasing the decode bandwidth for SSE vector operations.
At a certain point further down the pipe, the laminated uOps are un-laminated, reproducing the 64-bit wide uOp pairs that feed the machine. These changes not only improve the performance of vector operations but also save some energy, since the FE, no longer a bottleneck, can be clock gated whenever its uOp buffer is filled beyond a certain watermark.
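The decode bottleneck described above can be illustrated with a small model. This is only a sketch under simplifying assumptions, not a description of the real decoder: it assumes strictly in-order decoding, three decoders of which only the first accepts multi-uOp instructions, and it treats a laminated vector operation as a single uOp at decode time. The function name decode_cycles and the instruction streams are hypothetical.

```python
def decode_cycles(uops_per_instr, simple_decoders=2):
    """Count the cycles needed to decode an in-order instruction stream.

    Each entry of uops_per_instr is the number of uOps one instruction
    decodes into. Per cycle, the first decoder accepts any instruction,
    while the remaining decoders accept single-uOp instructions only.
    """
    cycles = 0
    i = 0
    n = len(uops_per_instr)
    while i < n:
        cycles += 1
        i += 1  # first decoder: takes the next instruction, whatever its size
        taken = 0
        # Remaining decoders: stop at the first multi-uOp instruction,
        # because decoding is assumed to be in program order.
        while i < n and taken < simple_decoders and uops_per_instr[i] == 1:
            i += 1
            taken += 1
    return cycles


# Twelve SSE vector operations, each split into a pair of 64-bit uOps.
split_stream = [2] * 12
# The same twelve operations, each carried as one laminated uOp.
laminated_stream = [1] * 12

print(decode_cycles(split_stream))      # 12: one vector operation per cycle
print(decode_cycles(laminated_stream))  # 4: three operations per cycle
```

In this toy model, lamination raises decode throughput for a pure vector stream from one to three operations per cycle, which matches the qualitative claim in the text. Real gains depend on the instruction mix and on the rest of the pipeline.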

CMP-General structure

Intel® Core™ Duo processor-based technology implements a shared cache-based CMP microarchitecture in order to maximize the performance of both single-threaded (ST) and multi-threaded (MT) applications (assuming the same L2 cache size). Figure 3 describes the general structure of our implementation. The figure shows the following:
• Each core is assumed to have an independent APIC unit to be presented to the OS as a "separate logical processor."
• From an external point of view the system behaves like a Dual Processor (DP) system.
• From the software point of view, it is fully compatible with Intel® Pentium® 4 processors with Hyper-Threading (HT) Technology [6], and with DP-based systems. However, special optimizations could be applied to improve performance with the shared-cache organization.
• Each core has an independent thermal control unit (discussed later in this paper and also covered in [2]).
• The system combines per-core power state together with package-level power state.
The paper CMP Implementation in Intel® Core™ Duo Systems [1] extends the discussion of the CMP implementation and compares its performance with other configurations, such as a split-cache architecture. The results shown there indicate that the proposed microarchitecture maximizes the performance benefits of both ST and MT execution at a given cache size. The enhancements we implemented in each of the cores allow us to improve both ST performance (in specific cases) and MT execution. They also allow us to improve the power and thermal control of the system, and to achieve an average power consumption similar to that of the single-core Pentium® M processor.
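The text states that per-core power states are combined with a package-level power state, but it does not spell out the rule. One plausible rule, assumed here purely for illustration, is that resources shared by both cores can only go as deep as the shallowest state requested by any core. The state labels C0 to C4 follow the usual ACPI-style naming and are not taken from this report. The names CState and package_state are hypothetical.

```python
from enum import IntEnum


class CState(IntEnum):
    """Illustrative idle states; a higher value means a deeper sleep."""
    C0 = 0  # active, executing instructions
    C1 = 1
    C2 = 2
    C3 = 3
    C4 = 4


def package_state(core_states):
    """Assumed rule: the package follows the shallowest per-core state."""
    return min(core_states)


# One core is in a deep sleep state while the other is only lightly idle,
# so under this assumption the shared resources stay at the lighter state.
print(package_state([CState.C4, CState.C1]).name)
```

Under this assumption, a single active core keeps the package-level resources active, while each core can still save power on its own.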

Power control

Extending the battery life while improving performance was one of the main goals in designing the Intel® Core™ Duo processor. Battery life is affected by dynamic power, which is consumed while the processor is active, and by static power, which is the power wasted when a unit or the entire processor is not active. The Intel® Core™ Duo microarchitecture reduces both types of power.
Figure 4 describes the general process we followed to reduce power during the development cycle of the Intel® Core™ Duo processor. As can be seen, the average power consumption was reduced by handling the problem at all levels of the design, starting with adjusting the process technology and continuing through all the design stages up to production.
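The distinction between dynamic and static power can be made concrete with the standard first-order CMOS model: dynamic power scales with switching activity, switched capacitance, the square of the supply voltage, and the clock frequency, while static power is roughly supply voltage times leakage current. This is a textbook approximation, not a model taken from this report, and every number below is made up for illustration only.

```python
def dynamic_power(activity, capacitance_f, voltage_v, frequency_hz):
    """First-order switching power in watts: activity * C * V^2 * f."""
    return activity * capacitance_f * voltage_v ** 2 * frequency_hz


def static_power(voltage_v, leakage_current_a):
    """First-order leakage power in watts: V * I_leak."""
    return voltage_v * leakage_current_a


# Illustrative values only; they do not describe any real processor.
activity, capacitance, frequency = 0.5, 2e-9, 2e9

full = dynamic_power(activity, capacitance, 1.0, frequency)
half_v = dynamic_power(activity, capacitance, 0.5, frequency)

# Halving the supply voltage cuts dynamic power to one quarter.
print(half_v / full)

# Clock gating a unit drives its activity, and so its dynamic power, to zero,
# but leakage remains until the supply voltage is lowered or removed.
print(dynamic_power(0.0, capacitance, 1.0, frequency))
print(static_power(1.0, 0.5))
```

The quadratic dependence on voltage is why lowering voltage together with frequency is so effective against dynamic power, and the leftover leakage term is why static power needs mechanisms of its own.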