22-06-2013, 03:21 PM
Improving the Performance of Shared Memory Communication in Impulse C
Abstract
With the evolution of field-programmable gate arrays
(FPGAs) to the million-gate scale, high-level languages are
gaining popularity in electronic system design, greatly improving
design and verification efficiency. Impulse C is a high-level
language widely used in software/hardware (SW/HW) codesign
that provides users with various SW/HW communication mechanisms.
However, the communication mechanisms of Impulse C are
designed mainly for versatility, and the resources within the FPGA
chip are not fully utilized. In this letter, we present an improved
implementation of the shared memory communication in Impulse
C that utilizes both ports of the dual-port BRAM. Experimental results
show that the improved implementation greatly improves
the performance of shared memory communication and further
improves the execution efficiency of hardware processes.
INTRODUCTION
With the development of deep submicron technology,
millions of gates can be integrated on a single field-programmable
gate array (FPGA) chip, making the design of large-scale
application systems on FPGAs possible. Flexible
software modules and high-performance hardware modules
are usually combined to implement sophisticated high-performance
embedded systems [1], [2]. Traditionally, FPGA-based
hardware modules are designed either by hardware description
languages, such as very high speed integrated circuit hardware
description language (VHDL) and Verilog HDL, or by
GUI-based approaches in which function units are specified by
functional blocks. These design methodologies have two major
problems. First, system designers are intensively involved in
the design process and proficiency in hardware description
languages is mandatory, so the methodologies cannot scale to
the design of complex application systems that typically utilize
millions of gates. Second, traditional design methodologies
can hardly meet the needs of software/hardware (SW/HW)
codesign and coverification.
THE IMPROVED SHARED MEMORY COMMUNICATION
In this section, we first present the original hardware architecture
of the shared memory communication of Impulse C.
Then we introduce the modifications to the hardware architecture of the
original shared memory communication mechanism.
The implementation of SMCI and its integration into
the CoDeveloper development environment are detailed as well.
A. The Original Hardware Architecture and Its Limitation
The original hardware architecture of the shared memory
communication of Impulse C is illustrated in Fig. 1. Both external
memory and internal BRAM serve as shared memory, and Impulse
C requires that all memory modules be connected to the
on-chip peripheral bus (OPB). If a software process running on the PowerPC core tries
to access the shared memory, it must traverse the processor local
bus (PLB), the OPB, and "Opb_BRAM_if_cntlr" sequentially (the
dashed line in Fig. 1). Hardware processes are custom IP cores
that act as PLB slaves and OPB masters. A hardware process
must use the shared memory access controller "Opb_dma" to
access shared memory via the OPB and "Opb_BRAM_if_cntlr"
(the dotted line in Fig. 1). The "Communication Interface" is
the connection between the PLB and the user logic. Two unidirectional
communication channels that implement the Impulse
C signal mechanism form a duplex channel
that synchronizes software processes and hardware processes.
Hardware processes obtain the base address of the shared memory via
a data stream.
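From the application designer's point of view, the synchronization and access pattern described above can be sketched as follows. This fragment is an illustrative sketch, not code from the paper; the process and channel names (`shared_mem`, `start_sig`, `done_sig`) are hypothetical, and it assumes the standard Impulse C shared-memory and signal API (`co_memory_readblock`, `co_memory_writeblock`, `co_signal_wait`, `co_signal_post`):

```c
/* Illustrative sketch (not from the paper): a hardware process that
 * waits for a signal from the software side, reads a block from the
 * shared memory (routed through Opb_dma / the OPB in the original
 * architecture), and posts a completion signal back. Names are
 * hypothetical; the calls follow the standard Impulse C API. */
#include "co.h"   /* Impulse C application library (vendor toolchain) */

#define BLOCK_BYTES 256

void hw_process(co_memory shared_mem,
                co_signal start_sig,
                co_signal done_sig)
{
    int32 status;
    int32 buf[BLOCK_BYTES / sizeof(int32)];

    co_signal_wait(start_sig, &status);        /* wait for SW process   */
    co_memory_readblock(shared_mem, 0,         /* offset in shared mem  */
                        buf, BLOCK_BYTES);
    /* ... compute on buf ... */
    co_memory_writeblock(shared_mem, 0, buf, BLOCK_BYTES);
    co_signal_post(done_sig, 1);               /* notify SW process     */
}
```

Because the improved controller described in this letter keeps the same `co_memory_*` interface, application code of this shape would not need to change to benefit from it.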
CONCLUSION AND FUTURE WORK
In this letter, we present an improved implementation of the
shared memory communication in Impulse C by utilizing the
dual ports of BRAM. A new shared memory interface controller
is designed and integrated into the CoDeveloper environment
in a way that is transparent to application designers. Experimental
results on the Xilinx PowerPC platform show that the shared
memory access performance and the execution efficiency of
hardware processes are greatly improved, with only a very
small resource overhead.
With the improved communication performance, the results
of HW/SW partitioning may be different (e.g., more components
could be implemented as HW since there is more memory
access bandwidth available). We would like to study the interplay
between communication performance and HW/SW partitioning
in our future work.