A Lock-Free, Cache-Efficient Multi-Core Synchronization Mechanism for Line-Rate

**seminar flower** · 28-09-2012, 12:12 PM

A Lock-Free, Cache-Efficient Multi-Core Synchronization Mechanism for Line-Rate Network Traffic Monitoring

.pdf

A Lock-Free, Cache-Efficient.pdf (Size: 248.96 KB / Downloads: 43)

Abstract

Line-rate data traffic monitoring in high-speed
networks is essential for network management. To satisfy the
line-rate requirement, one can leverage multi-core architectures
to parallelize traffic monitoring so as to improve information
processing capabilities over traditional uni-processor architec-
tures. Nevertheless, realizing the full potential of multi-core
architectures still needs substantial work, especially in the
face of the ever-increasing volume and complexity of network
traffic. This paper addresses the issue through the design
of a lock-free, cache-efficient synchronization mechanism that
serves as a basic building block for a general class of multi-
threaded, multi-core traffic monitoring applications. We embed
the synchronization mechanism into MCRingBuffer, a multi-
core shared ring buffer that provides fast data accesses among
threads running in different cores. MCRingBuffer allows con-
current lock-free data accesses and improves the cache locality
of accessing the control variables that are used for thread
synchronization.

INTRODUCTION

Monitoring the behavior of data traffic in communication
networks is essential for a variety of network management
applications such as accounting, resource provisioning, failure
diagnosis, and intrusion detection. One crucial requirement
of network traffic monitoring is to process packets at
line rate, meaning that the packet processing speed must
keep up with the bandwidth of the communication link
where data traffic is monitored. However, as the volume
of network traffic continuously surges, so does the link
bandwidth, typically to Gigabit-per-second scales. Also,
as networking applications are getting more diversified,
data traffic contains more sophisticated information that
further complicates the monitoring process. Conventional
uni-processor architectures may no longer support line-rate
traffic monitoring, and this motivates the need of more
advanced architectures.

MCRINGBUFFER

We describe MCRingBuffer, a shared ring buffer tailored
for multi-core architectures. We start by describing an existing
concurrent lock-free ring buffer that we use as a building
block for MCRingBuffer. Then we present MCRingBuffer,
including its design, sketch proof of correctness, and evaluation
results.

Background

Ring buffer design, also known as the producer/consumer
problem, is a classical problem in many introductory operating
systems textbooks. In the problem, we have a bounded
buffer with a fixed number of slots. A producer inserts
elements to the buffer only when the buffer is not full, while
a consumer extracts elements from the buffer only when the
buffer is not empty. In addition, the first-in-first-out (FIFO)
property needs to be guaranteed, meaning that the elements
extracted by the consumer appear in the same order as the
elements inserted by the producer.

Design

MCRingBuffer is built upon BasicRingBuffer to support
concurrent lock-free accesses, and enhances BasicRing-
Buffer with a key objective to improve the cache locality
of thread synchronization. In particular, MCRingBuffer is
a software-based data structure without any hardware synchronization
primitives. MCRingBuffer de-couples the data
and control operations as in BasicRingBuffer, so that it
can transfer data elements of general data types. Also, the
producer and consumer threads do not need to specifically
schedule their insert and extract operations to maximize the
performance gain of MCRingBuffer.
We make several assumptions for MCRingBuffer, while
these assumptions are inherited from BasicRingBuffer. First,
the single-producer/single-consumer model is used. Also,
reading and writing the shared control variables read and
write, which are of integer type, are indivisible, atomic
operations. In general, atomic operations on 32-bit integers
are guaranteed in modern 32/64-bit CPUs such as Intel (see
[5], Chapter 8). Furthermore, the memory model satisfies
sequential consistency [11], in which the order of operations
executed by all threads preserves the order of operations
specified by each individual thread.

RELATED WORK

In Section II-A2, we review different lock-free ring
buffers proposed in the literature. Here, we focus on reviewing
existing parallel network traffic monitoring schemes.
Kruegel et al. [7] propose a parallel, stateful intrusion
detection architecture that partitions and processes traffic on
different PCs (called sensors). Foschini et al. [1] extend [7]
with a parallel matching algorithm for stateful signatures.
Weaver and Sommer [19] propose an intrusion detection
system using a cluster of end hosts. Paxson et al. [12]
sketch a network monitoring system tailored for multi-core
architectures, and their main focus is on how to dispatch
packets to the analysis threads on different cores. Wang et
al. [18] apply the variant of FastForward [2] and show that
multi-core-based processing can boost the packet processing
speed to Gigabit levels. Similar work of multi-core-based
traffic monitoring is also found in [13], [20], whose major
focus is on the algorithmic design, such as flow classification
[13] and load balancing [20]. In contrast, our work on
MCRingBuffer mainly addresses a shared data structure that
minimizes the cost of inter-core (or inter-thread) communication
and applies to general multi-threaded, multi-core
network traffic monitoring systems.

CONCLUSIONS AND FUTURE WORK

We propose a basic primitive that enables line-rate network
traffic monitoring. We present MCRingBuffer, a lockfree,
cache-efficient ring buffer that achieves efficient thread
synchronization in multi-core architectures. MCRingBuffer
improves the cache locality of thread synchronization via:
(i) cache-line protection and (ii) batch updates of control
variables. Also, MCRingBuffer is a software-based data
structure that supports generic data types of elements.

Possibly Related Threads…
Thread		Author	Replies	Views	Last Post
	Ranked, Efficient and Secure Keyword search over encrypted cloud data PPT	seminar post	1	814	21-09-2017, 11:55 AM Last Post: jaseela123
	MAC Protocol for Reliable Multicast over Multi-Hop Wireless Ad Hoc Networks pdf	study tips	1	1,029	15-09-2017, 12:39 PM Last Post: jaseela123
	Enabling Secure and Efficient Ranked Keyword Search over Outsourced Cloud Data pdf	study tips	1	2,018	13-09-2017, 12:59 PM Last Post: jaseela123
	Green Computing for Efficient use of Energy and Electronic Waste Minimization Report	project girl	1	1,357	12-09-2017, 12:37 PM Last Post: jaseela123
	Implement Remote Procedure Call (RPC) mechanism for a file transfer across a network	seminar tips	1	2,095	09-09-2017, 01:11 PM Last Post: jaseela123
	A Multi-Dimensional Approach to Internet Security	Electrical Fan	0	9,063,061	25-08-2017, 09:32 PM Last Post: Electrical Fan
	Optical Free Space Communication	Computer Science Clay	0	10,364,880	25-08-2017, 09:32 PM Last Post: Computer Science Clay
	Free Space Optics	Computer Science Clay	0	12,463,887	25-08-2017, 09:32 PM Last Post: Computer Science Clay
	On-line Analytical Processing (OLAP)	computer science crazy	0	20,240,233	25-08-2017, 09:32 PM Last Post: computer science crazy
	Adaptive Replacement Cache Full Download Seminar Report and Paper Presentation	computer science crazy	0	15,494,625	25-08-2017, 09:32 PM Last Post: computer science crazy

Quick Reply
Message Type your reply to this message here. Disable Smilies	You have selected one or more posts to quote. Quote these posts now or deselect them.