A memcached implementation in JGroups
Memcached
Memcached is a widely used cache which can be distributed across a number of hosts. It is
essentially a hashmap storing key/value pairs. Its main methods are set(K,V), which adds a key/value
pair; get(K), which returns the value for a previously inserted key; and delete(K), which removes a
key/value pair.
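In client API terms, these three operations map onto an interface along the following lines. This is
an illustrative sketch only, not taken from any concrete client library; real clients differ in naming
and add options such as expiry times and flags.

    // Illustrative sketch of the memcached client surface; real client
    // libraries add expiry times, flags, async variants etc.
    public interface Cache<K, V> {
        void set(K key, V value);   // add or overwrite a key/value pair
        V get(K key);               // return the value, or null if absent
        void delete(K key);         // remove the key/value pair
    }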
Memcached is started on a given host and listens on a given port (11211 being the default). The
daemon is written in C, but clients can be written in any language and talk to the daemon via the
memcached protocol.
Typically, multiple memcached daemons are started, on different hosts. The clients are passed a list of
memcached addresses (IP address and port) and pick one daemon for a given key. This is done via
consistent hashing, which always maps the same key K to the same memcached server S. When a
server crashes or a new server is added, consistent hashing keeps the ensuing rehashing minimal:
most keys still map to the same server, and only the keys that hashed to a removed server are
remapped to a different one.
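The following is a minimal Java sketch of consistent hashing under these assumptions: a fixed list of
server addresses, 100 points per server on the ring, and String.hashCode() standing in for a real hash
function. All class and method names here are made up for illustration.

    import java.util.List;
    import java.util.SortedMap;
    import java.util.TreeMap;

    public class ConsistentHash {
        private final SortedMap<Integer, String> ring = new TreeMap<>();

        public ConsistentHash(List<String> servers) {
            // place each server at several points on the ring to even out the distribution
            for (String server : servers)
                for (int i = 0; i < 100; i++)
                    ring.put(hash(server + "-" + i), server);
        }

        public String serverFor(String key) {
            // walk clockwise to the first server point at or after the key's position
            SortedMap<Integer, String> tail = ring.tailMap(hash(key));
            return tail.isEmpty() ? ring.get(ring.firstKey()) : tail.get(tail.firstKey());
        }

        private static int hash(String s) {
            return s.hashCode() & 0x7fffffff; // non-negative; a production hash would be stronger
        }

        public static void main(String[] args) {
            ConsistentHash ch = new ConsistentHash(List.of("A:11211", "B:11211", "C:11211"));
            System.out.println(ch.serverFor("K")); // K always maps to the same server
        }
    }

Removing a server from the list only remaps the keys whose ring positions fell on that server's
points; all other keys keep their mapping, which is exactly the property described above.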
Memcached does not provide any redundancy (e.g. via replication of its hashmap entries); when a
server S is stopped or crashes, all key/value pairs hosted by S are lost.
The main goal of memcached is to provide a large distributed cache sitting in front of a database (or
file system). Applications typically ask the cache for an element and return it when found, or else ask
the DB for it. In the latter case, the element is added to the cache before returning it to the client and
will now be available from the cache on the next cache access.
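As a sketch, the cache-aside logic just described looks roughly as follows in Java. CacheAside,
loadFromDb() and the in-memory map standing in for a remote memcached instance are all hypothetical
names for illustration.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    public class CacheAside {
        // an in-memory map standing in for a remote memcached instance
        private final Map<String, String> cache = new ConcurrentHashMap<>();

        public String get(String key) {
            String value = cache.get(key);   // 1. ask the cache first
            if (value == null) {
                value = loadFromDb(key);     // 2. miss: fall back to the database
                cache.put(key, value);       // 3. populate the cache for the next access
            }
            return value;
        }

        private String loadFromDb(String key) { // hypothetical DB lookup
            return "value-of-" + key;
        }

        public static void main(String[] args) {
            CacheAside c = new CacheAside();
            System.out.println(c.get("K")); // first call hits the 'DB'
            System.out.println(c.get("K")); // second call is served from the cache
        }
    }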
This speeds up applications with good cache locality (fetching web pages comes to mind), because
even a round trip in a LAN is typically faster than a round trip to the DB server plus the disk access
needed to serve a SELECT statement.
In addition, clients now see a huge aggregated cache: if we start 10 memcached daemons with 2GB of
memory each, we get a 20GB (virtual) cache. This is larger than the physical memory of most hosts
today (2008).
Figure 1 shows a typical use of memcached. The example shows an instance of Apache which serves
HTTP requests and a Python client running in the mod_python Apache module. There are 3 instances
of memcached started, on hosts A, B and C. The Python client is configured with a list of [A,B,C].
When a request hits the Python code that requires a DB lookup, the Python code first checks the cache
by hashing the key to a server. Let's assume K is hashed to C, so now the client sends a GET(K) request
to C via the memcached protocol and waits for the result. If the result is not null, it is returned to the
caller as an HTTP response. Otherwise, the value for K is retrieved from the DB, inserted into the cache
(on host C) and then returned to the client. The next lookup for K (by any client) will then find K,V in
the cache (on host C) and will not have to perform a DB lookup.
The memcached protocol
PartitionedHashMap can also be accessed via the memcached protocol, so existing memcached clients
continue to work with it unmodified.
The setup would be the same as shown in Fig. 1, except that the backend would not consist of
memcached servers, but of JGroups MemcachedServer instances.
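As a sketch, starting two backend instances on different ports might look like the following. The
class name org.jgroups.demos.MemcachedServer and the -p flag are assumptions here; check the
JGroups manual of your version for the exact class name and options.

    java -cp jgroups.jar org.jgroups.demos.MemcachedServer -p 11211
    java -cp jgroups.jar org.jgroups.demos.MemcachedServer -p 11212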
Note that the full protocol has not been implemented; only GET (single and multi), SET, DELETE and
STATS are supported. Adding APPEND, PREPEND, CAS etc. is simple, so if you need them urgently,
post to the JGroups dev mailing list and I'll see that they get added.
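For reference, a typical exchange in the memcached text protocol (the subset that is implemented)
looks like this; client commands are shown first, server replies below each command, and every line
on the wire is terminated by \r\n:

    set K 0 0 5        (key, flags, expiry, number of data bytes)
    hello              (the 5 data bytes)
    STORED
    get K
    VALUE K 0 5
    hello
    END
    delete K
    DELETED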
The Python and Java clients work fine.
The Java performance test (MemCachedBench), which runs 10000 SET, GET and REMOVE requests
against 4 memcached instances on the one hand and 4 PartitionedHashMap instances on the other,
shows ca. 4 secs for SETs, 2.8 secs for GETs, 200 ms for multi-gets and 1.8 secs for removes.
The biggest overhead by far is the memcached protocol: in the above test, the protocol accounted for
ca. 90% of the entire run time.
The current memcached implementation is ca. 10% faster than PartitionedHashMap, but - again - the
difference lies mainly in the memcached protocol. A setup like the one shown in Fig. 3 should be really fast.