What is Memcached?


That's the point, right? If something isn't easy to understand, how easy will it be to implement and maintain? The challenge is to understand it in one go.

Memcached is a distributed memory object caching system. No, that is not the way I am going to explain it, so chill :).

We all know that our web servers host web applications that run database queries to fetch data, and more queries to insert, delete and update it. As simple as that sounds, it is crazy hard to keep such a system working when the web server receives millions of requests. Web servers need to be added in parallel to handle the load, databases need to be replicated so that data can be accessed in parallel, and then starts the complication of ensuring that data is consistent across all the requests. Then comes the challenge of making sure the data is available quickly enough.

A bad example is http://www.irctc.co.in. But it is important to understand how difficult it is to get this right, especially if your data is as complex as that of the Indian Railways. If this site is down, the nation stops.

Now for some basic computer fundamentals, just as a quick reminder. We all know that tape drives are slower than floppy disks. Floppy disks are slower than hard disks. Hard disks are slower than memory (RAM). And RAM is slower than the CPU's own cache memory. CPU cache is much more expensive than RAM, and the story continues: the faster the memory, the more expensive it is.

What is serialization? An object that can be written to disk byte by byte, and read back to form that object again, is called a Serializable object.
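Here is a minimal sketch of that round trip using only the JDK's built-in serialization. The Ticket class and the SerializationDemo helper are hypothetical names made up for this illustration; a byte array stands in for the disk (or for the bytes a cache client would send over the network).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical value class, used only for this illustration.
class Ticket implements Serializable {
    private static final long serialVersionUID = 1L;
    final String passenger;
    final int seat;

    Ticket(String passenger, int seat) {
        this.passenger = passenger;
        this.seat = seat;
    }
}

class SerializationDemo {
    // Turn an object into raw bytes.
    static byte[] toBytes(Serializable value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Rebuild the object from the bytes written by toBytes.
    static Object fromBytes(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Ticket original = new Ticket("Asha", 42);
        Ticket copy = (Ticket) fromBytes(toBytes(original));
        System.out.println(copy.passenger + " has seat " + copy.seat);
    }
}
```

The copy is a new object with the same field values as the original, which is what a cache gives you back when it stores objects as bytes.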

Now, imagine I store a file on a hard disk and read it. Will it be faster if I read it from memory? Obviously. We don't call fread (file read, File.read()) for every operation we want to perform. We first read the file into a String, and then work from that String. Memcached does something similar. It stores all the data given to it in memory, so that you can read it back faster than from a hard disk.

So, if you have a piece of code doing a lot of reads from hard disks (mind you, databases are also stored on big hard disks), AND that data needs to be read multiple times, then you can write that data to memcached. Instead of reading the data from the hard disk, all reads can go to memcached, thereby sparing the hard disk the extra effort for each read. It can do something more important in that time.

Memcached lets you identify the data that you just wrote with an identifier, a name, or a key. We call this a key-value pair.

When you start memcached, you specify the amount of memory that this machine will dedicate to it (the -m option, in megabytes), and memcached will use up to that much memory for itself. So, what happens when you run out of memory?

Simple: start memcached on another machine and give your client the addresses of both memcached machines (together they are called a cluster). The memcached servers themselves do not talk to each other. The client library hashes each key to decide which server in the cluster stores that object, so objects get spread across the servers.
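In practice the choice of server is made on the client side, from the key. The sketch below is a deliberately simplified illustration of that idea using a plain hash-modulo scheme; real client libraries tend to use more refined hashing, and the ServerPicker class and the server addresses are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

class ServerPicker {
    private final List<String> servers;

    ServerPicker(List<String> servers) {
        this.servers = servers;
    }

    // Map a key to one server; the same key always lands on the same server.
    String serverFor(String key) {
        // floorMod keeps the index non-negative even for negative hash codes.
        int index = Math.floorMod(key.hashCode(), servers.size());
        return servers.get(index);
    }

    public static void main(String[] args) {
        // Hypothetical addresses of two memcached machines.
        ServerPicker picker = new ServerPicker(Arrays.asList("10.0.0.1:11211", "10.0.0.2:11211"));
        for (String key : Arrays.asList("user:1", "user:2", "train:12627")) {
            System.out.println(key + " -> " + picker.serverFor(key));
        }
    }
}
```

Because every client computes the same answer for the same key, a value written by one web server can be found by another without the memcached machines coordinating.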

Now, your application needs to work out from its requests whether similar data is being requested by several of its clients. If it is, then the first call can save that common data in memcached, and subsequent calls can retrieve it from memcached.

Just remember that memcached is an “explicit cache”, which means you need to add stuff to memcached, remove it, and update it yourself. It does “nothing” automatically. At this point many people get turned off, but the whole concept of memcached is to save your trip to the database and reduce the need for clustering databases and web servers, since clustering memcached is very cheap by comparison.
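The explicit add, remove and update cycle can be sketched roughly like this. It is not memcached client code: two in-process maps stand in for memcached and for the database, and the CacheAside class and its method names are hypothetical, chosen only to show who is responsible for what.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class CacheAside {
    // Stand-in for memcached (assumption for illustration).
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    // Stand-in for the slow, disk-backed database (assumption for illustration).
    private final Map<String, String> database = new ConcurrentHashMap<>();
    private int databaseReads = 0;

    String read(String key) {
        String cached = cache.get(key);
        if (cached != null) {
            return cached;               // served from memory, no database trip
        }
        databaseReads++;
        String fresh = database.get(key);
        if (fresh != null) {
            cache.put(key, fresh);       // the application adds to the cache itself
        }
        return fresh;
    }

    void write(String key, String value) {
        database.put(key, value);
        cache.remove(key);               // the application removes the stale entry itself
    }

    int databaseReads() {
        return databaseReads;
    }

    public static void main(String[] args) {
        CacheAside store = new CacheAside();
        store.write("train:12627", "on time");
        store.read("train:12627");       // goes to the database, then fills the cache
        store.read("train:12627");       // served from the cache
        System.out.println("database reads: " + store.databaseReads());
    }
}
```

The second read never touches the database, and the write has to invalidate the cached entry by hand; forgetting that step is how stale data creeps in.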

This should help you understand the fundamentals of memcached.


INFO net.spy.memcached.MemcachedConnection: Reconnecting {QA sa=0.0.0.0/0.0.0.0:11211,}


I kept getting this annoying message every 30 seconds (I don't intend to offend anyone here :))

2012-02-09 13:55:59.322 INFO net.spy.memcached.MemcachedConnection:  Reconnecting {QA sa=0.0.0.0/0.0.0.0:11211, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0}
2012-02-09 13:55:59.323 INFO net.spy.memcached.MemcachedConnection:  Connection state changed for sun.nio.ch.SelectionKeyImpl@15f0688
2012-02-09 13:55:59.323 INFO net.spy.memcached.MemcachedConnection:  Reconnecting due to failure to connect to {QA sa=0.0.0.0/0.0.0.0:11211, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0}
java.net.ConnectException: Connection refused
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
	at net.spy.memcached.MemcachedConnection.handleIO(MemcachedConnection.java:313)
	at net.spy.memcached.MemcachedConnection.handleIO(MemcachedConnection.java:199)
	at net.spy.memcached.MemcachedClient.run(MemcachedClient.java:1622)
2012-02-09 13:55:59.324 WARN net.spy.memcached.MemcachedConnection:  Closing, and reopening {QA sa=0.0.0.0/0.0.0.0:11211, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0}, attempt 26.

This is what I had running on my setup, all on localhost.
Server

prompt$ /usr/bin/memcached -m 64 -U 11211 -p 11211 -l 127.0.0.1
prompt$ ps -ef | grep memcached
username  12029  2961  0 13:32 pts/2    00:00:00 /usr/bin/memcached -m 64 -U 11211 -p 11211 -l 127.0.0.1
prompt$ sudo netstat -anp | grep 11211
tcp        0      0 127.0.0.1:11211         0.0.0.0:*               LISTEN      12029/memcached 
tcp        0      0 127.0.0.1:11211         127.0.0.1:36642         ESTABLISHED 12029/memcached 
tcp6       0      0 127.0.0.1:36642         127.0.0.1:11211         ESTABLISHED 13941/java      
udp        0      0 127.0.0.1:11211         0.0.0.0:*                           12029/memcached 
unix  2      [ ACC ]     STREAM     LISTENING     11211    1821/dbus-daemon    @/tmp/dbus-TelpJZAoJl

Client

private MyClass() throws UnknownHostException, IOException {
	super(new InetSocketAddress("127.0.0.1", 11211));
}

Results
I am able to set/get with no issues at all. My cache populates and works perfectly well. There is just that warning/info message in the log4j output.

To fix this, I started memcached without “-l 127.0.0.1”:

prompt$ /usr/bin/memcached -m 64 -U 11211 -p 11211

My memcached still works well, without that message.

WARN net.spy.memcached.transcoders.SerializingTranscoder: Caught IOException decoding bytes of data


I started to get a weird exception:
java.io.InvalidClassException: package.com.something.Some_Hibernate_Table; no valid constructor
at java.io.ObjectStreamClass.checkDeserialize(ObjectStreamClass.java:730)
.
.
.
Some_Hibernate_Table

The stack trace was very weird: it hinted at a launch stack, and at an exception from a table other than the one I was sending data to. In fact, after some step-by-step debugging, I found that this exception happened on a get on SomeOther_Hibernate_Table.
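As general background on the message itself, and separate from what went wrong in my code: Java serialization reports "no valid constructor" at read time when a Serializable class extends a non-serializable class that has no accessible no-argument constructor. The sketch below reproduces that with two hypothetical classes, BaseRow and ChildRow; it does not involve Hibernate or memcached at all.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical parent: not Serializable and no no-argument constructor.
class BaseRow {
    final int id;

    BaseRow(int id) {
        this.id = id;
    }
}

// Hypothetical child: Serializable, but its parent cannot be rebuilt on read.
class ChildRow extends BaseRow implements Serializable {
    private static final long serialVersionUID = 1L;

    ChildRow(int id) {
        super(id);
    }
}

class NoValidConstructorDemo {
    // Returns "ok" if the value survives a round trip, else the exception message.
    static String roundTrip(Serializable value) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(value);   // writing succeeds
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
                in.readObject();          // reading is where the check fails
            }
            return "ok";
        } catch (InvalidClassException e) {
            return e.getMessage();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new ChildRow(7)));
    }
}
```

Note that the failure only shows up on the read, which matches the feeling that a set worked fine and a later get blew up.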

Crazy root cause.. 🙂
Old code
List tableRow = null;
if (isCacheable) {
    // Problem: the cache holds a single Object, but it was handed back as the List
    tableRow = (List) cache.get(key);
    return tableRow;
}
String rowQuery = "Select * from " + tableName + " where " + column + "=" + "\"" + key + "\"";
Query rowResult = currentSession().createSQLQuery(rowQuery);
tableRow = rowResult.list();

The cache returns an Object, but the Hibernate query's list() returns a List. So, to keep things clean, here is the fix:
List tableRow = null;
Object value = cache.get(key);
if (isCacheable) {
    if (value != null) {
        // Wrap the single cached Object in a List so both paths return the same type
        tableRow = new ArrayList();
        tableRow.add(value);
        return tableRow;
    }
}
// Cache miss (or not cacheable): fall back to the database
String rowQuery = "Select * from " + tableName + " where " + column + "=" + "\"" + key + "\"";
Query rowResult = currentSession().createSQLQuery(rowQuery);
tableRow = rowResult.list();