Wednesday, April 21, 2010

A Word About Caching: Memcached and APC

Sometimes when talking with developers I see that there are some misconceptions regarding this two caching systems. That's why I'd like to share some concepts I've learned along the way.

What does APC do?

APC has two main features, one is to cache the PHP opcodes, to speed up the page delivery. This means that our PHP code doesn't have to be parsed every time it gets executed. The other one is to cache user variables, so you can cache the results of expensive_function() in APC, using apc_store, for example, and apc_fetch to get the value back without needing to execute the expensive_function() function again.

A common technique employed by frameworks like symfony is to parse some configuration files, like YAML ones, and the var_export them into a PHP file. Something like:


<?php
$some_config = array('value_a', 'value_b');
?>


So later the framework just includes that file avoiding to parse the YAML file twice. This is technique is just file caching, but if you happen to have APC enabled, then you can benefit from the fact that the opcodes were cached by APC. So:

Misconception #1:

That APC caches the opcodes doesn't mean that loading whatever that file has won't be expensive performance wise. As an example, let's say that the framework caches a big array of configuration values. That array has to be loaded in memory again. AFAIK that can't be avoided. So, that APC caches the opcodes, doesn't mean that reloading it comes for free.

And what about Memcached?

Misconception #2:

Another misconception that I've heard is that Memcached is fast and it's speed is compared with hell's speed. That doesn't mean that is faster than accessing something from the PHP process memory, which is the case with APC. Every time you retrieve a value from Memcached, it has to perform a TCP roundtrip to get that value, besides opening a TCP connection to the server per request if you are not using persistent connections.

So while Memcached is fast, I won't recommend to use it for small values that are frequently accessed. APC can work just fine there –taking into account the size of the values of course–. So in the case of symfony, we use Memcached to cache view templates, results from queries, and things like that. But in the case of the routing generation calls, like the ones for url_for or link_to, I would prefer APC, which is what we use for the routing configuration.

Keep in mind that this is based on my experience, so take this for what it is and of course I'll love to read your comments about this topic here.