Wednesday, April 21, 2010

A Word About Caching: Memcached and APC

Sometimes when talking with developers I see that there are some misconceptions regarding these two caching systems. That's why I'd like to share some concepts I've learned along the way.

What does APC do?

APC has two main features. The first is an opcode cache, which speeds up page delivery: our PHP code doesn't have to be parsed and compiled every time it gets executed. The second is a cache for user variables: you can store the result of expensive_function() with apc_store and read it back later with apc_fetch, without executing expensive_function() again.
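
As a rough illustration of the user-variable cache, here is a minimal sketch built on apc_fetch and apc_store. The body of expensive_function(), the cached_expensive_function() helper, the key name and the 300-second TTL are all made up for the example, and the helper simply calls the function directly when the APC extension is not loaded.

```php
<?php
// Hypothetical stand-in for the expensive_function() mentioned in the text.
function expensive_function()
{
    // Pretend this takes a long time (a slow query, a remote call, ...).
    return array('answer' => 42);
}

// Hypothetical helper: return the cached result when there is one,
// otherwise compute it and store it for later requests.
function cached_expensive_function($ttl = 300)
{
    $key = 'expensive_function_result'; // made-up key name

    // Without the APC extension there is nothing to cache in.
    if (!function_exists('apc_fetch') || !function_exists('apc_store')) {
        return expensive_function();
    }

    // $success tells a cache miss apart from a cached value of false.
    $value = apc_fetch($key, $success);
    if ($success) {
        return $value;
    }

    $value = expensive_function();
    apc_store($key, $value, $ttl); // keep it for $ttl seconds
    return $value;
}
```

The TTL is the part to tune: the longer it is, the fewer times expensive_function() runs, but the staler the value may get.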

A common technique employed by frameworks like symfony is to parse some configuration files, like YAML ones, and then var_export them into a PHP file. Something like:

$some_config = array('value_a', 'value_b');
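
To make the idea a bit more concrete, here is a minimal sketch of how such a file could be written and read back. It is not symfony's actual implementation: write_config_cache() and load_config_cache() are hypothetical names, the configuration is assumed to be already parsed into an array, and the generated file returns the array instead of assigning it to a variable.

```php
<?php
// Hypothetical helper: dump an already-parsed configuration array
// into a PHP file that returns it when included.
function write_config_cache($file, array $config)
{
    $code = '<?php return ' . var_export($config, true) . ';' . "\n";
    // LOCK_EX keeps two processes from writing the file at the same time.
    file_put_contents($file, $code, LOCK_EX);
}

// Hypothetical helper: load the configuration back with a plain include.
function load_config_cache($file)
{
    return include $file;
}
```

A caller would run write_config_cache() once, when the source file changes, and load_config_cache() on every request.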

So later the framework just includes that file and avoids parsing the YAML file again. This technique is just file caching, but if you happen to have APC enabled, you also benefit from the fact that APC caches the opcodes of that file. So:

Misconception #1:

The fact that APC caches the opcodes doesn't mean that loading whatever that file contains is cheap. For example, say the framework caches a big array of configuration values. That array still has to be loaded into memory again each time the file is included. AFAIK that can't be avoided. So even with the opcodes cached, reloading the file doesn't come for free.

And what about Memcached?

Misconception #2:

Another misconception I've heard concerns Memcached's speed: people say it is fast as hell. That doesn't mean it is faster than reading a value from local shared memory, which is what APC does. Every time you retrieve a value from Memcached, it takes a TCP round trip to get that value, on top of opening a TCP connection to the server on each request if you are not using persistent connections.
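
For the persistent connection part, here is a minimal sketch that assumes the pecl memcached extension (the Memcached class), which may not be the client you use. The get_memcached_client() helper, the 'app_pool' ID and the server address are made up for the example. Note that this only saves the connection setup; every read still pays the network round trip.

```php
<?php
// Hypothetical helper: return a Memcached client that keeps its connection
// open between requests, or null when the extension is not loaded.
function get_memcached_client($host = '127.0.0.1', $port = 11211)
{
    if (!class_exists('Memcached')) {
        return null;
    }

    // Passing a persistent ID makes the instance survive between requests.
    $client = new Memcached('app_pool'); // made-up ID

    // A persistent instance remembers its servers, so add them only once.
    if (count($client->getServerList()) === 0) {
        $client->addServer($host, $port);
    }

    return $client;
}
```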

So while Memcached is fast, I wouldn't recommend using it for small values that are accessed frequently. APC works just fine there (taking the size of the values into account, of course). In the case of symfony, we use Memcached to cache view templates, query results, and things like that. But for the routing generation calls, like the ones behind url_for or link_to, I prefer APC, which is what we use for the routing configuration.

Keep in mind that this is based on my own experience, so take it for what it is. Of course, I'd love to read your comments on this topic here.


Anonymous said...

On Misconception #2:
What you are saying makes sense if you have a small setup, i.e. 2-4 web servers. If you have more servers you are just wasting memory. In my experience it is better to use APC to store small variables that don't change more than once a day. I believe APC has a default cache size of 32 MB (not sure if this is still true today), so you don't want to store large objects in there.

I wouldn't be too worried about the "tcp roundtrip"; you can get around it with tricks like memcache multi-get, so you make only one TCP request per page.

Alvaro said...

Sure, as I said, it totally depends on what you are caching and, as you say, on your setup of course. Still, for small items I'd prefer APC.

But my goal is to avoid the thinking of "let's put everything into Memcached" or "everything into APC".

James Dempster said...

Helpful article that hopefully will help people realise their different uses.

We also use a mixture of both.

Generally, if it's something that needs to be shared by more than one server because you don't know where the user is going to land, we put it in memcache: things like sessions and user configuration.

If it's something that can be different or specific to each server, then we store it in APC: config details, etc.

Unknown said...

Just to give you a rule of thumb for memcaching:
memcache is slower than caching on the filesystem! So for single servers it doesn't make any sense.
But as soon as you can share one memcached for more than one web server it might be worth thinking about.

Unknown said...

Facebook actually uses both APC and Memcache, for different purposes. They use APC to store configuration values, and any sort of globally available data. Then they use memcache to store everything else. Their savings on keeping commonly used items local to each web server is huge.