Showing posts with label php. Show all posts

Wednesday, April 21, 2010

A Word About Caching: Memcached and APC

Sometimes when talking with developers I see that there are some misconceptions regarding these two caching systems. That's why I'd like to share some concepts I've learned along the way.

What does APC do?

APC has two main features. One is to cache the PHP opcodes to speed up page delivery, which means that our PHP code doesn't have to be parsed and compiled every time it gets executed. The other one is to cache user variables: you can store the result of expensive_function() in APC using apc_store, and later use apc_fetch to get the value back without executing expensive_function() again.
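
As a rough sketch of the user cache, the helper below wraps apc_fetch and apc_store around a slow call. The cached() helper, the cache key, the 300 second TTL and the body of expensive_function() are all made up for illustration. When the APC functions are not loaded, the sketch falls back to a plain static array so it still runs.

```php
<?php
// Hypothetical stand-in for a slow computation.
function expensive_function()
{
    usleep(1000);
    return array('answer' => 42);
}

// Returns the cached value for $key, calling $compute only on a cache miss.
function cached($key, $ttl, $compute)
{
    static $fallback = array(); // used only when the APC functions are missing

    if (function_exists('apc_fetch')) {
        $value = apc_fetch($key, $success);
        if ($success) {
            return $value;
        }
        $value = call_user_func($compute);
        apc_store($key, $value, $ttl);
        return $value;
    }

    if (!array_key_exists($key, $fallback)) {
        $fallback[$key] = call_user_func($compute);
    }
    return $fallback[$key];
}

$first  = cached('expensive_result', 300, 'expensive_function');
$second = cached('expensive_result', 300, 'expensive_function');
```

The second call is expected to be served from the cache whenever APC is enabled for the current SAPI, and both calls return the same value either way.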

A common technique employed by frameworks like symfony is to parse some configuration files, like YAML ones, and then var_export them into a PHP file. Something like:


<?php
$some_config = array('value_a', 'value_b');
?>


Later the framework just includes that file, which avoids parsing the YAML file again. This technique is just file caching, but if you happen to have APC enabled, then you also benefit from the fact that the opcodes of that file are cached by APC. So:

Misconception #1:

The fact that APC caches the opcodes doesn't mean that loading whatever that file contains won't be expensive performance-wise. As an example, let's say that the framework caches a big array of configuration values. That array has to be built in memory again on every request. AFAIK that can't be avoided. So the fact that APC caches the opcodes doesn't mean that reloading the file comes for free.
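
The file caching technique can be sketched roughly like this. The function names, the file location and the sample values are made up, and the generated file returns the array instead of assigning it to a variable, which is a small variation on the snippet above. Note that load_config_cache() rebuilds the whole array on every call, with or without an opcode cache.

```php
<?php
// Writes $config to $cacheFile as plain PHP code that returns the array.
function write_config_cache($cacheFile, array $config)
{
    $code = '<?php return ' . var_export($config, true) . ';' . "\n";
    file_put_contents($cacheFile, $code, LOCK_EX);
}

// Loads the cached configuration; the array is rebuilt on every include.
function load_config_cache($cacheFile)
{
    return include $cacheFile;
}

// Stand-in for the result of parsing a YAML file (hypothetical values).
$parsed = array('value_a', 'value_b', 'routing' => array('home' => '/'));

$cacheFile = sys_get_temp_dir() . '/config_cache_example.php';
write_config_cache($cacheFile, $parsed);
$loaded = load_config_cache($cacheFile);
```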

And what about Memcached?

Misconception #2:

Another misconception that I've heard is that Memcached is fast and that its speed compares with hell's speed. That doesn't mean that it is faster than accessing something in shared memory on the same machine, which is the case with APC. Every time you retrieve a value from Memcached, the client has to perform a TCP round trip to get that value, besides opening a TCP connection to the server on every request if you are not using persistent connections.

So while Memcached is fast, I wouldn't recommend using it for small values that are frequently accessed. APC can work just fine there (taking into account the size of the values, of course). So in the case of symfony, we use Memcached to cache view templates, results from queries, and things like that. But for the routing generation calls, like the ones behind url_for or link_to, I would prefer APC, which is what we use for the routing configuration.

Keep in mind that this is based on my experience, so take it for what it is. Of course, I'd love to read your comments about this topic here.

Monday, March 8, 2010

Erlang as a Fast Key Value Store for PHP

In this post I want to show you some of the neat things that can be done with the PHP-Erlang Bridge extension: A Key Value Store.

Erlang comes packed with a Key Value store in the form of the ETS module. This database is pretty fast and efficient for storing Erlang terms in memory.

I tried a proof of concept with the PHP extension and I obtained impressive results: storing 150,000+ items in ETS in 1 second! All that running on my MacBook Pro.

What I did was write a PHP class that wraps the calls to the Erlang ETS module. For example:

public function insert($key, $value)
{
    // Encode [TableName, {Key, Value}] as the argument list for ets:insert/2.
    $x = peb_encode("[~a, {~a, ~s}]", array(array(
        $this->name,
        array($key, $value)
    )));
    $result = peb_rpc("ets", "insert", $x, $this->link);
    return peb_decode($result);
}
Maps to this call in Erlang:
> ets:insert(tablename, {key, value}).
You can see the full code example here.

So here are the steps:

- Install the PEB extension from source
- Start Erlang with this command: erl -sname node -setcookie abc
- Create the ETS table: ets:new(test, [set, named_table, public]).
- Save the gist to a file and run it: php ets.php

So while this is a very simple proof of concept, I just wanted to illustrate some of the cool things that can be done with this extension. For example the speed of encoding/decoding from Erlang to PHP is pretty decent as well as the communication speed.

Please let me know your thoughts about it in the comments.

Monday, February 1, 2010

Sharing Sessions Between PHP and Ejabberd

In my last post I wrote about a pet project I started to share sessions between PHP and Python. In this post I want to show you how we can share the sessions between PHP and Ejabberd.

So here's the problem. In one of the projects where we want to use XMPP we have a users database of around 2.5 million users. We want those users to be able to log in to our Ejabberd server using the same database. This means that every time a user logs into our site, we will query the database with PHP to see if he's allowed to log in, and then Ejabberd will query the database again for the same purpose. Now, since the user is already authenticated in our PHP app, why don't we just share the session information with Ejabberd? Here's where InspectorD comes into play.

The first piece that we will use to solve this problem is Ejabberd's external authentication scripts. In our case, instead of authenticating against a database, we will use InspectorD to check whether a user is authenticated on our website. To do this we need to find some means of passing PHP's session_id to our auth script. How to do this?

In PHP there's a function called session_id() that returns the current session_id key. We will use this string as a user password for Ejabberd, so for example, using Strophe we can do something like this:


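// nickname and sessionId are placeholder names: the user's nickname and the
// value of PHP's session_id(), both rendered into the page by PHP.
connection = new Strophe.Connection(BOSH_SERVICE);
connection.connect(nickname + '@someserver', sessionId, onConnect);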


Then Ejabberd will call our external authentication script, passing that nickname and the session_id as password. In our case we store the session information in Memcache, so our script will use the class SessionInspectorMemcache from the InspectorD library. This class will connect to the session Memcache and from there will retrieve the session information belonging to that session_id. Finally it will return True or False depending on whether the user related to that session_id is authenticated or not.
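
To give an idea of what such a script has to deal with, here is a small PHP sketch of the message framing. It assumes the external authentication protocol as I understand it: each request arrives as a 2-byte big-endian length followed by a string like auth:user:server:password, and the answer is 4 bytes, a length of 2 followed by 1 or 0. The function names are made up, and this is not the Python script mentioned in this post.

```php
<?php
// Splits a request payload such as "auth:user:server:password" into its parts.
function extauth_parse_request($payload)
{
    // Limit the split to 4 parts because the password may itself contain ':'.
    list($op, $user, $server, $password) = array_pad(explode(':', $payload, 4), 4, '');
    return array('op' => $op, 'user' => $user, 'server' => $server, 'password' => $password);
}

// Prefixes a payload with its length as a 2-byte big-endian integer.
function extauth_frame($payload)
{
    return pack('n', strlen($payload)) . $payload;
}

// Builds the 4-byte answer: length 2, then 1 (accepted) or 0 (rejected).
function extauth_response($authenticated)
{
    return pack('nn', 2, $authenticated ? 1 : 0);
}

$request = extauth_parse_request('auth:alice:someserver:bj6sc485t9s46o57qpngod5lm7');
$answer  = extauth_response(true);
```

Here the password field carries the session_id, so a real script would ask InspectorD about $request['password'] before answering.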

You can see the complete authentication script here

If you are not using memcache to store the session information then you can create a Python class that extends from InspectorD's SessionInspector class and implements the getData method. You can see an example on the SessionInspectorMemcache class.

I hope you find this useful, and don't hesitate to clone and improve the InspectorD source code.

NOTE: I wrote a similar script in PHP, but I found it somewhat harder to implement than using the InspectorD code. If you want to see that code, just ask in the comments and I will post it on github.

Thursday, January 21, 2010

Inspecting PHP sessions from Python

For one of our PHP projects we wanted to be able to inspect the PHP sessions from outside PHP. For example, we want to know the user's privileges at a certain moment, e.g. whether the user is logged in or not.

Why would you need that, you may ask?

Well, let's say that our symfony application stores the result of a cached action in Memcache, with two versions of the resulting HTML: one for logged in users and one for logged out ones. In that case we want to avoid loading symfony at all and return the HTML directly from Nginx. One of our devs wrote an Nginx module that does just that: it looks up a certain value in Memcache and, if it's found, returns the HTML immediately; otherwise it calls symfony to handle the request. The problem with this approach is that Nginx doesn't know if the user is authenticated or not, so it can't handle the case where we have two different versions of HTML output for one action. Well, until now...

Please welcome InspectorD, a Python daemon that can inspect PHP sessions.

InspectorD is a TCP server that understands a very simple text protocol: you ask it whether a certain session_id is authenticated and it replies 1 if it is, or 0 if it isn't.

Here's a sample session:

telnet localhost 3002
isauth oglnp9phvn8ac04obdqjk6dko3
0
isauth bj6sc485t9s46o57qpngod5lm7
1
isauth bj6sc485t9s46o57qpngod5lm7 oglnp9phvn8ac04obdqjk6dko3 n63o4uk297c49131dcdg0h7g72
1
0
1
quit
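
The same conversation could be driven from PHP with something like the sketch below. The function names are made up, and the CRLF line ending is an assumption on my side, based on what telnet sends. The host and port are taken from the sample session.

```php
<?php
// Builds the request line for one or more session ids.
function inspectord_build_command(array $sessionIds)
{
    return 'isauth ' . implode(' ', $sessionIds) . "\r\n"; // CRLF is an assumption
}

// Maps each session id to true/false from the reply lines ("1" or "0").
function inspectord_parse_reply(array $sessionIds, array $lines)
{
    $result = array();
    foreach (array_values($sessionIds) as $i => $id) {
        $result[$id] = isset($lines[$i]) && trim($lines[$i]) === '1';
    }
    return $result;
}

// Asks a running InspectorD instance about the given session ids.
function inspectord_is_auth(array $sessionIds, $host = 'localhost', $port = 3002, $timeout = 2)
{
    $socket = @fsockopen($host, $port, $errno, $errstr, $timeout);
    if (!$socket) {
        throw new RuntimeException("Can't connect to InspectorD: $errstr");
    }
    fwrite($socket, inspectord_build_command($sessionIds));
    $lines = array();
    foreach ($sessionIds as $unused) {
        $line = fgets($socket); // one reply line per session id
        if ($line === false) {
            break;
        }
        $lines[] = $line;
    }
    fwrite($socket, "quit\r\n");
    fclose($socket);
    return inspectord_parse_reply($sessionIds, $lines);
}
```

With the daemon running, inspectord_is_auth(array('bj6sc485t9s46o57qpngod5lm7')) would return an array keyed by session id. Any missing reply line is treated as not authenticated.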

The server is based on the Twisted framework and the PHPUnserialize module by Scott Hurring. In the latter I fixed the session_decode method since it wasn't working for me.

For installation instructions and usage see the github project page.

Comments and bug reports are welcome.

Tuesday, December 15, 2009

Erlang as Session Storage for PHP

In the last few days I've been playing with the PHP extension mypeb, which allows us to connect to Erlang from PHP. As a simple example to show what we can do with this extension I will create a PHP class that will be used as the session save handler for PHP. By default PHP stores the sessions in the file system, but if we want to share the sessions across several servers, then we have to resort to using a database or Memcached. I would like to try something different by using this class to interact with an Erlang node that will act as the in-memory storage for our sessions using ETS tables.

To replace the session save handler we have to call the function session_set_save_handler and provide six callbacks that will be used for the following actions: opening and closing a session, reading and writing the session data, destroying a session, and garbage collecting old sessions. You can read more about this function in the PHP manual.
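
Before going to Erlang, here is a rough idea of what each of the six callbacks is expected to do. The class below is a made-up, purely illustrative handler that keeps everything in a PHP array, so the data is lost at the end of the request. It is called by hand here instead of being registered with session_set_save_handler.

```php
<?php
// Illustrative handler that keeps sessions in a PHP array.
class ArraySessionHandler
{
    protected $sessions = array(); // session id => array('data' => ..., 'time' => ...)

    public function open($save_path, $session_name)
    {
        return true; // nothing to connect to
    }

    public function close()
    {
        return true;
    }

    public function read($session_id)
    {
        // Must return a string; an empty string when nothing is stored.
        return isset($this->sessions[$session_id]) ? $this->sessions[$session_id]['data'] : '';
    }

    public function write($session_id, $session_data)
    {
        $this->sessions[$session_id] = array('data' => $session_data, 'time' => time());
        return true;
    }

    public function destroy($session_id)
    {
        unset($this->sessions[$session_id]);
        return true;
    }

    public function gc($max_lifetime)
    {
        // Drop every session that was last written more than $max_lifetime seconds ago.
        $limit = time() - $max_lifetime;
        foreach ($this->sessions as $id => $session) {
            if ($session['time'] < $limit) {
                unset($this->sessions[$id]);
            }
        }
        return true;
    }
}

$handler = new ArraySessionHandler();
$handler->open('', 'PHPSESSID');
$handler->write('abc', 'user|s:3:"bob";');
$stored = $handler->read('abc');
$handler->destroy('abc');
$afterDestroy = $handler->read('abc');
$handler->close();
```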

Let's start by creating the open callback. In our example, we will have a method ErlangSessionHandler::open that will connect to the Erlang node.

public function open($save_path, $session_name)
{
    // Connect only once; reuse the link on later calls.
    if(null === $this->link)
    {
        $this->link = peb_connect($this->host, $this->erlang_cookie, $this->conn_timeout);
        if(!$this->link)
        {
            throw new Exception(sprintf("Can't connect to the erlang node %s using erlang_cookie %s", $this->host, $this->erlang_cookie));
        }
    }
    return $this->link;
}

There we use the function peb_connect that expects three parameters, the host to connect to, the Erlang secret cookie and an optional connection timeout. This function will return a resource identifier of the connection or false on failure. For our basic example we will define those three parameters as members of the class ErlangSessionHandler like this:

protected $host = 'server@127.0.0.1';
protected $erlang_cookie = 'ABCDEFGHI';
protected $conn_timeout = 5;

The method ErlangSessionHandler::close is very straightforward:

public function close()
{
    if(is_resource($this->link))
    {
        peb_close($this->link);
    }
}

When called it will close the connection to the Erlang node by calling peb_close, passing the resource identifier as a parameter.

Then we have ErlangSessionHandler::read:

public function read($session_id)
{
    $x = peb_encode("[~s]", array(array($session_id)));
    $result = peb_rpc("session_handler", "read", $x, $this->link);
    $rs = peb_decode($result);
    $data = $rs[0];
    // An array here means an error on the Erlang side; return an empty session.
    return is_array($data) ? '' : $data;
}

This method will be passed the $session_id, which we will forward to the Erlang node. To accomplish that, first we need to create an Erlang message by calling the function peb_encode, which expects a format string and the value we want to encode into that format. In our case we need a list which will contain our session id as its only element. Once we have encoded the variable we send it to Erlang by calling peb_rpc. This function works similarly to the Erlang rpc:call function. We need to specify the module and the function to call as the first two parameters. The third parameter is the message we want to send, and the last parameter is the resource identifier of the connection. This function will return the result of the RPC call or false on error. The session information will be the first element of the $rs variable. In case of an error on the Erlang side, $data will be an array instead of a string, which is why we return an empty string in that case. Take into account that the session read callback must return an empty string when there is no session information for the provided id.

Now let's see the code for the ErlangSessionHandler::write method.

public function write($session_id, $session_data)
{
    $x = peb_encode("[~s, ~s]", array(array($session_id, $session_data)));
    $result = peb_rpc("session_handler", "write", $x, $this->link);
    unset($result);
    return true;
}

This method expects two parameters, the session id and the information to store. The code here is pretty similar to the one for ErlangSessionHandler::read. We encode the PHP variables as Erlang terms and we send them to the session server via peb_rpc.

Session destroy is also similar to the implementation of read, but we call peb_rpc("session_handler", "destroy", $x, $this->link); instead of "read":

public function destroy($session_id)
{
    $x = peb_encode("[~s]", array(array($session_id)));
    $result = peb_rpc("session_handler", "destroy", $x, $this->link);
    unset($result);
    return true;
}

The code for ErlangSessionHandler::gc is also simple:

public function gc($max_expire_time)
{
    $x = peb_encode('[~i]', array(array($max_expire_time)));
    $result = peb_rpc("session_handler", "gc", $x, $this->link);
    $rs = peb_decode($result);
    return $rs;
}

Then, to use our class as the session handler, we add this to our PHP code:

$sh = new ErlangSessionHandler();

session_set_save_handler(
    array($sh, "open"),
    array($sh, "close"),
    array($sh, "read"),
    array($sh, "write"),
    array($sh, "destroy"),
    array($sh, "gc")
);

session_start();

Then the final piece of the puzzle is to start the Erlang Session Server that is implemented in the file session_handler.erl.

$ erl -sname server
(server@localhost)1> c(session_handler).
(server@localhost)1> session_handler:start().

And that's it. We can start playing with our Erlang Session Storage Server.

NOTE:

First I want to make clear that this code is not meant to be used in production systems. It is just an example of what can be done with the mypeb extension.

I'm planning to write a more robust session server in Erlang using the Mnesia database as a way to provide more reliable storage. With Mnesia we can easily distribute the session data across multiple servers, and on some of them store the sessions to disc.

Regarding the session save handler code, I would like to port it into the mypeb extension as native C code along with some php.ini settings that can provide the Erlang node to connect to, the secret cookie, connection timeout, etc.

As a final step I would like to do a small clean up to the API of the mypeb extension.

Tuesday, September 2, 2008

We started something!!

It seems that our benchmarking example has pushed people to do their own useful benchmarks.
You can check this framework benchmarks page and try to guess what they are actually benchmarking. If you can find the point of that benchmark, please drop some comments here, because I want to sleep with ease tonight.

So I want to leave here just a few remarks about this kind of stuff:
  1. Stop benchmarking your just created framework against symfony, Zend Framework, Cake, or whatever.
  2. When are we going to realize that the point of a framework is not to run as fast as assembly code, but to improve developer productivity and save money in developer time?
  3. I'm not pissed off, I just can't get the point of those benchmarks.  
If you have more examples of this kind of useless benchmark, please add them to the comments.

Monday, September 1, 2008

Benchmarking die("Hello world!"); VS. exit("Hello world!"); VS. echo "Hello World!"; on PHP

After doing some useful, problem-solving benchmarking with Siege we can scream to every corner of the world that exit() is faster than die() and echo.

Show me the facts! I hear you screaming. Let there be facts!:

echo "Hello world!";
Transactions:         250 hits
Availability:      100.00 %
Elapsed time:        7.10 secs
Data transferred:        0.00 MB
Response time:        0.01 secs
Transaction rate:       35.21 trans/sec
Throughput:        0.00 MB/sec
Concurrency:        0.20
Successful transactions:         250
Failed transactions:           0
Longest transaction:        0.05
Shortest transaction:        0.00


die "Hello world!";
Transactions:         250 hits
Availability:      100.00 %
Elapsed time:       10.04 secs
Data transferred:        0.00 MB
Response time:        0.01 secs
Transaction rate:       24.90 trans/sec
Throughput:        0.00 MB/sec
Concurrency:        0.17
Successful transactions:         250
Failed transactions:           0
Longest transaction:        0.04
Shortest transaction:        0.00


exit "Hello world!";
Transactions:         250 hits
Availability:      100.00 %
Elapsed time:        6.05 secs
Data transferred:        0.00 MB
Response time:        0.01 secs
Transaction rate:       41.32 trans/sec
Throughput:        0.00 MB/sec
Concurrency:        0.27
Successful transactions:         250
Failed transactions:           0
Longest transaction:        0.04
Shortest transaction:        0.00


The previous tests were performed simulating 25 concurrent users with 10 repetitions each. Now just imagine for one minute (or two if you need more) what happens if 20,000 concurrent users hit your echo "Hello World" website! That will be a mess! Just by imagining this scenario I can't stop hearing the sirens in my mind, so please, grep through your code and preg_replace() all those echo "Hello world!" you may have there!