Friday, April 15, 2011

It's so quiet here. Why?

I forgot to mention that I've moved my blog here:

I'm also writing a book about RabbitMQ, check it out here:

Thanks for watching!

Tuesday, July 13, 2010

A note about OOP reusability

I was reading the first chapter of the book Practical Clojure, when I found this paragraph comparing OOP code vs. Functional Programming code:

"It [OOP] encourages a high degree of ceremony and code bloat. Simple functionality in Java can require several interdependent classes. Efforts to reduce close coupling through techniques like dependency injection involve even more unnecessary interfaces, configuration files, and code generation. Most of the bulk of a program is not actual program code, but defining elaborate structures to support it."

Which takes me back to my thoughts about the verbosity of OOP code vs., say, Haskell code. If we need so many things to make OOP code work or produce something useful, are we sure this is the right programming paradigm?

Tuesday, May 18, 2010

About Haskell Static Typing

From time to time I re-read some random chapter of Real World Haskell. Last night I picked chapter two and found this interesting paragraph about static typing:

"A helpful analogy to understand the value of static typing is to look at it as putting pieces into a jigsaw puzzle. In Haskell, if a piece has the wrong shape, it simply won't fit. In a dynamically typed language, all the pieces are 1x1 squares and always fit, so you have to constantly examine the resulting picture and check (through testing) whether it's correct."

This brings to mind all the is_a* tests that I've seen in PHP code.

Wednesday, April 21, 2010

A Word About Caching: Memcached and APC

Sometimes when talking with developers I see that there are some misconceptions regarding these two caching systems. That's why I'd like to share some concepts I've learned along the way.

What does APC do?

APC has two main features. The first is caching the PHP opcodes to speed up page delivery, which means that our PHP code doesn't have to be parsed every time it gets executed. The second is caching user variables: you can store the result of expensive_function() in APC with apc_store, for example, and later get the value back with apc_fetch without executing expensive_function() again.
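The user-variable side is the usual fetch-or-compute pattern. Here is a minimal sketch of it in Python, where a plain dict stands in for APC's user cache (an assumption: no TTL, no eviction) and expensive_function is a made-up placeholder. In PHP the two marked steps would be apc_fetch and apc_store.

```python
import time

user_cache = {}  # stand-in for APC's user cache (assumption: plain dict, no TTL)
calls = 0        # counts how many times the expensive work really runs


def expensive_function():
    """Hypothetical slow computation."""
    global calls
    calls += 1
    time.sleep(0.01)
    return sum(i * i for i in range(1000))


def cached_expensive_function(key="expensive_function"):
    # Equivalent of apc_fetch: look in the cache first.
    if key in user_cache:
        return user_cache[key]
    # Cache miss: compute once, then do the equivalent of apc_store.
    value = expensive_function()
    user_cache[key] = value
    return value


first = cached_expensive_function()
second = cached_expensive_function()
```

After both calls, expensive_function has run only once; the second call is served from the cache.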

A common technique employed by frameworks like symfony is to parse some configuration files, like YAML ones, and then var_export them into a PHP file. Something like:

$some_config = array('value_a', 'value_b');

So later the framework just includes that file and avoids parsing the YAML file again. This technique is just file caching, but if you happen to have APC enabled, you also benefit from the fact that the file's opcodes are cached by APC. So:

Misconception #1:

The fact that APC caches the opcodes doesn't mean that loading whatever the file contains is cheap. For example, say the framework caches a big array of configuration values. That array still has to be rebuilt in memory on every request; AFAIK that can't be avoided. So having the opcodes cached by APC doesn't mean that reloading the file comes for free.
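A rough analogy in Python (not APC itself, just the same idea): compiling the source once plays the role of the opcode cache, yet every execution of the compiled code still builds a brand new array. The file name and the config values are placeholders.

```python
source = "config = ['value_a', 'value_b']"

# Parse and compile once: this is the part an opcode cache saves us.
code = compile(source, "cached_config.py", "exec")


def load_config():
    """Run the already-compiled code; the list is rebuilt on every call."""
    namespace = {}
    exec(code, namespace)
    return namespace["config"]


a = load_config()
b = load_config()
same_content = a == b      # both loads yield the same values
same_object = a is b       # but each load allocated its own list
```

The two loads return equal but distinct lists, so the cost of building the data is paid each time even though parsing was skipped.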

And what about Memcached?

Misconception #2:

Another misconception I've heard is that because Memcached is fast (people compare its speed with hell's), nothing can beat it. That doesn't mean it is faster than reading something from memory local to the PHP process, which is the case with APC. Every time you retrieve a value from Memcached, the client has to perform a TCP round trip to get it, on top of opening a TCP connection to the server on each request if you are not using persistent connections.

So while Memcached is fast, I wouldn't recommend using it for small values that are frequently accessed. APC can work just fine there (taking into account the size of the values, of course). So in the case of symfony, we use Memcached to cache view templates, results from queries, and things like that. But for the routing generation calls, like the ones for url_for or link_to, I would prefer APC, which is what we use for the routing configuration.

Keep in mind that this is based on my experience, so take it for what it is. Of course, I'd love to read your comments about this topic here.

Monday, March 8, 2010

Erlang as a Fast Key Value Store for PHP

In this post I want to show you one of the neat things that can be done with the PHP-Erlang Bridge extension: a key-value store.

Erlang comes packed with a key-value store in the form of the ETS module. This database is pretty fast and efficient for storing Erlang terms in memory.

I tried a proof of concept with the PHP extension and obtained impressive results: storing 150,000+ items in ETS in 1 second! All that running on my MacBook Pro.

What I did was write a PHP class wrapping the calls to the Erlang ETS module. For example:

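public function insert($key, $value)
{
    // Build the argument list [Table, {Key, Value}] for ets:insert/2.
    $x = peb_encode("[~a, {~a, ~s}]", array(array(
        $this->table, // assumed: property holding the ETS table name, e.g. "test"
        array($key, $value)
    )));
    $result = peb_rpc("ets", "insert", $x, $this->link);
    return peb_decode($result);
}

Maps to this call in Erlang:

> ets:insert(tablename, {key, value}).

You can see the full code example here.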

So here are the steps:

- Install the PEB extension from source
- Start Erlang with this command: erl -sname node -setcookie abc
- Create the ETS table: ets:new(test, [set, named_table, public]).
- Save the gist to a file and run it: php ets.php

While this is a very simple proof of concept, I just wanted to illustrate some of the cool things that can be done with this extension. For example, the speed of encoding/decoding between Erlang and PHP is pretty decent, as is the communication speed.

Please let me know your thoughts about it in the comments.

Tuesday, February 9, 2010

Meeting With Francesco Cesarini and the ECUG

Last week I was invited by the Erlang China User Group to meet Francesco Cesarini from Erlang Solutions, who was in Shanghai. Since he's the author of the Erlang Programming book by O'Reilly, this was an amazing opportunity to learn more about Erlang from someone with real-world experience in the field.

The guys from the ECUG picked a nice Chinese restaurant where we shared our experiences with Erlang from our different points of view.

I had my share of questions about topics such as Riak, Mnesia, RabbitMQ, Ejabberd and whatnot. It was nice to learn how big the Erlang world is in the enterprise, how it is used for serious matters such as banking and item traceability, and of course about all the other features of Erlang, like reliability, performance, etc.

One important topic on which we all agreed was how the language gap between English and Chinese produces two phenomena: it can slow down Erlang's adoption in China, and at the same time it keeps the rest of the world unaware of what Chinese companies are doing with Erlang.

Luckily this will start to change, since the Erlang Programming book is about to be released in Chinese and the guys from the ECUG are translating some of the Erlang documentation to Chinese.

Besides that, I plan to give my 2 cents by writing some blog posts in English about the Erlang movement here in China. We have also talked with the guys from the ECUG about holding some conferences about RabbitMQ and other Erlang products that we use at the company I work for.

To end my post I'd like to share a couple of pictures from our meeting:

Xihe Yu with Francesco Cesarini

The gang with Francesco Cesarini

Monday, February 1, 2010

Sharing Sessions Between PHP and Ejabberd

In my last post I wrote about a pet project I started to share sessions between PHP and Python. In this post I want to show you how we can share the sessions between PHP and Ejabberd.

So here's the problem. In one of the projects where we want to use XMPP we have a user database of around 2.5 million users. We want those users to be able to log in to our Ejabberd server using the same database. This means that every time a user logs into our site, we will query the database with PHP to see if he's allowed to log in, and then Ejabberd will query the database again for the same purpose. Now, since the user is already authenticated in our PHP app, why don't we just share the session information with Ejabberd? Here's where InspectorD comes into play.

The first piece that we will use to solve this problem is Ejabberd's external authentication scripts. In our case, instead of authenticating against a database, we will use InspectorD to check whether a user is authenticated in our website. To do this we need to find some means of passing PHP's session_id to our auth script. How to do this?

In PHP there's a function called session_id() that returns the current session_id key. We will use this string as a user password for Ejabberd, so for example, using Strophe we can do something like this:

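// nickname and sessionId are placeholders for values rendered into the page by PHP;
// sessionId holds the string returned by session_id().
connection = new Strophe.Connection(BOSH_SERVICE);
connection.connect(nickname + '@someserver', sessionId, onConnect);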

Then Ejabberd will call our external authentication script, passing that nickname and the session_id as password. In our case we store the session information in Memcache, so our script will use the class SessionInspectorMemcache from the InspectorD library. This class will connect to the session Memcache and from there retrieve the session information belonging to that session_id. Finally it will return True or False depending on whether the user related to that session_id is authenticated or not.
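To give an idea of the moving parts, here is a minimal sketch of such an external authentication loop in Python. It relies on ejabberd's extauth framing as I understand it: each request arrives as a 2-byte big-endian length followed by a string like auth:user:server:password, and the script answers with four bytes (length 2, then 1 or 0). The SESSIONS dict and is_authenticated are hypothetical stand-ins for the lookup that SessionInspectorMemcache does; this is not the actual script.

```python
import io
import struct

# Hypothetical session store (session_id -> session data); the real script
# asks SessionInspectorMemcache instead.
SESSIONS = {"abc123": {"nickname": "alice", "authenticated": True}}


def is_authenticated(user, session_id):
    """True if session_id belongs to an authenticated session of user."""
    session = SESSIONS.get(session_id)
    return bool(session and session["nickname"] == user
                and session["authenticated"])


def read_request(stream):
    """Read one length-prefixed request; return its fields, or None at EOF."""
    header = stream.read(2)
    if len(header) < 2:
        return None
    (size,) = struct.unpack(">H", header)
    return stream.read(size).decode("utf-8").split(":", 3)


def write_result(stream, ok):
    """Write the 4-byte answer: length 2, then 1 (accept) or 0 (reject)."""
    stream.write(struct.pack(">HH", 2, 1 if ok else 0))
    stream.flush()


def serve(stdin, stdout):
    """Answer requests until the input stream is closed."""
    while True:
        fields = read_request(stdin)
        if fields is None:
            break
        if fields[0] == "auth" and len(fields) == 4:
            _, user, _server, password = fields
            write_result(stdout, is_authenticated(user, password))
        else:
            write_result(stdout, False)  # other operations: rejected in this sketch


def frame(text):
    """Build a request the way the sketch expects to receive it."""
    data = text.encode("utf-8")
    return struct.pack(">H", len(data)) + data


# Demo with in-memory streams: one valid session_id, one unknown one.
demo_in = io.BytesIO(frame("auth:alice:someserver:abc123")
                     + frame("auth:alice:someserver:wrong"))
demo_out = io.BytesIO()
serve(demo_in, demo_out)
```

When run under Ejabberd, the same loop would be started as serve(sys.stdin.buffer, sys.stdout.buffer) instead of using the in-memory demo streams.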

You can see the complete authentication script here.

If you are not using Memcache to store the session information, you can create a Python class that extends InspectorD's SessionInspector class and implements the getData method. You can see an example in the SessionInspectorMemcache class.
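As an illustration, here is a rough sketch of what such a class could look like for PHP's default file-based sessions, which are stored as sess_<session_id> files under session.save_path. The SessionInspector base class below is only a stand-in defined for this example, and the getData signature and return value are assumptions; check InspectorD's source for the real interface. SessionInspectorFiles is a hypothetical name.

```python
import os
import re
import tempfile


class SessionInspector:
    """Stand-in for InspectorD's base class (assumed interface, not the real one)."""

    def getData(self, session_id):
        raise NotImplementedError


class SessionInspectorFiles(SessionInspector):
    """Hypothetical inspector for PHP's file-based session handler."""

    def __init__(self, save_path):
        self.save_path = save_path

    def getData(self, session_id):
        # Accept only characters PHP uses in session ids, so the id
        # cannot be used to escape save_path.
        if not re.fullmatch(r"[A-Za-z0-9,-]+", session_id):
            return None
        path = os.path.join(self.save_path, "sess_" + session_id)
        try:
            with open(path, "r") as handle:
                return handle.read()  # raw serialized session, not decoded here
        except FileNotFoundError:
            return None


# Demo: a temporary directory stands in for session.save_path.
with tempfile.TemporaryDirectory() as save_path:
    with open(os.path.join(save_path, "sess_abc123"), "w") as handle:
        handle.write('nickname|s:5:"alice";')
    inspector = SessionInspectorFiles(save_path)
    found = inspector.getData("abc123")
    missing = inspector.getData("nope")
    rejected = inspector.getData("../etc/passwd")
```

The sketch returns the raw session string; decoding PHP's session serialization format is left out on purpose.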

I hope you find this useful, and don't hesitate to clone and improve the InspectorD source code.

NOTE: I wrote a similar script in PHP, but I found it somewhat harder to implement than using the InspectorD code. If you want to see that code, just ask in the comments and I will post it on GitHub.