February 6th 2010

HipHop for PHP: First look

Just this Tuesday Facebook announced an ambitious project called “HipHop for PHP”. If you missed it, general opinion says you have been coding PHP in a cave. As I write this review no code has been posted yet, but Facebook has made a great move in deciding to open source the project so we can all get our hands on it, use it and contribute to it. Since the code is not out there yet, this is literally a first-impression article, based on the presentation made by Facebook and on various posts from core PHP developers who got a look at the technology before the release.

What is it?

To be blunt, it’s a PHP to C++ code transformer (compiler). But that does not do it justice, so let’s look deeper. Those of you who know PHP intimately understand the process behind running PHP code. It goes like this:

PHP Code –compiler–> Opcodes –Zend Engine–> Execution

Opcode caching solutions generally store the opcodes and reuse them instead of parsing and compiling the source on every request. What HipHop does is completely different, and it surprised quite a few people who had tried to guess what Facebook was working on. In general terms, this is the (simplified) process:

PHP Code –parser–> C++ Code –g++–> Compiled binary

Historically PHP has always been executed on the Zend Engine, the heart of PHP since PHP 4. In this solution the role of the Zend Engine is taken over by the HipHop Runtime Engine, a reimplementation that works with C++ code generated from the original PHP code instead of with opcodes.

Why HipHop?

It’s a well-known fact that code written in C runs faster than PHP code, for obvious reasons. It is very common for large PHP applications to port part of their codebase to C and package it as an extension; Yahoo and even PHP projects like Doctrine have done so. Performance of simple operations can increase by hundreds of percent, depending on load and usage.

This is the premise for Facebook’s project. They have long contributed to APC and PHP to get more performance out of their code, but with the increased load of billions of pages served it was not enough, so they decided to tackle the problem head on. One of the options on the table was to move to another language altogether, but this is where PHP shines: Facebook declared that PHP is simply a great solution because new programmers can quickly get up to speed and start developing in it, thanks to its simplicity. That, and the fact that their code base consists of more than a million lines of code, made them decide that switching was not a solution, and thus HipHop started.

How does it work?

The idea is that PHP code can be divided into “mundane” and “magic” code. Mundane code consists of basic operations that map directly to C++ functions. Once converted to C++, this code can be executed with much higher performance, while the magic code, which is the really complex code to convert, would run at equal or slightly lower speeds. This is the point that determines whether your application can benefit from HipHop: is it more mundane than magic?

If your answer is yes, then you may want to look into it. The converter does a lot of processing, identifying dependencies, performing static analysis and other operations to get to the basic code. It then has to take care of the problematic issue: typing. PHP is a weakly typed language, meaning variables can juggle between various types. Behind the scenes the Zend Engine implements the zval type, which can basically store anything. In the C++ code the new variables are typed, so the converter needs to work all this out in its type inference step. The project’s lead engineer, Haiping Zhao, stated that one of the solutions was to map zvals to a C++ Variant type whenever it’s impossible to determine a specific type (failed type inference), or when a variable changes type during the script. Only after all this analysis is the code finally generated.

This code is then compiled against the HipHop Runtime, which, as I said, plays the role of the Zend Engine but works with specialized types instead of the Zend Engine’s abstract types. With the binary in hand, it can be run straight from the command line or behind a web server interface, as it is compatible with the libevent library. Facebook has also written a very simple web server to front its compiled code, replacing Apache for calls to this code (as far as the information goes, they proxy PHP traffic to this server and leave other resources going through Apache).
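
To make the typing problem above a bit more concrete, here is a minimal C++ sketch of the difference between code whose types could be inferred and code that has to fall back to a variant. This is not HipHop’s generated output or its runtime API (no code has been released at the time of writing). The Variant struct and the to_double, add, sum_typed and sum_variant names are all made up for illustration, and the conversion rules are a loose simplification of PHP’s type juggling.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <string>

// Hypothetical stand-in for a "can hold anything" value, in the spirit of a
// zval: every operation has to look at the tag before it can do any work.
struct Variant {
    enum Kind { Int, Double, String };
    Kind kind = Int;
    std::int64_t i = 0;
    double d = 0.0;
    std::string s;

    static Variant of_int(std::int64_t v) { Variant x; x.kind = Int; x.i = v; return x; }
    static Variant of_double(double v) { Variant x; x.kind = Double; x.d = v; return x; }
    static Variant of_string(const std::string& v) { Variant x; x.kind = String; x.s = v; return x; }
};

// Simplified numeric conversion (assumption: not PHP's exact rules).
double to_double(const Variant& v) {
    switch (v.kind) {
        case Variant::Int:    return static_cast<double>(v.i);
        case Variant::Double: return v.d;
        case Variant::String: return std::strtod(v.s.c_str(), nullptr);
    }
    return 0.0;
}

// "+" on variants: the tags are checked at run time on every call.
Variant add(const Variant& a, const Variant& b) {
    if (a.kind == Variant::Int && b.kind == Variant::Int) {
        return Variant::of_int(a.i + b.i);
    }
    return Variant::of_double(to_double(a) + to_double(b));
}

// What a loop can become when every variable is inferred to be an integer:
// plain native arithmetic, no tags, no conversions.
std::int64_t sum_typed(std::int64_t n) {
    assert(n >= 0);
    std::int64_t total = 0;
    for (std::int64_t k = 1; k <= n; ++k) {
        total += k;
    }
    return total;
}

// The same loop when inference fails and everything stays a Variant.
Variant sum_variant(std::int64_t n) {
    Variant total = Variant::of_int(0);
    for (std::int64_t k = 1; k <= n; ++k) {
        total = add(total, Variant::of_int(k));
    }
    return total;
}
```

Calling sum_typed(100) and sum_variant(100) both yield 5050, but the second version pays for a tag check and a temporary object on every iteration, which is the kind of overhead type inference is meant to remove.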

The good and the bad

Good: Programmers can continue coding in PHP with no slowdowns. They still have PHP’s ease of operation: code, run, see, fix, run, see, with no need to recompile. Compilation only happens for production code, and unfortunately it is a slow process. The final result is one large binary, a true binary that can be executed and that maps to one process with multiple threads. This is interesting for other scaling topics; for example, it means you have one DB connection and not multiple.

Bad: It’s compatible only up to PHP 5.2, existing PHP extensions need to be converted to be compatible, and then there is compilation time. With the market’s overwhelming move to 5.3 and the incredible features present in it, having to fall back on 5.2 (an earlier 5.2 version, not the latest) can really be a downside to the whole thing. Also, PHP extensions that are written in C and are not thread safe need to be rewritten in C++ to be compatible. Facebook has converted a few, but there are lots of extensions out there and we might need more than a few. The compilation process is long, so fixing a bug on a live production app is not as simple as fix, test, deploy, done; code must be recompiled and deployed. That is just fine if your QA processes are spotless, but in most cases you will run into delays due to compilation.

Not supported? Some pieces of PHP are not and probably never will be supported, like eval(), create_function() and preg_replace() with the /e modifier. These functions won’t be missed if you like clean, quality code, but some templating systems, like Smarty, rely on them, so that’s not good news for them.

Result? Well, Facebook has one advantage here: this is not an “experiment” or a theoretical project. It’s currently being used massively on their code base, so it works. Facebook reported a 50% reduction in CPU usage on their servers, which is the equivalent of doubling your pool of servers. Really impressive results.

What’s coming down the pipe? Current plans include PHP 5.2.12 support, followed by PHP 5.3 and support for running this inside Apache (mod_hiphop?). Timeframes for this are still undeclared.
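
Why would these three be so hard to support? All of them execute PHP source that only exists as a string at run time, which an ahead-of-time compiler has no chance to translate. The usual way around it is a callback that is ordinary code known at compile time (in PHP, preg_replace_callback() instead of the /e modifier). The C++ sketch below only illustrates that idea; replace_callback and shout are hypothetical names, not part of HipHop or PHP.

```cpp
#include <cassert>
#include <cctype>
#include <functional>
#include <regex>
#include <string>

// Hypothetical helper: replace every regex match with the result of a
// callback. The callback is regular compiled code, so nothing has to be
// evaluated from a string while the program runs.
std::string replace_callback(
        const std::string& subject,
        const std::regex& pattern,
        const std::function<std::string(const std::smatch&)>& fn) {
    assert(static_cast<bool>(fn));  // the callback must be callable

    std::string out;
    std::string::const_iterator last = subject.begin();
    std::sregex_iterator it(subject.begin(), subject.end(), pattern);
    std::sregex_iterator end;
    for (; it != end; ++it) {
        const std::smatch& m = *it;
        out.append(last, m[0].first);  // copy the text before this match
        out += fn(m);                  // insert the computed replacement
        last = m[0].second;
    }
    out.append(last, subject.end());   // copy whatever follows the last match
    return out;
}

// Example callback: upper-case the first capture group.
std::string shout(const std::smatch& m) {
    std::string word = m[1].str();
    for (char& c : word) {
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
    return word;
}
```

With these definitions, replace_callback("hello :name, meet :friend", std::regex(":(\\w+)"), shout) returns "hello NAME, meet FRIEND". Everything that runs was compiled ahead of time, which is exactly what the /e modifier cannot offer.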

Is HipHop for you?

Among the various articles around the web, Terry Chay’s does a great job of helping you decide whether this is something you need to look into. In general I must say that if you can run your application on two servers or fewer, keep going, this is not for you. If you host or code apps that will live on hosted services, then this is still not for you; even though some providers, like Server Groove, have already pointed out they intend to look into supporting it, it’s still shaky ground. Also, if your application is more “magic” code than “mundane” code, you are still better off with PHP.

Conclusion

HipHop is an amazing concept, and its complexity is enough to leave you in awe of the team responsible for it. It is definitely not a solution for most of the PHP-related market, apps and developers; most reviews I have seen state it’s not for 99.9% of the code out there. I do think it will grow and evolve quite a bit once it is open to the community. Its open source nature will be a generous boost, and this has been by far one of the greatest moves by Facebook and something I really respect in their work.

I was quite refreshed to see a truly innovative move when all the outside media had placed their bets on a JIT compiler or a rewrite of the language. It’s a solution that holds on to one of PHP’s advantages, its simplicity, and still brings a new area of performance gains to be explored by the community. It also puts some end to the various discussions on PHP and performance, since PHP can now produce high-performance code comparable to the likes of Java and C#. In short, it takes a scripting language and promotes it to machine code.

I will wait for the code to hit GitHub so our team can delve further into the inner workings and run it against things like Zend Framework and Symfony, cornerstones of most applications out there. If it can’t support them, its market space is restricted.

Another interesting topic to watch is fragmentation. How will the PHP community react to this? Compatibility will surely be an issue: some PHP features will not be supported in HipHop and vice versa. Such a split can weaken the language, but if this is done as more of a “joint operation”, PHP will rise to new levels and embrace a greater audience.


October 22nd 2008

ZendCon08 in review

This article was originally posted by me to the MIH/SWAT Blog, as “SWAT at ZendCon 2008”.

This year SWAT marked its first year of presence at the Zend/PHP Conference, also known as ZendCon. Neil Broers and I were chosen for the mission of going to ZendCon and bringing back all we could learn.

Instead of going into a day-by-day testimonial of the conference, which only makes sense during the event, and which I have already done on my personal blog (www.rafaeldohms.com.br/en), I will turn this post into an analysis of the events, the trends set at the conference, and what it will mean for PHP in 2009.

This year’s ZendCon had a simple sub-title, or motto, “High Impact PHP”. It makes reference to PHP’s participation in the Enterprise market, not just the impact on small companies and freelance programmers, but the impact on the big, high volume companies. This goes hand in hand with last year’s Call for Action in the opening keynote of ZendCon’07: Take PHP to the Enterprise!

Big Players like Orange UK, Zero9 and Bell/Textron have rolled out PHP solutions to deal with their high demand systems – some replacing Java, some building on top of it. And that’s what PHP is here for: to enable web interfaces for corporate systems. You don’t have to do it all in PHP, you can integrate your PHP code with other solutions, and create flexible and high performance web interfaces.

More than ever before we are seeing a buzz around PHP. Companies like IBM, Microsoft, Oracle and Adobe are announcing partnerships with Zend, and are adding their contributions to the PHP Core. This indicates a big change – instead of trying to crush PHP, or ignoring it, companies are integrating it into their products. It is good news for all the developers out there, who can now count on more reliable and faster drivers for database solutions and communication protocols (such as AMF for Flex).

PHP has reached the enterprise, just as we saw happening with MySQL. As was stated by Harold Goldberg in his “Call to action”, let’s not ask “Why PHP?” anymore, let’s ask: “How PHP? When PHP? Where PHP? Go and Evangelize PHP Deeper into the Organization”.

Zend contributed greatly to this move to the enterprise. During the course of the last few years we have seen Zend step up and give back to PHP. The Zend Framework is fast on its way to becoming an industry standard, the Zend Studio IDE is rapidly improving and raising the levels of productivity for developers around the world. Zend has to continue evangelizing PHP and offering tools to Enterprise customers, tools such as those included in the current Zend Platform.

The conference sessions mostly focused on the trend to Enterprise, as well as a few other trends. Of course we had a few vendor sessions – it was a Zend Conference after all. But the community really stepped up with interesting sessions. Some recurring trends we could see by just looking at the schedule were: Performance, Testing, integration with other languages and components, and the Zend Framework.

Performance talks are inevitable with the current movement into the Enterprise world. High end users inevitably mean high demand and traffic. Various panels suggested strategies that went beyond just PHP, as scaling must be done on the whole project, of which PHP forms just a piece of the big picture. Unfortunately one hour sessions are just not enough and we did not get into MogileFS and other OS level discussions which would have been very interesting. In general the Database seems to be everyone’s bottleneck, and Jay Pipes gave an interesting session and tutorial on SQL optimization, based on MySQL Databases. Most if not all sessions on Performance were based on Case Studies: Mozilla, Ning, Oracle, Bell Canada.

The maturing of SVN and its increasing use in PHP projects, distributed development teams, never-ending beta cycles and the rise of frameworks have made Testing and Code Analysis white hot topics at any conference. ZendCon was no exception, with many “test” centred sessions. Worthy of note were the sessions by the eZ Components crew on Test Driven Development and Continuous Integration, where the need for SVN versioning and unit tests was highlighted.

As the old saying goes “If you can’t beat them, join them”. In many situations PHP just won’t do, that’s obvious to us, so instead of forcing its way into those areas and beating other languages and systems, PHP has learnt to adapt and integrate with other Technologies. Since we are so focused on the enterprise, it makes sense that more and more we see sessions describing the use of PHP for integrating larger systems or foreign applications on the web. This is one point where all the partnerships begin to make sense, such as those with IBM and the fact that PHP is currently being used to take green-screen applications to the web, hence the support for i5 and DB2. Microsoft wants in as well, with support for better MSSQL handling and interaction with ASP.NET. Lastly we should not ignore Adobe’s Flex and the AMF protocol increasingly being supported by the Zend Framework.

Lately we have seen the Ruby language repeatedly coming up for discussion, especially with the “insurrection” surrounding the Ruby on Rails framework, which has to make you think about how much a framework can affect a programming language’s environment and penetration. Of late the Zend Framework is on the way to becoming an industry standard, even though as with Rails the phrase “use with parsimony” comes to mind. Many different sessions showed best practices and examples of where ZF makes developers’ lives easier and lowers obstacles to productivity.

The atmosphere at the event was electric. Many different companies, such as GitHub and Parallels, came to show their products, but a good many were there to show off their work and actually look for new employees. Zend’s team was constantly available for questions. One topic where I found myself actually giving answers was the Brazilian Open Source movement. It seems these markets are getting more and more attention from Zend and other companies, so we might see some good news in this area.

The massive presence of key PHP developers was amazing and added a lot to the trend setting described above. They also took part in a roundtable discussion on PHP 5.3, which gave users a chance to get a sense of what the next version of PHP will hold.

So what can we take from this? Zend is stepping up, the community is right in the middle of its crosshairs, and PHP has taken a huge step forward.

And the future? 2009 is going to be an interesting year. Right after ZendCon, iBuildings announced the creation of a PHP Center of Expertise in the Netherlands, driven by Cal Evans himself, the man behind DevZone. This is just more proof that next year will be awesome for PHP, we will continue to see it mature with version 5.3 just around the corner and break even further into the Enterprise world and the “web 2.0” world (where it is already a huge player).

Ladies and Gentlemen, this is the time for the PHP Community to come forth and continue taking PHP to the next level. Don’t just develop in PHP: get active in the community, find your local user group, publish articles, and always think outside the box.
