PHP7 is out. This isn’t news. It’s been out since last December, with nine minor revisions since then. What’s new is that it’s serving all of Wayfair’s customer-facing traffic.
Performance-wise, PHP7 is the rocket ship people said it would be. We’re nothing but pleased. If you can upgrade, you should do it yesterday. With barely any changes to our code, pages render in about half the time. CPU utilization dropped by about 30%.
Part of why we’re so pleased is that we haven’t had to roll anything back since flipping the switch. For context, Wayfair’s PHP presence is 3.5M LoC over 28K files. Coding conventions span several versions of PHP. There’s a similarly diverse range of third-party packages (Composer and otherwise). We also use 66 PHP extensions – that is, C/C++ – including some forked third-party and some totally custom code. We serve 30M+ unique visitors a month from all around the world.
In a way, upgrading to PHP7 was less about the language and more about testing a complex codebase. Testing is tricky with dynamically-typed languages. At the risk of stating the obvious, there is no compile step to catch mistakes, so nearly every error will be a runtime error. This is great for fast development, but makes it difficult to ensure code “correctness.”
Here’s a brief guide to upgrading, as we experienced it. YMMV.
A first step was to get a lay of the land. Use PHP built-ins (phpinfo() or `php -i`) to find out which extensions you rely on and how they’re configured. Check out the official migration guide and the goPHP7 notes on extension compatibility.
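For the inventory step, something like the following could do. It's a minimal sketch: `php -m` is a standard CLI flag that lists loaded modules, but the `list_extensions` helper and the `PHP_BIN` override are illustrative names of our own, and it assumes the CLI binary loads the same extensions as the one serving web traffic (which isn't always the case).

```shell
# Assumption: PHP_BIN points at the PHP build you want to inspect.
PHP_BIN="${PHP_BIN:-php}"

# Print one loaded extension per line, lowercased and sorted.
# `php -m` also emits section headers such as "[PHP Modules]" and
# blank lines; those are filtered out here.
list_extensions() {
    "$PHP_BIN" -m | grep -v -e '^\[' -e '^$' | tr '[:upper:]' '[:lower:]' | sort -u
}

# Example (not run here):
#   list_extensions > extensions-old.txt
```

Saving the output from the old and new builds and running `diff` on the two files shows which extensions still need a PHP7 home. `php --ri <name>` prints the configuration for a single extension.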
With a decent mental map, you can grep your way to a punchlist of compatibility fixes. For example, knowing that the ereg extension has been dropped, apply `/\b(ereg_replace|ereg|eregi_replace|eregi|split|spliti|sql_regcase)\(/` to your codebase.
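As a rough sketch, the ereg search above might look like this on the command line. The `find_ereg_calls` name is ours, the `.php` filter is an assumption about how your files are named, and `\b` relies on GNU or BSD grep. Expect some false positives: a method call like `$obj->split(` or a mention in a comment will also match.

```shell
# The pattern from the text, with the trailing "(" escaped so that it
# matches a literal opening parenthesis.
EREG_PATTERN='\b(ereg_replace|ereg|eregi_replace|eregi|split|spliti|sql_regcase)\('

# Print file:line:source for every suspect call under a directory
# (defaults to the current directory). Only *.php files are searched.
find_ereg_calls() {
    grep -rnE --include='*.php' "$EREG_PATTERN" "${1:-.}"
}

# Example (not run here):
#   find_ereg_calls ./src > ereg-punchlist.txt
```

Because `_` counts as a word character, the `\b` anchor keeps `preg_split(` and `str_split(` out of the results.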
There are a few ways you can check your code without running it. The above grep example is, technically speaking, a form of static analysis. You can use `php -l` to check basic syntax. php7mar will pair that check with a group of others (mostly regex-based) for common migration gotchas.
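To run the `php -l` check across a whole tree, a loop along these lines is one option. Treat it as a sketch: `lint_tree` and `PHP_BIN` are hypothetical names, it assumes `PHP_BIN` is the PHP7 binary you are migrating to, and it will trip over file names that contain newlines.

```shell
# Assumption: PHP_BIN is the PHP7 binary you are migrating to.
PHP_BIN="${PHP_BIN:-php}"

# Run `php -l` on every .php file under a directory (default: current
# directory) and print the path of each file that fails the check.
# No output means every file passed.
lint_tree() {
    find "${1:-.}" -type f -name '*.php' | sort | while IFS= read -r file; do
        "$PHP_BIN" -l "$file" </dev/null >/dev/null 2>&1 || printf '%s\n' "$file"
    done
}

# Example (not run here):
#   lint_tree ./src > lint-failures.txt
```

This only catches what the parser rejects. Code that parses cleanly but behaves differently under PHP7 needs the other checks described here.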
If you want to be more robust, Phan and HHVM perform true static analysis on your code. (The Phan readme lists examples of what this includes.) It’s a tempting concept – marrying the rigor of strongly-typed languages with the agility of PHP – but it can be imperfect in practice. We used HHVM with PHP5.6 and found it difficult to maintain alongside our build process. We tested Phan and had some concerns about false positives and other noise on our aging codebase. That’s not to say either tool won’t have value for you, but that they require some careful evaluation.
Automated testing comes in many guises: unit tests, integration tests, browser tests. There are no universal truths in this world. So much depends on how your team works. But codified, reproducible tests can reveal subtle regressions that could impact your customers.
Use web logs and other information you already have about how people use your product. No unit test coverage? No problem! Take a day’s worth of URLs and curl a test server. Naturally, that means you’ll need to be able to detect when things go wrong.
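The replay idea could be sketched roughly as follows. Everything named here is an assumption for illustration: `replay_urls`, `CURL_BIN`, and the `TEST_HOST` address are made up, and treating anything other than HTTP 200 as suspicious will also flag legitimate redirects.

```shell
# Assumptions: TEST_HOST is a hypothetical test server running the new
# build; CURL_BIN lets you swap in a different fetcher.
CURL_BIN="${CURL_BIN:-curl}"
TEST_HOST="${TEST_HOST:-http://php7-test.example.com}"

# Read URL paths from stdin, one per line, request each from the test
# server, and print "<status> <path>" for every non-200 response.
# curl reports 000 when it cannot connect at all.
replay_urls() {
    while IFS= read -r path; do
        [ -n "$path" ] || continue
        code=$("$CURL_BIN" -s -o /dev/null -w '%{http_code}' --max-time 10 "$TEST_HOST$path" </dev/null)
        [ "$code" = "200" ] || printf '%s %s\n' "$code" "$path"
    done
}

# Example (not run here), assuming a combined-format access log where
# the request path is the seventh whitespace-separated field:
#   awk '{print $7}' access.log | sort -u | replay_urls
```

A 200 is a weak signal on its own, since a page can render with warnings or missing content. The status check is only a first filter in front of your error logs.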
Does your application already have a safety net? Logging and alerting are invaluable when preparing for large systems changes. (Really, they’re invaluable, period. If your application doesn’t have monitoring in place, stop reading this right now and set it up. Trees can fall in the woods without making a sound.)
Before you roll out to real traffic, try hammering a test server with ApacheBench. You might have an extension with memory allocation issues that aren’t revealed until you apply load in parallel.
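A minimal ApacheBench wrapper might look like the sketch below. The `-n` (total requests) and `-c` (concurrency) flags are standard; the `load_check` name, the `AB_BIN` override, and the default request counts are our own assumptions, so pick numbers that suit your test hardware.

```shell
# Assumption: AB_BIN is the ApacheBench binary.
AB_BIN="${AB_BIN:-ab}"

# Usage: load_check URL [REQUESTS] [CONCURRENCY]
# Runs ab, prints its failure counters, and returns non-zero if ab
# reported failed requests or non-2xx responses, or if no report was
# produced at all.
load_check() {
    report=$("$AB_BIN" -n "${2:-1000}" -c "${3:-20}" "$1" 2>/dev/null)
    printf '%s\n' "$report" | grep -E '^(Failed requests|Non-2xx responses):'
    failed=$(printf '%s\n' "$report" | awk '/^Failed requests:/ {print $3}')
    non2xx=$(printf '%s\n' "$report" | awk '/^Non-2xx responses:/ {print $3}')
    [ -n "$failed" ] && [ "$failed" -eq 0 ] && [ "${non2xx:-0}" -eq 0 ]
}

# Example (not run here; ab wants a path, so keep the trailing slash):
#   load_check http://php7-test.example.com/ 10000 50
```

ab can count responses whose length differs from the first one as failed, so dynamic pages may need a closer look before you blame an extension. Crashes under load tend to show up in the web server's error log rather than in ab's report.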
Writing custom extensions is a niche activity, but one that presents a significant testing burden. The internals of PHP have changed a bit – the slimmer data structures behind PHP variables are actually responsible for many of the speed gains in PHP7 – though there is code out there to help with some of the changes. A strong suite of .phpt tests, run with Valgrind, is invaluable. Code coverage tools, like gcov, can help validate the breadth and depth of your tests. A sandboxed development environment, like the php7dev vagrant box, can streamline your process.
Even if you’re not rolling your own, you may rely on pecl or other third-party extensions. You may need to isolate test cases to help guide others. Basic comfort with capturing and examining core dumps can go a long way here.
If you encounter issues, don’t forget about the existing PHP bug tracker. It’s a big community, so it’s not unlikely that someone else has already had your problem. Follow mailing lists. There’s a whole suite for PHP core. Third-party extensions and libraries may have their own. Familiarize yourself with the GitHub repositories for any third-party code. PHP7 is still relatively new. A bug fix you need might still be in an unstable release.
Above all, remember that open source projects work when you give back to them. Share findings on mailing lists, file bug reports, and submit patches for improvements.
When You’re Done
A big question is how to determine when you’ve tested something just enough. No method is totally comprehensive. Diminishing returns are a fact of life. We didn’t do anything particularly special in this department. We kept good notes, recorded questions as they arose, and ensured low-friction communication within the working group. At a certain point, you – the group you – will feel comfortable enough with the state of things to move to the next step: asking manual testers to have a look, beginning stress testing, exercising your deploy scripts, and, finally, sharing your work with the world.