This post describes a recent architectural change made on my team at Wayfair. The Intelligent Systems Team, as we are called, is responsible for creating a fully personalized experience for millions of customers every day. One of the main products we own is the Recommendations Service. At its core, the service is a simple Python/Flask app that provides recommendations for the site, for emails, for our proprietary display retargeting platform, and for various other applications.
An Object Relational Mapping (ORM) is a technology that translates relational database schemas to an object-oriented model. The Recommendations Service used one to retrieve information from Microsoft SQL Server before storing that data in Redis. This blog post explains why we’ve discarded this approach in favor of directly dispatching SQL queries using SQLAlchemy Core.
The last few months have seen the Recommendations Service increase in both functional and architectural complexity. The core of this service is the ability to ingest a 32-character alphanumeric GUID per request, which corresponds to a predetermined configuration. Because we have to adhere to a strict SLA, we try to hit the DB at most once per request, and only if the cache is empty.
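The "cache first, DB at most once" rule can be sketched in a few lines. This is only an illustration, not our production code: the ConfigStore class, the db_hits counter, and the sample configuration are hypothetical, and a plain dict stands in for Redis so the example runs on its own.

```python
import json
from typing import Callable, Dict, Optional


class ConfigStore:
    """Cache-aside lookup: read the cache first, query the DB only on a miss."""

    def __init__(self, cache: Dict[str, str],
                 load_from_db: Callable[[str], Optional[dict]]):
        self.cache = cache                # hypothetical stand-in for Redis
        self.load_from_db = load_from_db  # hypothetical DB loader
        self.db_hits = 0                  # counts how often the DB is queried

    def get(self, guid: str) -> Optional[dict]:
        cached = self.cache.get(guid)
        if cached is not None:
            # Cache hit: no DB round trip for this request.
            return json.loads(cached)
        # Cache miss: exactly one DB call, then populate the cache.
        self.db_hits += 1
        config = self.load_from_db(guid)
        if config is not None:
            self.cache[guid] = json.dumps(config)
        return config


# Hypothetical data: one configuration keyed by a 32-character GUID.
GUID = "0123456789abcdef0123456789abcdef"
FAKE_DB = {GUID: {"name": "homepage_recs", "limit": 12}}

store = ConfigStore(cache={}, load_from_db=FAKE_DB.get)
first = store.get(GUID)   # miss: loads from the fake DB
second = store.get(GUID)  # hit: served from the cache
```

After both calls, store.db_hits is 1: only the first request reached the database.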
With the most recent feature update, Configuration Sequences, our configuration query needed to model a One-To-Many-To-Many relationship. Executing this DB interaction through the existing SQLAlchemy ORM became too unwieldy, for multiple reasons. Others have written fairly extensively about their distaste for ORMs. Here are the main reasons why we’ve deprecated our ORM code.
Code Maintainability: ORM code can appear elegant when working with simple models. When that model contains multiple join tables and picklists, things get fairly complex. We ended up with over 100 lines of code that were tightly coupled to the database design, single-line functions with half a dozen operators, and no clarity about how often the DB was queried. Engineers not only had to learn a new technology to be effective, but were also often hindered by a lack of documentation. By comparison, directly dispatching SQL queries in Flask is fairly straightforward.
Do less work: It’s much more efficient for an engineer to build a SQL query once than to have the ORM do that work for us every time a request is made. Additionally, the more complex the query, the more difficult it is to translate into ORM code. By changing this approach, we are enabling the service to scale more smoothly.
Don’t hide the good stuff: Data Literacy is one of our main tenets at Wayfair. If you’re an engineer, you should be able to “speak SQL”. The idea behind an ORM is to hide the SQL code, but it doesn’t; it just makes the SQL less intelligible. The most important parts of the query are abstracted away and hidden behind a foreign syntax. This is a real problem, especially when you realize that your vital WITH (NOLOCK) hint doesn’t actually work. It’s very difficult to optimize a SQL query you can’t see, so why not just put it directly in your code?
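To make the points above concrete, here is a minimal sketch of a hand-written query for a one-to-many-to-many shape, dispatched directly and folded into a nested structure. Everything named here is an assumption for illustration: the config, config_sequence, and sequence_step tables, their columns, and the fetch_config helper are hypothetical, not our real schema. To keep the example runnable anywhere, it uses Python's built-in sqlite3 module as a stand-in for SQL Server. With SQLAlchemy Core, the same string could be wrapped in text() and passed to a connection's execute() with the same :guid bind parameter.

```python
import sqlite3

# The whole query is visible in the code, so it can be read, reviewed, and tuned.
# On SQL Server, a table hint such as WITH (NOLOCK) would be written here next to
# each table; SQLite has no such hint, so this sketch omits it.
CONFIG_QUERY = """
SELECT c.guid       AS guid,
       c.name       AS config_name,
       s.id         AS sequence_id,
       s.position   AS sequence_position,
       st.algorithm AS algorithm
FROM config AS c
JOIN config_sequence AS s ON s.config_id = c.id
JOIN sequence_step AS st ON st.sequence_id = s.id
WHERE c.guid = :guid
ORDER BY s.position, st.position
"""


def fetch_config(conn, guid):
    """Run one query and group the flat rows into config -> sequences -> steps."""
    rows = conn.execute(CONFIG_QUERY, {"guid": guid}).fetchall()
    if not rows:
        return None  # unknown GUID (or a config without any sequences)
    config = {"guid": rows[0]["guid"], "name": rows[0]["config_name"], "sequences": []}
    by_id = {}
    for row in rows:
        seq = by_id.get(row["sequence_id"])
        if seq is None:
            seq = {"position": row["sequence_position"], "steps": []}
            by_id[row["sequence_id"]] = seq
            config["sequences"].append(seq)
        seq["steps"].append(row["algorithm"])
    return config


def demo_connection():
    """Build a throwaway in-memory database with hypothetical sample data."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row  # allows access to columns by name
    conn.executescript("""
        CREATE TABLE config (id INTEGER PRIMARY KEY, guid TEXT UNIQUE, name TEXT);
        CREATE TABLE config_sequence (id INTEGER PRIMARY KEY, config_id INTEGER, position INTEGER);
        CREATE TABLE sequence_step (id INTEGER PRIMARY KEY, sequence_id INTEGER,
                                    position INTEGER, algorithm TEXT);
        INSERT INTO config VALUES (1, '0123456789abcdef0123456789abcdef', 'homepage_recs');
        INSERT INTO config_sequence VALUES (10, 1, 1), (11, 1, 2);
        INSERT INTO sequence_step VALUES (100, 10, 1, 'recently_viewed'),
                                         (101, 10, 2, 'best_sellers'),
                                         (102, 11, 1, 'similar_items');
    """)
    return conn


conn = demo_connection()
config = fetch_config(conn, "0123456789abcdef0123456789abcdef")
```

The query string is built once, at import time, and every request reuses it with a different bind parameter. Because fetch_config issues exactly one execute() call, it is also obvious how often the database is queried.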
In general, the service is now more pleasant to work with, and gives our engineers an increased sense of confidence when pushing code. This is due not only to the legibility of the code, but also to its increased testability. It’s just another way we’re trying to shake the anti-patterns from our codebase.