Originally published on Code Red on May 25, 2016.
Our Data Addiction
On a typical Saturday afternoon at the height of the summer home-buying season, Redfin will send millions of listing update emails and push notifications to our users.
One of our biggest advantages is our ability to send these notifications within moments of an event appearing in the Multiple Listing Service (MLS) feed, the data source real estate agents use to put new homes on the market. In hot markets like San Francisco or Seattle, this timing can be critical: putting in a strong offer within hours of the initial listing can make the difference between securing a dream home and losing it in a bidding war.
Many of our customers (hundreds of thousands, in fact) opt to receive daily digests rather than instant notifications throughout the day. When these daily emails, each unique to its recipient, are generated in the morning, our machines can be kept busy for hours fetching all the data required to send them out.
Under extreme conditions, the implications of this excessive workload are twofold:
- Other jobs (listing importers, search indexers, etc.) can get backed up while the daily email job hogs resources.
- Users’ emails can be delayed, potentially arriving hours after they would normally expect them.
In early 2015 — at a time when our email jobs relied almost entirely on batch-oriented operations — we began to observe exactly these two frightening scenarios....
Continue reading on Code Red....