Riding the Storm

At weather.com, getting back to "the basics" is yielding the performance advantage it needs to build customer loyalty
By Justin Kestelyn

When Mark Ryan sees a storm approaching over the horizon, he has more to worry about than getting wet or getting caught in a blizzard. As CTO of weather.com, the online counterpart to The Weather Channel, he has to consider whether a virtual flood of visitors will overwhelm his IT infrastructure - and if they will be able to enter the site, get the information they need, and then leave quickly enough without damage being done to a carefully cultivated "trust" relationship.

With 14 million unique visitors, weather.com is the largest single-content Web site in the world. It has unique requirements: The content involved is dynamic, usage spikes are unpredictable, and visitors demand instantaneous access to weather information in a personal context, whether they're traveling, golfing, sailing, or just plain weather-watching. Indeed, as Ryan explains, performance and personalization are integral elements of trust architecture.
Before joining weather.com in October 1999, Ryan served as CTO at eBay Inc., where he learned a thing or two about the role of business-critical infrastructure in earning customer trust. A former IBMer, in 1996 Ryan designed and managed the IT infrastructure behind the Atlanta Olympic Games, the first such games with a strong Web presence. As you'll see, he has strong views on enforcing baseline IT principles, the industry trend toward open source, and the value of personalization in customer retention.

IE: Weather.com is unusual in that your content and usage patterns are both extremely dynamic. Does that fact lead to unique scalability requirements?

Ryan: Yes. Our timing for scale is opposite that of a standard e-business. Most companies scale over a period of weeks and months. We have to scale within several days to some pretty tremendous numbers: from four or five million page views per day to 19 to 22 million page views per day within 24 or 48 hours, with our high periods being the first quarter for winter storms and the third quarter for hurricanes. In contrast, most e-commerce shops scale during and across a single quarter so that there's more time to plan for increased traffic. If you're Lands' End and you're going into your fourth quarter Christmas season, you can anticipate your increased rate of usage and then add capacity if appropriate; you don't get into a situation where you have to spike an additional 15 million page views in one day.

IE: How has your infrastructure evolved to meet those requirements?

Ryan: We've had to build in an extremely robust, scalable architecture. We started with an approach that was similar to other Internet startups that are growing at compounded growth rates. This approach is based on a self-explanatory strategy called "just throw hardware at it." That doesn't mean it's the right hardware, or that it's tuned for the application you're trying to run on it. It just means that you survive another day.
After a couple of years of throwing hardware at our problems, we ended up with a hodgepodge of different systems tied together with "Band-Aid" code. We didn't have the ability to do any proactive tuning. All the production servers, software, and engineering change levels and release levels of the operating systems were different.

IE: Sounds like a major headache. What did you do to address the problem?

Ryan: The only thing we could do was start baselining our environment by running apps only on the most optimal platforms, and by making every piece of hardware on which we run those applications identical. We put the base disciplines of IT back in place: Whenever possible, make every piece of hardware identical. In other words, optimize the hardware for the application that you're trying to run on it, not the other way around. In our case, we're running very flat content that can be cached, so we really don't need a Sun E10000. Rather, we use fast, very inexpensive Linux boxes or offload our content to cache boxes across our infrastructure.

IE: What led you in that particular direction?

Ryan: A couple of things. First of all, one of the main criteria for earning customers' trust is to let them log on to your site, get the information they need, and then get off. You want to provide a combination of content they really need, and you want to give them the performance to access that content very quickly. Therefore, we wanted our architecture to scale very well and serve up flat content very quickly.

Second, I believe that the industry - and the Web sites that have to move very quickly - is trending toward open source. Linux and Apache are very lightweight operating and Web serving elements, and as such, they're very fast. Frankly, this isn't brain surgery. It's all about base IT principles: Put the apps on the platforms on which they run best.

IE: How do you execute that approach at weather.com?

Ryan: I have two hosting facilities, which I try to make nearly identical.
I put the applications that are more transactional in nature - the ones that need more robust serving capacity - on the appropriate platform; say, on Unix or something else slightly more robust than Linux. Then I make all of those servers identical so we can tune the application, server, Web server, and the IP stack all at once. System management is easy; we use round-robin or geographical load balancing to optimize capacity.

We've also switched our maps and image serving from a Sun Solaris platform to a bunch of Netfinity Linux boxes running Apache. They're all identical, we tune them all the same, and we manage them all the same. By doing so, we reduce by an order of magnitude the time it takes us to serve up images. This horizontal approach lets us scale the end-user experience at some level of consistency. Previously, we were running anywhere from an 18-second page download to a 25-second download during peak season. Now, we're running at about 1.78 seconds very consistently. Even during hurricane season, when we have 15 million page views a day, we still run under two seconds per download.
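
As a rough illustration of the round-robin balancing Ryan mentions, the Python sketch below rotates requests across a pool of identical servers. The host names, the RoundRobinPool class, and the distribute helper are hypothetical, invented for this example; the interview does not describe the load-balancing software weather.com actually uses.

```python
from itertools import cycle


class RoundRobinPool:
    """Hands out servers from a pool in a fixed, repeating rotation.

    Hypothetical helper for illustration only; not weather.com's code.
    """

    def __init__(self, servers):
        if not servers:
            raise ValueError("pool needs at least one server")
        self._servers = list(servers)
        self._rotation = cycle(self._servers)

    def next_server(self):
        # Each call returns the next server in order, wrapping around.
        return next(self._rotation)


def distribute(pool, request_count):
    """Count how many of request_count requests each server receives."""
    counts = {}
    for _ in range(request_count):
        server = pool.next_server()
        counts[server] = counts.get(server, 0) + 1
    return counts


if __name__ == "__main__":
    # Placeholder host names standing in for a set of identical image servers.
    pool = RoundRobinPool(["img-01", "img-02", "img-03"])
    print(distribute(pool, 9))
    # prints {'img-01': 3, 'img-02': 3, 'img-03': 3}
```

Because the pool members are interchangeable, a plain rotation is enough to spread requests evenly. Geographical balancing would need routing information about where each visitor is, which this sketch does not model.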