Site Speed for eBay Search Results

First, welcome to the eBay technical blog! Each month, we will publish one or two entries describing technical challenges at eBay, and how we go about solving them. We look forward to your comments, and we welcome your suggestions for articles.

It’s my pleasure to write the first entry in our new blog. In two parts, I’m going to introduce you to how we’ve worked on site speed over the past year, and the results that work has delivered. The bottom line is that improving site speed has helped our customers and driven our business. With improvements in site speed, sellers have sold more, buyers have bought more, and eBay’s business has grown as a result. Site speed matters, and we continue to drive improvements.

Defining Site Speed

A simple definition of site speed is “average latency from the time the user submits a request for a page until it’s rendered in their browser”. However, this isn’t easy to measure: which users do we measure it for, and how do we go about the measurement? In our case, to realistically simulate what users see, we use a third-party service to measure latency, and ask them to measure the typical experience. They do this by fetching our pages from hundreds of locations in the US and in Europe. They’re able to provide us with measurements from the US backbone (the main trunk that connects the major telecom providers and ISPs), as well as measurements from the “last mile”, that is, close to the small ISPs who provide service to most of our customers. We collect measurements every few minutes from hundreds of points, and our team looks at latencies, availability, and several different types of requests; we’ll talk about these in a moment.

The simple definition isn’t the most effective for several reasons.

Importantly, mean latencies hide many sins. Take a look at the fictional example in Figure 1, which shows the distribution of latencies for two different implementations of the same page. The x-axis is the user latency in seconds, and the y-axis is the number of customers (in thousands) who are seeing the page. The distribution of values is different between the red and blue lines: customers in the “red line” experience have latencies in the range of 2.5 to 5.5 seconds, and customers in the “blue line” experience have latencies in the range of approximately 1 to 7 seconds. But the mean average latency for both experiences is 4 seconds. The problem is that over a quarter of the page views in the “blue line” experience are slower than any page fetch in the “red line” experience – and, as you’ve probably observed yourself, it’s the worst case scenarios that leave the largest impression. (We’ve all had those pages that occasionally take much longer to load, seemingly hanging on one browser fetch. You’ve probably closed some of those windows or tabs, and gone somewhere else. Or you’ve hit the refresh button.)

Figure 1
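
If you’d like to see this effect for yourself, here’s a small sketch in Python. The samples are made up to loosely match the shape of Figure 1, not drawn from our real measurements; they simply show how two experiences can share a 4-second mean while one has a much worse tail.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

# Fictional latency samples in seconds, loosely in the spirit of Figure 1.
# "Red": every request lands between 2.5s and 5.5s.
red = rng.uniform(2.5, 5.5, n)

# "Blue": same 4-second mean, but about a quarter of requests are slower
# than anything in the red sample (a 75/25 mixture of two uniform ranges).
blue = np.where(rng.random(n) < 0.75,
                rng.uniform(1.0, 5.5, n),
                rng.uniform(5.5, 7.0, n))

for name, sample in (("red", red), ("blue", blue)):
    print(f"{name}: mean={sample.mean():.2f}s  "
          f"p90={np.percentile(sample, 90):.2f}s  "
          f"p95={np.percentile(sample, 95):.2f}s")

# Same mean, very different experience: the share of "blue" page views
# slower than the slowest "red" page view.
print(f"blue slower than every red request: {(blue > red.max()).mean():.0%}")
```
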

So, what do we do to realistically measure the user experience? In our case, we track the 90th and 95th percentile latencies. These are the latencies that 90% and 95% of our page requests come in under; put another way, only the slowest 10% and 5% of requests exceed them. In the fictional example in Figure 1, the 90th percentile for the “red line” experience is just under 5 seconds, and the 95th percentile is around 5.3 seconds. In the “blue line” experience, it’s around 6.5 seconds for the 90th percentile and just under 7 seconds for the 95th percentile. We’d therefore view the “red line” experience as substantially better (and, of course, we’d check that it really was better using a statistical test, such as a one-sided t-test). Percentile latencies are also a great diagnostic tool: we pull apart the data we get, look for the bottlenecks, and fix those. By fixing them we not only improve the percentile latencies and make our customers happier, but we also substantially improve the mean latencies.
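
To make the statistical check concrete, here’s a minimal sketch of the kind of one-sided test we mean, using SciPy. The sample sizes, latencies, and significance threshold are invented for illustration, and this isn’t our production tooling.

```python
import numpy as np
from scipy import stats

def significantly_faster(candidate, baseline, alpha=0.05):
    """One-sided Welch t-test: is the candidate's mean latency lower
    than the baseline's at significance level alpha?"""
    result = stats.ttest_ind(candidate, baseline,
                             equal_var=False, alternative="less")
    return result.pvalue < alpha

# Invented latency samples (seconds) for two implementations of a page.
rng = np.random.default_rng(7)
new_page = rng.normal(3.8, 0.6, 2_000)
old_page = rng.normal(4.0, 0.6, 2_000)

print(significantly_faster(new_page, old_page))  # True for these samples
```
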

Another metric we measure is availability. At the “last mile”, many things can go wrong between a user’s machine and our servers, such as network glitches, transient machine failures at ISPs, and so on. When you’re viewing our search results page at eBay, you’re making many requests to eBay servers in our data centers, and also requests to other providers who deliver advertising and other page components. If any one of these requests fails or times out at the last mile, we count that as an availability issue for the page. As part of our site speed work, we track availability and work on improving it. Improvements include creating fewer opportunities to fail, working with our partners to improve their availability, and improving our own services.
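
To illustrate what we mean by page-level availability, here’s a small sketch. The data layout and names are hypothetical, not our actual pipeline; the idea is simply that a single failed component request marks the whole page view as unavailable.

```python
from typing import List

def page_availability(page_loads: List[List[bool]]) -> float:
    """Fraction of page loads in which every component request succeeded.

    Each entry in `page_loads` is a list of per-request outcomes
    (True = succeeded, False = failed or timed out) for one page view.
    A single failed component marks the whole page view as unavailable.
    """
    if not page_loads:
        return 0.0
    ok = sum(1 for outcomes in page_loads if all(outcomes))
    return ok / len(page_loads)

# Hypothetical example: three page views, one with a failed component.
loads = [
    [True] * 40,             # everything succeeded
    [True] * 39 + [False],   # one component timed out
    [True] * 42,
]
print(f"availability: {page_availability(loads):.1%}")  # 66.7%
```
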

It’s also hard to agree on what “rendered in their [the users’] browser” really means. It’s easy to agree to start the timer when the user’s browser issues its first request. But it’s harder to agree when to stop the timer: is it when all network activity ceases? Is it when the browser fires its “onPageLoad” event? Is it when the page first becomes ready for user interaction? Is it when the visible area of the page (the “above the fold” area) is rendered completely? Is it when the components that most users interact with are rendered? In our case, we approximate a definition of “first ready for interaction”. Unfortunately, this can differ between browsers, and it’s a work in progress for us to measure this more granularly.

I’m a big fan of measuring as much as possible, and making informed decisions using all of the data that’s available. We therefore measure many other aspects of our site’s performance. One metric I love is Time-To-First-Item or TTFI. This is the time it takes a user from beginning a search session to visiting their first view item page on eBay. This is a fantastic, user-centric way to look at eBay site speed: how long does it take a customer who wants to buy something using our search engine to get to the first destination where they could buy? It not only captures real site speed, which involves users interacting with potentially many pages on eBay, but it also captures something about how good our search results are. If the user finds what they want at the top of the results page, the TTFI falls (that’s good!). If the site is faster, the TTFI falls. So, improving TTFI helps our customers, and helps us take a holistic view on the eBay experience.
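
As a rough sketch of how a metric like TTFI could be computed from session events, here’s one way to do it. The event schema below is invented for illustration and isn’t eBay’s tracking format.

```python
from typing import Dict, List, Optional

def time_to_first_item(events: List[Dict]) -> Optional[float]:
    """Seconds from the first search event to the first view-item event
    in one session; None if the user never reached a view item page.

    Each event is a dict like {"type": "search", "ts": 12.3} -- a
    made-up schema, used here only to illustrate the idea.
    """
    search_ts = next((e["ts"] for e in events if e["type"] == "search"), None)
    if search_ts is None:
        return None
    item_ts = next(
        (e["ts"] for e in events
         if e["type"] == "view_item" and e["ts"] >= search_ts),
        None,
    )
    return None if item_ts is None else item_ts - search_ts

session = [
    {"type": "search", "ts": 0.0},
    {"type": "search", "ts": 8.2},       # user refined the query
    {"type": "view_item", "ts": 14.5},   # first view item page
]
print(time_to_first_item(session))  # 14.5 seconds
```
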

Before we move on, it’s also important to note that there are significant differences between what users see when content is cached in their browser or near them, and when the browser starts from a “cold start” and has to fetch everything. We track both of these, and look at the performance of page fetches for new and returning users. In most cases, returning users don’t re-fetch the static images and other static content, since it’s cached in their browser, and so their experience is typically much faster. We’ve observed that over 20% of our users are new users, that is, they don’t have any objects cached in their browser.

Looking at Site Speed

To give you an insight into how a browser interacts with eBay, take a look at Figure 2. It shows a waterfall of what objects a browser fetches when a new user loads the eBay search results page. I’ve produced this using the Fiddler2 web debugging proxy, hooked up to an instance of the Google Chrome browser running on my corporate machine at eBay.

Figure 2

Right now, new users make just over 100 requests to fetch the entire results page, and we’re reducing that number every month. Figure 2 shows you the first forty requests or so that are made to fetch the page shown in Figure 3. Notice that only i.html (our base page) is fetched when the session begins, and then the browser requests around six objects simultaneously as we make our way down the timeline. You’ll also see that most requests in this example are for 80.jpg, and each one is actually a different image thumbnail shown in the search results page. All up, for this query, 1 request is for the base page, 45 are for image thumbnails, 6 are for JavaScript files, 3 are for CSS files, 7 are for advertising assets, 18 are for static images, 8 are for merchandizing assets, and 1 is for tracking. (Note that the time on the x-axis in the Figure isn’t realistic, because we’re intercepting what the browser is doing using Fiddler and slowing down the experience. Figure 2 is therefore for visualizing and diagnosing performance, not measuring actual elapsed times.)

Figure 3
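
If you’d like to produce a similar breakdown for your own pages, one simple approach (not how we generated Figure 2, which used Fiddler) is to export a HAR capture from your browser’s developer tools and tally the requests by content type. Here’s a minimal sketch; the file name is illustrative.

```python
import json
from collections import Counter

def summarize_har(path: str) -> Counter:
    """Count the requests in a HAR capture by coarse content type."""
    with open(path, encoding="utf-8") as f:
        har = json.load(f)
    counts = Counter()
    for entry in har["log"]["entries"]:
        mime = entry["response"]["content"].get("mimeType", "")
        if "javascript" in mime:
            counts["javascript"] += 1
        elif "css" in mime:
            counts["css"] += 1
        elif mime.startswith("image/"):
            counts["image"] += 1
        elif "html" in mime:
            counts["html"] += 1
        else:
            counts["other"] += 1
    return counts

# Example usage (path is illustrative):
# print(summarize_har("search-results.har"))
```
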

You’ve now seen how the browser interacts with eBay. In our next blog post I’ll talk about how to improve site speed. Until then, thanks for stopping by!

Hugh E. Williams
Vice President, Buyer Experience Development