Historical user click patterns on search result pages are a valuable resource for algorithms that learn to rank search results. Learning to rank is a well-studied problem in the field of Information Retrieval (IR). See Mike Mathieson's excellent blog post, Using Behavioral Data to Improve Search, for more background on this subject.
If learn-to-rank algorithms could explicitly observe which results were deemed relevant by most users and which were not, we would have open access to usersʼ minds. Alas, we live in the real world where only user clicks can be observed; their thought patterns are forever hidden, for us to infer and speculate upon.
What are click models?
Click models are mathematical models that attempt to do just that: describe a typical user's decision process as he or she interacts with the search results page, so that we may infer that user's judgments on the relevance or irrelevance of specific search results.
Take, for example, the following scenario:
The user searched for “Meaning of life”. A search result page (SRP) with 50 results was served back. The user then clicked on result #2, and we never heard from him again.
Consider the following two explanations:
- The user looked at the SRP, read the snippet returned for result #1, then ignored it as irrelevant. The user then moved to result #2, read the snippet, found it attractive, clicked through to the page, found the meaning of life in there, then stopped the search, satisfied with what he found.
- The user glanced at the SRP, chose result #2 randomly, read the snippet, found it somewhat relevant, clicked through to the page, and found it completely irrelevant. Then his phone rang and he abandoned the search.
According to the first explanation, result #2 was relevant to the userʼs search. According to the second explanation, it wasnʼt. Both explanations (and many others) are possible. But are they equally likely?
A click model helps us assign mathematical probabilities to every such explanation, enabling us to use millions of scenarios to infer the likelihood of relevance for every search result against every search.
Commonly Used Click Models
The simplest class of click models is called position models. Position models assume that search results have a probability of being examined by the user that decays with the position of the result within the page. A click depends on a result being examined and deemed relevant, so that P(click) = P(examined) * P(relevant), and P(examined) is a decaying function of rank.
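The factorization above can be sketched in a few lines of Python. The geometric decay profile and the relevance values below are illustrative assumptions, not fitted quantities:

```python
# Position model: P(click) = P(examined) * P(relevant), where the
# examination probability decays with rank.

def p_examined(rank, decay=0.7):
    """Examination probability decaying geometrically with rank (1-indexed)."""
    return decay ** (rank - 1)

def p_click(rank, p_relevant, decay=0.7):
    """Position model: a click requires both examination and relevance."""
    return p_examined(rank, decay) * p_relevant

# Two equally relevant results at different ranks: the lower-ranked one
# receives fewer clicks purely because it is examined less often.
print(p_click(rank=1, p_relevant=0.5))  # 0.5
print(p_click(rank=4, p_relevant=0.5))  # 0.5 * 0.7**3 ≈ 0.1715
```

This is why raw click-through rates overstate the relevance of top-ranked results: the position bias has to be divided out before clicks can be read as relevance judgments.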
Cascade models are another class of click models, where the user is assumed to examine the results sequentially: starting from the top, clicking on the first relevant result examined, and stopping the search immediately. Here the probability of a click depends on the relevance of a result, as well as the irrelevance of all previous results. This model doesnʼt account for abandoned searches, or searches with multiple clicks.
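A minimal sketch of the cascade assumption (with made-up relevance values) makes that limitation explicit: the per-rank click probabilities sum to one minus the probability that every result is irrelevant, leaving no room for multi-click or abandoned sessions:

```python
# Cascade model: the user scans top-down and clicks the first result judged
# relevant, so a click at rank r requires relevance there and irrelevance
# at every rank above it.

def cascade_click_probs(relevances):
    """Return P(click at rank r) for each result under the cascade model."""
    probs = []
    p_reached = 1.0  # probability the user is still scanning at this rank
    for rel in relevances:
        probs.append(p_reached * rel)
        p_reached *= 1.0 - rel
    return probs

probs = cascade_click_probs([0.2, 0.5, 0.9])
print(probs)       # ≈ [0.2, 0.4, 0.36]
# Exactly one click occurs unless every result is irrelevant, which is why
# the model cannot represent abandoned or multi-click sessions:
print(sum(probs))  # ≈ 0.96 = 1 - (0.8 * 0.5 * 0.1)
```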
Recent work has shown that using a dynamic Bayesian network (DBN) to model general web search clicks outperforms both position models and cascade models [Olivier Chapelle, A Dynamic Bayesian Network Click Model for Web Search Ranking, Proceedings of the 18th International World Wide Web Conference (WWW), 2009]. The DBN model assumes that users examine items in an SRP starting at position 1 and working downwards: skipping items that appear irrelevant and clicking on items that appear relevant, until either abandoning the search or landing on a satisfactory page following a click from the SRP. The user is assumed to leave the search as soon as the first satisfactory page is presented. SRP clicks are observed in the logs, but all other variables in the Bayesian network are hidden.
Figure 1 below illustrates the concept. The Boolean variables inside the box relate to user behavior, whether observable or not, and the Boolean variables below the box relate to intrinsic properties of URLs: a_u denotes how click-attractive a URL snippet is, and s_u denotes how satisfactory the page really is once you land on it following a search.
Here, clicks are the only observable events. All other events are to be inferred using expectation maximization (EM), guided by the causality links that the model assumes between the Boolean variables. These causality links are denoted in the figure by arrows.
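EM for the full DBN is somewhat involved, but the same idea can be illustrated on the simpler position model, where the hidden examination events play the role of the latent variables: the E-step computes posteriors over examination given the observed clicks, and the M-step re-estimates the parameters from those posteriors. Everything below (the examination profile, the two hypothetical items, and their true relevances) is synthetic:

```python
import random
from collections import defaultdict

random.seed(0)

TRUE_EXAM = [1.0, 0.7, 0.45, 0.3, 0.2]            # P(examined) by rank
TRUE_REL = {"good_item": 0.8, "poor_item": 0.15}  # true (hidden) relevance

# Simulate sessions: each shows both items at two random ranks, logging clicks.
sessions = []
for _ in range(5000):
    ranks = random.sample(range(5), 2)
    impressions = []
    for item, r in zip(TRUE_REL, ranks):
        clicked = (random.random() < TRUE_EXAM[r]
                   and random.random() < TRUE_REL[item])
        impressions.append((item, r, clicked))
    sessions.append(impressions)

# EM over parameters exam[r] and rel[u]; examination is the latent variable.
exam = [0.5] * 5
rel = {u: 0.5 for u in TRUE_REL}
for _ in range(30):
    exam_post = defaultdict(list)  # rank -> posterior P(examined)
    rel_post = defaultdict(list)   # item -> posterior P(relevant)
    for impressions in sessions:
        for item, r, clicked in impressions:
            if clicked:            # a click implies examined and relevant
                exam_post[r].append(1.0)
                rel_post[item].append(1.0)
            else:                  # E-step: posteriors given "no click"
                denom = 1.0 - exam[r] * rel[item]
                exam_post[r].append(exam[r] * (1 - rel[item]) / denom)
                rel_post[item].append(rel[item] * (1 - exam[r]) / denom)
    # M-step: each parameter becomes the average of its posteriors.
    exam = [sum(exam_post[r]) / len(exam_post[r]) for r in range(5)]
    rel = {u: sum(p) / len(p) for u, p in rel_post.items()}

# The factorization is identifiable only up to a scale factor, but the
# ordering of the inferred relevances recovers the truth.
print(rel["good_item"] > rel["poor_item"])  # True
```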
A parameter of the model is the user persistence rate γ, which is the probability that a user would continue to examine results after landing on an irrelevant page.
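A small simulator makes the assumed decision process concrete. The a_u and s_u values below are invented; and since the description above frames γ as the probability of continuing after landing on an unsatisfying page, the sketch applies γ only at that point:

```python
import random

def simulate_dbn_session(results, gamma, rng):
    """results: list of (a_u, s_u) pairs in SRP order. Returns clicked ranks."""
    clicks = []
    for rank, (a_u, s_u) in enumerate(results, start=1):
        if rng.random() < a_u:          # snippet attractive enough to click
            clicks.append(rank)
            if rng.random() < s_u:      # landing page satisfies: search ends
                break
            if rng.random() >= gamma:   # unsatisfied: persist only w.p. gamma
                break
        # unclicked (seemingly irrelevant) results are simply skipped
    return clicks

rng = random.Random(42)
srp = [(0.3, 0.1), (0.6, 0.7), (0.5, 0.4)]  # invented (a_u, s_u) per result
print([simulate_dbn_session(srp, gamma=0.8, rng=rng) for _ in range(5)])
```

Note that with γ = 0 every session ends at its first click, which recovers cascade-like behavior; larger γ produces the multi-click sessions the cascade model cannot express.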
Why We Need Something New
DBN and other web search click models are not well suited for explaining user behavior at eCommerce sites such as eBay. Compared to users of general web search, shoppers at eBay are unlikely to be satisfied by landing on the first relevant result. Shopping and web browsing are radically different IR exercises. In shopping, the items in the SRP are competing for the shopper's business, and the shopper is in a process of arbitration, where he or she must choose to award business to zero or one winning item.
Unlike web surfers, web shoppers (think referees) are expected to sample seemingly relevant items until they feel satisfied that they have come across a representative sample of good deals as well as bad deals. Only then can a shopper make an informed decision that an item is a winner. In this setting, a good deal is only good relative to other deals that are not as good. For example, if the top ten results are equally good deals then weʼd expect the shopper to keep scrolling down until a satisfactory number of bad deals are encountered. That is in sharp contrast to the general web browsing behavior assumed in DBN.
Proposed Click Model
The following model builds upon DBN, with a focus on eCommerce user behavior.
In this model, SRPs list items on sale rather than webpage URLs. Every listing L is assumed to have two independent intrinsic properties: a_L and g_L. The a_L property denotes how attractive the listing looks in the SRP, which influences clicks; g_L denotes how much of a good shopping deal the listing really is, which depends on the price, item details, seller details, shipping details, and so on. Both variables are latent – that is, they cannot be observed directly; however, they are assumed to influence the behavior of users on the site, and can therefore (hopefully) be inferred by analyzing mountains of click data.
Notice how the model still contains E_i (examine) events, and how, just like the DBN model, the user is assumed to examine results sequentially from top to bottom. The proposed model also contains A_i (attracted enough to click) events; but S_i (satisfied) events are replaced with G_i (good deal) events, which denote that the user found the listing to be a "good deal" overall. We introduce a new event S (success event) to denote a successful shopping experience. At eBay, an S event translates into a bid or buy-it-now action. Because S events are directly observable, this model enables learning from clicks, bids, and purchases simultaneously.
Upon finding a good deal, the user performs an S event with probability f(G), which depends not only on the goodness of the current listing, but also on the goodness of all previously examined listings. This is how the model accounts for users browsing multiple good and bad deals before arriving at an S decision. In the simplest case, f(G) can be the sum of good deals encountered, but it can be generalized to the form:
f(G) = αΣ(good deals) + βΣ(bad deals)
where α and β are model parameters.
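A sketch of f(G) under this linear form: the α/β values, the positive sign of β (reflecting the idea above that sampling some bad deals helps the shopper conclude a good deal is a winner), and the clamping to [0, 1] so the score can serve as a probability are all illustrative assumptions; the model itself only specifies the linear combination:

```python
# f(G): a linear score over the good and bad deals examined so far,
# clamped to [0, 1]. Parameter values are illustrative assumptions.

def f_g(examined_deals, alpha=0.25, beta=0.05):
    """examined_deals: booleans, True for a good deal, False for a bad one.
    Returns the probability that the shopper commits (bids / buys) now."""
    good = sum(1 for d in examined_deals if d)
    bad = sum(1 for d in examined_deals if not d)
    return max(0.0, min(1.0, alpha * good + beta * bad))

# A shopper who has sampled three good deals and one bad one is readier to
# commit than one who has only seen a single good deal so far.
print(f_g([True, True, False, True]))  # 0.8
print(f_g([True]))                     # 0.25
```

The simplest case mentioned above falls out as α = 1, β = 0: f(G) then just counts the good deals encountered.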
This modeling framework, if properly applied to eCommerce click logs, would greatly enhance the learnability of ranking algorithms from click data.