For every advertisement that Yahoo! pops up alongside the content on a website, there's an India connection as to how it got there. Over the past year, the internet company's research and development arm in Bangalore has been running a key component of its advertising exchange: a prediction algorithm that figures out which advertisement best fits the online content to earn a click.
Put simply, an algorithm is a set of instructions for a calculation. Predict, as Yahoo!'s response algorithm is called, is among a set of core innovations driven by the India R&D centre, the company's largest outside the US. It is aimed at getting advertisers higher returns on their ad spend and better yields for publishers, besides a better experience for users, says Shouvick Mukherjee, vice-president and CEO, Yahoo! India R&D. "If you look at it, it's a pretty complex optimisation because, in the case of text, it's much simpler. But when it becomes digital display advertising, it becomes a very complex problem."
Predict is part of Right Media Exchange, the display advertising exchange platform that Yahoo! acquired in 2007. Its algorithms run on a Hadoop cloud cluster grid and are deployed on thousands of live advertisement servers across the world. They are called about one trillion times each day to return a click probability, with a response time of a few milliseconds, according to the company. Hadoop is an open-source framework for distributed storage and processing of large data sets.
Typically, Predict uses machine learning to choose from several thousand advertisements in real time, which also sets it apart from keyword searches. "For keywords and text, it's a matching algorithm; here, it is a learning algorithm," says Mukherjee.
Machine learning refers to developing models from the analysis of data, as opposed to classical rule-based algorithms. In Yahoo!'s case, this would mean studying samples of the usage patterns of the 700 million users the company claims globally. "Then, you have a feedback loop, so you always send a feedback that this was your training set and this is what you learnt," says Mukherjee, adding that higher prediction accuracy has improved revenue per impression (an impression being a view of an ad by a user) over the past year.
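To make the idea of a learning algorithm with a feedback loop concrete, here is a minimal sketch in Python. It is not Yahoo!'s Predict, whose internals the company has not described here; it assumes a simple logistic-regression model, two made-up features (a constant and a "topic match" flag) and simulated click rates of 30% and 5%. The names ClickModel, choose_ad and simulated_click are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


class ClickModel:
    """Toy logistic-regression click predictor (illustrative only)."""

    def __init__(self, n_features, learning_rate=0.05):
        self.weights = [0.0] * n_features
        self.learning_rate = learning_rate

    def predict(self, features):
        """Return the estimated probability that this ad gets clicked."""
        z = sum(w * x for w, x in zip(self.weights, features))
        return sigmoid(z)

    def update(self, features, clicked):
        """Feedback step: nudge the weights towards the observed outcome."""
        error = clicked - self.predict(features)
        for i, x in enumerate(features):
            self.weights[i] += self.learning_rate * error * x


def choose_ad(model, candidate_ads):
    """Pick the candidate with the highest predicted click probability."""
    return max(candidate_ads, key=lambda name: model.predict(candidate_ads[name]))


def simulated_click(features, rng):
    """Assumed ground truth: ads matching the page topic are clicked 30%
    of the time, other ads 5% of the time."""
    click_rate = 0.30 if features[1] == 1.0 else 0.05
    return 1 if rng.random() < click_rate else 0


rng = random.Random(0)
model = ClickModel(n_features=2)

# Feedback loop: show an ad, observe click or no click, update the model.
for _ in range(20000):
    features = [1.0, float(rng.randint(0, 1))]  # [constant, topic_match]
    model.update(features, simulated_click(features, rng))

ads = {"matched_ad": [1.0, 1.0], "unmatched_ad": [1.0, 0.0]}
best = choose_ad(model, ads)
print(best, round(model.predict(ads[best]), 3))
```

No rule saying "prefer matching ads" is ever written down; the preference emerges from the click data, which is the contrast Mukherjee draws with a matching algorithm.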
Rule-based algorithms work well if the rules capture the intricacies of the prediction problem, but learning appropriate rules from large data sets remains a challenging research problem, says Chiranjib Bhattacharyya, associate professor, Computer Science and Automation department, Indian Institute of Science (IISc).
"Present-day algorithms for rule learning do not generate high-quality rules, which often leads to poor performance. Machine learning-based systems are not dependent on specific rules, but try to capture the statistics of data," he says. Accuracy is measured by counting the mistakes the prediction algorithm makes on a random test set.
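The measurement Bhattacharyya describes can be sketched in a few lines of Python. This is an illustration under assumptions, not a description of any system mentioned in the article: the data is a made-up set of numbers labelled by whether they are 50 or more, the predictor is a deliberately imperfect threshold rule, and the helper names random_split and error_rate are hypothetical.

```python
import random


def random_split(examples, test_fraction=0.2, seed=0):
    """Shuffle the examples and hold out a random fraction as the test set."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]  # (training set, test set)


def error_rate(predict, test_set):
    """Fraction of test examples on which the prediction is wrong."""
    if not test_set:
        raise ValueError("test set is empty")
    mistakes = sum(1 for features, label in test_set if predict(features) != label)
    return mistakes / len(test_set)


# Made-up data: the true label is whether the number is at least 50.
examples = [(x, x >= 50) for x in range(100)]
train_set, test_set = random_split(examples)


def imperfect_predictor(x):
    """A slightly wrong rule: it misjudges every number from 50 to 59."""
    return x >= 60


print("test-set error rate:", error_rate(imperfect_predictor, test_set))
```

Holding the test set out at random, rather than reusing the training data, is what makes the count of mistakes a fair estimate of how the predictor would fare on data it has not seen.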
Internet advertising in India is projected to grow at about 25.5% a year over the next five years to touch an estimated Rs 2,400 crore in 2015 from Rs 770 crore in 2010, according to the PwC Entertainment Media Outlook Report 2011 released last July. India's total advertising market, dominated by television and print, was estimated at Rs 24,750 crore in 2010, growing at 11.4% over the same period.
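The internet advertising figures fit together only if the 25.5% is read as a compound annual rate, which is an interpretation rather than something the report excerpt states outright. A quick check in Python, with hypothetical helper names:

```python
def compound_growth(start, annual_rate, years):
    """Value after growing at a fixed annual rate for a number of years."""
    return start * (1 + annual_rate) ** years


def implied_annual_rate(start, end, years):
    """Annual rate that takes `start` to `end` in the given number of years."""
    return (end / start) ** (1 / years) - 1


# Figures from the article, in crore: 770 in 2010, 2,400 projected for 2015.
projected_2015 = compound_growth(770, 0.255, 5)
rate = implied_annual_rate(770, 2400, 5)

print(f"770 crore at 25.5% a year for 5 years: about {projected_2015:,.0f} crore")
print(f"Annual rate implied by 770 -> 2,400 crore: {rate:.1%}")
```

Compounding 770 crore at 25.5% for five years gives roughly 2,397 crore, close to the 2,400 crore estimate, whereas a one-off 25.5% increase would give only about 966 crore.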
"Internet advertising is the fastest growing segment compared with all other segments. The Indian growth numbers are in line with the global trend," says Smita Jha, associate director, Entertainment and Media, PwC India. Globally, the growth in internet advertising is estimated at 13%, while the global entertainment and media market is growing at 6%.
Global internet advertising is projected to reach $130 billion by 2015, by when overall global advertising is estimated to touch $578 billion. While search used to be the bigger component of internet advertising globally, the trend is for other segments, such as display and classifieds, to gain more share. "Currently, it is about 50-50 in India. The trend is to move to display, classifieds and others which are more targeted," says Jha.
Besides Predict, the launches from Yahoo! India R&D during 2011 included PolyAds, a format that allows multiple advertisements to be displayed in a single slot with a 3D-cube design. The Bangalore centre also handled a redesigned image search facility, which, the company claims, has helped increase page views by 25% in the US and over 58% across other countries. It has also built a new tool that allows users to have private conversations while surfing content.