2019-07-04

Sanjay Mohan, Group CTO, MakeMyTrip.

India’s most popular travel booking company, MakeMyTrip, makes meaningful use of data through automation, personalisation and data validation, using Artificial Intelligence and Machine Learning on the AWS Cloud. In a recent interview, Sanjay Mohan, Group CTO, MakeMyTrip, explains to Sudhir Chowdhary how a cloud-native company like MakeMyTrip is winning with a ‘data first’ strategy. Excerpts:

At MakeMyTrip, how are you personalising the customer experience using data?

We are an online travel company and do thousands of transactions every day. We are trying to leverage browsing history, location data, buying patterns, and other preferences to personalise results for consumers. We want each customer to see a personalised experience, from personalised videos and different listing orders to a different set of recommended hotels and flights. Our number one priority is to collect enough data for personalisation, which means we need computing capability, organisational capability, and a lot of Machine Learning (ML) and Artificial Intelligence (AI) to create a different experience for our customers.

We aim to understand what customers want to do and what their intention is when they come to MakeMyTrip, and then we take them through the entire experience. We have started deploying real ML models. AI and ML now manage all the lists and details seen on the site, be it hotel ranking, discounting, or customer service. Using these technologies, we personalise pricing and sequencing. We are using ML and AI even in customer service, especially with chatbots, as we are investing in natural language understanding. We are recruiting data scientists to work on these projects. We are also training the existing engineering team, product managers, and the revenue team in ML, because finding the right talent is difficult. There is a lot of hype, but it is a new domain and talent is in scarce supply.

Five years ago, we were thinking ‘mobile first’. Now, the entire nation needs to start thinking ‘data first’, and I believe this change will happen in the next 5-10 years. So overall, instilling the data culture in the company is a key priority for us.

What are the ground realities when it comes to being the CTO?

The ground reality is that everyone is collecting data, but 80% of that data is not useful. Every new company thinks it has a lot of data, but it needs to check how much of that data is actually useful. It needs to make sure that data quality is closely monitored along the entire journey, from the point of collection to the point where the data reaches its destination.

Do you mean to say that while everyone is collecting data, few are making meaningful use of it? Is this where you stand out?

Correct. We have invested a lot of time and effort into ensuring this. This is dirty work, and no one wants to do it, be it data validation, writing scripts, automating the entire collection, building a data dictionary, or flagging anomalies in the data. For instance, the selling price can’t be less than the buying price, and discounts cannot be more than the buying price. So, we ensure that the syntax and semantics of the data are corrected along the entire pipeline. It takes a lot of discipline, and it is the kind of work data scientists do not want to do but have to do.
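The two rules mentioned above can be written as a small anomaly check. The following is only a minimal sketch for illustration: the `Booking` record, its field names, and the `find_anomalies` function are assumptions made here, not MakeMyTrip's actual schema or pipeline code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Booking:
    # Hypothetical record layout; field names are assumptions for this sketch.
    booking_id: str
    buying_price: float
    selling_price: float
    discount: float


def find_anomalies(booking: Booking) -> List[str]:
    """Return the semantic rules from the interview that this record breaks."""
    issues = []
    # Rule 1: the selling price can't be less than the buying price.
    if booking.selling_price < booking.buying_price:
        issues.append("selling_price below buying_price")
    # Rule 2: the discount cannot be more than the buying price.
    if booking.discount > booking.buying_price:
        issues.append("discount exceeds buying_price")
    return issues


if __name__ == "__main__":
    sample = Booking("b2", buying_price=100.0, selling_price=90.0, discount=150.0)
    for issue in find_anomalies(sample):
        print(sample.booking_id, issue)
```

A check like this would typically run at each stage of the pipeline, so a bad record is flagged close to where it was introduced rather than at the final destination.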

Let us talk about cloud and AWS. What are the technologies you have deployed?

We are using the entire stack from AWS, but within the previously mentioned data themes. We are exploring a lot; our network is already using Amazon SageMaker. We have just started using a lot of Amazon Athena on top of the Amazon S3 data lake, and we send the cleaned data to a data warehouse called Amazon Redshift. The ML team pulls the consumer data in the standard format, and then it builds and trains its models. In terms of deployments, we are using Amazon ECS, and Amazon S3 for storage.

We have a lot of content in the form of UGC (user-generated content) and media. For example, we have to use a lot of pictures for hotels and holidays, and so storage becomes critical. We collect a lot of UGC and we need a lot of data crunching to extract even the surface-level dominant themes from the UGC—like whether a hotel is kid-friendly or has a swimming pool, the quality of food, or a hotel’s proximity to a railway station, and so on. We are using Amazon Athena and Amazon S3 for this.
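As a rough illustration of what extracting surface-level themes from reviews can mean, here is a minimal keyword-matching sketch. The theme names, keyword lists, and function names are assumptions chosen for this example; the interview does not describe the method MakeMyTrip actually uses, and a production system would likely rely on ML models rather than fixed keywords.

```python
import re
from collections import Counter
from typing import Dict, Iterable, List, Set

# Hypothetical theme-to-keyword mapping, based on the examples in the interview.
THEME_KEYWORDS: Dict[str, List[str]] = {
    "kid_friendly": ["kid", "kids", "children", "family"],
    "swimming_pool": ["pool", "swimming"],
    "food_quality": ["food", "breakfast", "restaurant"],
    "near_railway_station": ["railway", "station"],
}


def themes_in_review(text: str) -> Set[str]:
    """Return the themes whose keywords appear as whole words in one review."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return {
        theme
        for theme, keywords in THEME_KEYWORDS.items()
        if words & set(keywords)
    }


def dominant_themes(reviews: Iterable[str]) -> Counter:
    """Count how many reviews mention each theme (at most once per review)."""
    counts: Counter = Counter()
    for review in reviews:
        counts.update(themes_in_review(review))
    return counts


if __name__ == "__main__":
    reviews = [
        "Great pool and the kids loved it",
        "Food was average but close to the railway station",
        "Lovely swimming pool",
    ]
    for theme, count in dominant_themes(reviews).most_common():
        print(theme, count)
```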

At the macro level, what are the benefits as an organisation?

The biggest benefit of using AWS is that we have one less headache: we don’t have to manage a data centre. We don’t want our engineers to worry about infrastructure, network, or storage, which means they have the bandwidth to focus on our customers, our suppliers, and keeping us relevant to the market.

The other big thing that I realise in hindsight is that my data scientists are delighted now. Earlier, they would ask for 20-25 servers for their experiments, but we could not procure more hardware each time they wanted to run an experiment, as that is just not viable. That slowed down the rate of experimentation and innovation. If we want to put hundreds of models in production next year, we can’t do that with our own data centre. Now, with Amazon Web Services, our data scientists want to go out and build their own experiments using AWS technologies. We are also exploring some third-party technologies available on AWS, not from Amazon itself but from the AWS ecosystem, to ensure faster stream processing.