Month: August 2015

An Ideal Summer Internship @ GoIbibo

Hello Everyone, it’s been almost 3 weeks since my internship at GoIbibo ended, and I thought, what could be better than writing a blog post to preserve the good memories?

So here it is, a full description of how I spent my summer hacking on some cool projects at GoIbibo!

Day 1:

I was excited to work in a company which had grown rapidly in the span of a few years! This was my first internship at a well-established start-up and I was hoping to make the best of it. I was introduced to my mentor, Bala Phani Chand (Principal Software Engineer at GoIbibo), by HR. Phani told me that I would be a part of the Control team, which is responsible for backend development and consists of some of the brilliant minds of GoIbibo! I couldn’t wait to get started on my project! 😀

My First Project:

Before getting into the technical details of my project, let me ask you: how many times have you been disappointed by a sudden change in the price of a flight while navigating from the search results page to the flight booking page? I bet you must have thought it is a technique for attracting customers, but it is not! GoIbibo used to cache the results for a fixed, short duration of time to avoid unnecessary hits to the API. So GoIbibo decided that it was time to fix this problem in such a way that the customers are happy and so is GoIbibo! 🙂  My job was to devise an intelligent algorithm which would reduce the number of such invalidations, especially during sales and other special occasions when demand is high. Sounds interesting, doesn’t it?

The Birth of CacheBot:

I began researching the various factors which lead to an increase in the number of invalidations and the relations between these factors. I learnt machine learning algorithms and implemented them to predict the time for which the data should be cached. The accuracy obtained was around 80%, which wasn’t quite satisfactory. I then implemented multi-layer neural networks, which failed to improve the accuracy and also could not handle cases where a sale was on. I used to constantly discuss the results of my research with Phani and Mr Neeraj Koul (Director at GoIbibo), and they would always help me figure out a better approach to the problem. After a whole month of experimenting with various algorithms, I succeeded in developing a CacheBot which determines the time for which new data is to be cached based on past data (analysed using machine learning algorithms) as well as current scenario analysis. The accuracy obtained was around 95%, and the best part was that it could handle sales as well as any sudden change in demand! 😀

GoIbibo 24 hours Hackathon:

GoIbibo conducted an awesome hackathon in which only its employees and interns were allowed to participate! We had access to unlimited food and drinks! 😀 All the teams coded through the night and competed with each other. I was amazed at the awesome products each team developed in one night! By the way, my team included Neha Jagadish and Pawan Sharma (two of the finest backend developers from the Flights team) and Rupesh Singh (a talented JS developer). We developed a price tracker which instantly notifies the customer when the price of the flight he/she is tracking changes! We managed to come up with a constant-time solution for this using some simple hacks on database signals! 🙂

Here is a group photo of all the GoIbibo developers who participated in the hackathon! 🙂 These guys were still bubbling with enthusiasm even after spending the whole night coding! 😀 Kudos to the awesome GoIbibo Devs! (Y)


Project 2:

My second project was another interesting and challenging problem! 🙂 The aim of this project was to write a distributed cache library in Go which would store data as key-value pairs in the memory of each node and thereby reduce the latency generally encountered with Memcached or Redis. But you must be wondering, why write it in Go instead of Python? The major reason we used Go is its awesome concurrency support and its channels, which are greatly useful for reducing network delays. I was mentored by some of the most intelligent minds at GoIbibo – Mr Jyotiswarup Raiturkar, the Head Architect of GoIbibo, Neeraj Koul and Harshad Patil (Senior Software Engineer).

GoCache:

This project required several discussions on the design and constant modifications to it! I encountered many challenging problems while implementing this library, which I used to discuss with Mr Jyotiswarup, Mr Neeraj and Harshad, and they would help me figure out efficient solutions to those problems. Although I was an intern, I was free to put forward my ideas or challenge theirs without having to think twice! 🙂

By the end of my internship, I had implemented a distributed peer-to-peer network wherein each node has its own BusSocket which can listen for peer nodes, dial peer nodes, receive messages from peers and evaluate them one at a time, and also send messages to peers concurrently. Akansha Gupta wrote an API for a LedisDB-GoLevelDB cache which I used for maintaining the storage of each node. GoLevelDB is super fast, supports expiry, and allows concurrent reads and writes within a single process, which is why we chose it for our library.

I also implemented automatic redialling (retrying until the connection is re-established) whenever the connection is lost or an EOF is received on it. Each node sends an acknowledgement to its peers after evaluating a received message. If a node fails to receive the acknowledgement for a message it sent within the timeout period, it breaks the connection and redials the peer. Harshad implemented a stale-data error acknowledgement which notifies the sender node that the data it sent is stale and updates the sender node’s database with the most recent data. We also implemented an optimal solution for handling the case wherein a node’s peer goes down for a while and, by the time it comes back up, its database is outdated.

On testing the code, we found that the network could send and simultaneously receive 1 million messages to and from its peers in 3-4 minutes. Yay! 🙂

My mentor Harshad has done a great job revamping the code, optimizing it and encapsulating it. The code has now been open-sourced and pushed to the GoIbibo GitHub. Anyone and everyone is welcome to contribute to it. 🙂

Link to Harshad’s GoCache Library:  https://github.com/goibibo/stash

Link to my library which is still under development: https://github.com/elitalobo/DistributedCache

Overview of my internship experience:

I enjoyed every bit of it, so much that I wished I never had to return to college! 😀 I made really great friends and we had tons of fun together! 🙂  The timings were flexible. I had all the freedom to voice my ideas, and they were always considered! And the best part of the internship is that I got to learn a lot from my projects and my awesome mentors, which I doubt I could have learnt elsewhere! 🙂 Summer was well spent! 🙂

PS: The projects I have worked on haven’t yet gone into production as GoIbibo is rewriting the entire backend code in Go. 


My Internship Experience @GoIbibo – Part 2

Hi Everyone, this is the continuation of the previous post. In this post, I will give you a brief insight into my second project, which was written in Go!

Project 2:

The goal of this project was to write a distributed cache library in Go which would store data as key-value pairs in the memory of each node and thereby reduce the latency generally encountered with Memcached or Redis. The major reason we used Go is its awesome concurrency support and its channels, which are greatly useful for reducing network delays. I was mentored by Mr Jyotiswarup Raiturkar, the Head Architect of GoIbibo, Neeraj Koul and Harshad Patil.

Week 1:

Learnt Go and networking in Go from various resources. I was then assigned my first task: to implement a basic peer-to-peer network with the following functions (a rough sketch follows the list).

1. A Set function which stores a key-value pair in the node’s own database and also sends the pair to its peers.

2. A Get function which retrieves the value of the specified key if it exists in the node’s database.

3. A Delete function which deletes the key from the node’s database and tells its peers to delete the same key.
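
For a rough idea of what that first task looked like, here is a simplified Go sketch of such a node. The names and interfaces below are made up purely for illustration and are not the actual library code:

```go
package p2pcache

// Store is the local key-value storage behind each node
// (LedisDB/GoLevelDB in the real project; here it is just an interface).
type Store interface {
	Put(key, value string) error
	Get(key string) (string, error)
	Delete(key string) error
}

// Peer is anything we can push cache operations to over the network.
type Peer interface {
	SendSet(key, value string) error
	SendDelete(key string) error
}

// Node owns a local store and a list of peers to keep in sync.
type Node struct {
	store Store
	peers []Peer
}

// Set writes the pair locally and forwards it to every peer.
func (n *Node) Set(key, value string) error {
	if err := n.store.Put(key, value); err != nil {
		return err
	}
	for _, p := range n.peers {
		if err := p.SendSet(key, value); err != nil {
			return err // a real node would retry or queue the message instead
		}
	}
	return nil
}

// Get only reads from the node's own database.
func (n *Node) Get(key string) (string, error) {
	return n.store.Get(key)
}

// Delete removes the key locally and tells every peer to do the same.
func (n *Node) Delete(key string) error {
	if err := n.store.Delete(key); err != nil {
		return err
	}
	for _, p := range n.peers {
		if err := p.SendDelete(key); err != nil {
			return err
		}
	}
	return nil
}
```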

Week 2:

Implemented the peer-to-peer network with the above functionality. Akansha Gupta wrote an API for a LedisDB-GoLevelDB cache, which I used for maintaining the storage of each node.
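
The storage API itself was Akansha’s work, so I won’t reproduce it here, but purely to illustrate how an expiry feature can sit on top of a plain key-value store, here is a minimal made-up sketch (not the LedisDB/GoLevelDB API; the real library persists entries to disk instead of a map):

```go
package p2pcache

import (
	"encoding/json"
	"errors"
	"sync"
	"time"
)

// entry wraps a value with the wall-clock time after which it is stale.
type entry struct {
	Value     []byte    `json:"value"`
	ExpiresAt time.Time `json:"expires_at"`
}

// ExpiringStore layers a TTL on top of a plain key-value map.
type ExpiringStore struct {
	mu   sync.RWMutex
	data map[string][]byte
}

func NewExpiringStore() *ExpiringStore {
	return &ExpiringStore{data: map[string][]byte{}}
}

// Set stores value under key with the given time-to-live.
func (s *ExpiringStore) Set(key string, value []byte, ttl time.Duration) error {
	raw, err := json.Marshal(entry{Value: value, ExpiresAt: time.Now().Add(ttl)})
	if err != nil {
		return err
	}
	s.mu.Lock()
	s.data[key] = raw
	s.mu.Unlock()
	return nil
}

// Get returns the value if the key exists and has not expired yet.
func (s *ExpiringStore) Get(key string) ([]byte, error) {
	s.mu.RLock()
	raw, ok := s.data[key]
	s.mu.RUnlock()
	if !ok {
		return nil, errors.New("key not found")
	}
	var e entry
	if err := json.Unmarshal(raw, &e); err != nil {
		return nil, err
	}
	if time.Now().After(e.ExpiresAt) {
		return nil, errors.New("key expired")
	}
	return e.Value, nil
}
```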

Week 3:

Implemented a BusSocket for each node which can listen for peer nodes, dial peer nodes, receive messages from peers and evaluate them one at a time, and also send messages to peers concurrently.
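
Very roughly, the shape of such a BusSocket could look like the sketch below: messages from all peers are funnelled into a single channel and evaluated one at a time, while sends to peers happen concurrently in goroutines. (Names and wire format are made up; this is not the actual library code.)

```go
package p2pcache

import (
	"bufio"
	"fmt"
	"net"
)

// BusSocket gives each node one listener plus outbound connections to its
// peers. Incoming messages are evaluated sequentially; sends are concurrent.
type BusSocket struct {
	inbox chan string // messages received from peers, evaluated one at a time
	peers []net.Conn  // outbound connections dialled to peer nodes
}

// Listen accepts peer connections and pumps their messages into the inbox.
func (b *BusSocket) Listen(addr string) error {
	ln, err := net.Listen("tcp", addr)
	if err != nil {
		return err
	}
	go func() {
		for {
			conn, err := ln.Accept()
			if err != nil {
				continue
			}
			go func(c net.Conn) {
				scanner := bufio.NewScanner(c)
				for scanner.Scan() {
					b.inbox <- scanner.Text()
				}
			}(conn)
		}
	}()
	return nil
}

// Evaluate processes inbox messages one at a time in a single goroutine.
func (b *BusSocket) Evaluate(apply func(msg string)) {
	go func() {
		for msg := range b.inbox {
			apply(msg)
		}
	}()
}

// Broadcast sends a message to every peer concurrently.
func (b *BusSocket) Broadcast(msg string) {
	for _, conn := range b.peers {
		go fmt.Fprintln(conn, msg)
	}
}
```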

Week 4:

Optimized the code and encapsulated it. Implemented automatic redialling (retrying until the connection is re-established) whenever the connection is lost or an EOF is received on it. Each node sends an acknowledgement to its peers after evaluating a received message. If a node fails to receive the acknowledgement for a message it sent within the timeout period, it breaks the connection and redials the peer. On testing the code, we found that the network could send and simultaneously receive 1 million messages to and from its peers in 3-4 minutes. 🙂  Harshad implemented a stale-data error acknowledgement which notifies the sender node that the data it sent is stale and updates the sender node’s database with the most recent data.
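
The acknowledgement-and-redial behaviour, stripped down to its core, could look something like this (again a made-up sketch with an invented wire format, not the real implementation):

```go
package p2pcache

import (
	"bufio"
	"net"
	"time"
)

const ackTimeout = 5 * time.Second

// sendWithAck sends one message, waits for an acknowledgement within the
// timeout and, on failure, breaks the connection and redials the peer.
// The caller is expected to resend the message on the fresh connection.
func sendWithAck(addr string, conn net.Conn, msg string) (net.Conn, error) {
	if _, err := conn.Write([]byte(msg + "\n")); err != nil {
		conn.Close()
		return redial(addr) // connection lost or EOF: redial until it is back
	}
	// Wait for the peer's acknowledgement, but not forever.
	_ = conn.SetReadDeadline(time.Now().Add(ackTimeout))
	ack, err := bufio.NewReader(conn).ReadString('\n')
	if err != nil || ack != "ACK\n" {
		conn.Close()
		return redial(addr) // no ack within the timeout: drop and redial
	}
	return conn, nil
}

// redial keeps retrying until a connection to the peer is established.
func redial(addr string) (net.Conn, error) {
	for {
		conn, err := net.DialTimeout("tcp", addr, ackTimeout)
		if err == nil {
			return conn, nil
		}
		time.Sleep(time.Second)
	}
}
```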

After returning to my college:

I handled the case wherein a node goes down for a while and there is a possibility that its database is outdated by the time it becomes active again. My mentor Harshad had implemented mqueues, which store the latest m messages (the value of m can be configured by the user) to be sent to a peer in a queue (one per peer) while that peer is down. When the peer comes up again, the node dials it and starts sending it the messages from the queue. One major limitation of the mqueue was its size: the older messages get lost once the size of the queue exceeds m (its maximum size).

I eliminated this limitation by maintaining a cache for each peer of the node. Each time the queue exceeds its maximum size, the older data is flushed into the cache maintained for that particular peer. We also set an expiry on the messages flushed to the cache, so that messages older than one week automatically expire. We used GoLevelDB as the storage since it allows concurrent reads and writes, which makes it super fast. When the peer comes up again and dials the node, the node iterates over all the messages in the cache (the messages the peer has missed) and sends them to the peer. It then sends the messages from the queue, if any.
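
Putting the mqueue-plus-overflow-cache idea into a small sketch (made-up names; the real code uses the GoLevelDB-backed cache with the one-week expiry described above):

```go
package p2pcache

import "time"

// OverflowCache is a stand-in for the per-peer expiring store.
type OverflowCache interface {
	Append(msg string, ttl time.Duration) error
	Drain() []string // oldest first; expired messages are already gone
}

// peerBuffer holds the messages a peer has missed while it was down: the
// newest m messages stay in a bounded queue, and anything older overflows
// into a per-peer cache with a one-week expiry.
type peerBuffer struct {
	maxSize int
	queue   []string
	cache   OverflowCache
}

// enqueue adds a message for a down peer, flushing the oldest message to the
// overflow cache once the queue is full instead of losing it.
func (p *peerBuffer) enqueue(msg string) error {
	if len(p.queue) >= p.maxSize {
		oldest := p.queue[0]
		p.queue = p.queue[1:]
		if err := p.cache.Append(oldest, 7*24*time.Hour); err != nil {
			return err
		}
	}
	p.queue = append(p.queue, msg)
	return nil
}

// replay returns everything the peer missed, oldest first: first the
// overflowed messages from the cache, then the messages still in the queue.
func (p *peerBuffer) replay() []string {
	missed := p.cache.Drain()
	missed = append(missed, p.queue...)
	p.queue = nil
	return missed
}
```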

My mentor Harshad has done a great job revamping the code, optimizing it and encapsulating it. The code has now been open-sourced and pushed to the GoIbibo GitHub. Anyone and everyone is welcome to contribute to it. 🙂

Link to Harshad’s GoCache Library:  https://github.com/goibibo/stash

Link to my library which is still under development: https://github.com/elitalobo/DistributedCache

My Internship Experience @ GoIbibo – Part 1

Hello everyone,

I spent this summer hacking away on some cool projects at GoIbibo. So here is a detailed description of my first project at GoIbibo!

Project 1:

Getting into the technical details of my project: the primary aim was to reduce the number of invalidations that occur when customers find that flight prices have changed while navigating from the search results page to the booking page. These invalidations increase drastically during sales, and it was my responsibility to come up with an intelligent algorithm that would handle sales and reduce the number of invalidations occurring every day!

Week 1:

My mentor Phani gave me the invalidation logs, consisting of millions of invalidations, and I began my research. I plotted graphs for each route to find relations between the various factors that might result in a sudden increase in invalidations, and tried to find a way to use this data to solve the problem. I had a discussion with Mr Neeraj Koul (Director of the Control team) on what I had derived from the graphs. He helped me figure out more factors which can also result in increased invalidations.

Week 2:

From the graphs, I was able to derive the following conclusions:

1. There is a nearly linear relation between the time left before the journey date and the time it takes for the data obtained from a new invalidation to get invalidated again.

2. The time taken for this data to expire is also inversely proportional to demand.

3. Demand generally increases on weekends, but one can sometimes observe demand increasing on weekdays as well.

4. There is no relation between the price change and the time taken for the new data to get invalidated.

I realised that this is purely a machine learning problem, and hence I started learning machine learning concepts from Andrew Ng’s Machine Learning course while simultaneously implementing the algorithms to predict the time a newly cached result would take to get invalidated. I began with linear regression curves. I used scipy (a Python scientific computing library) to determine the coefficients of various linear regression curves and checked the accuracy of the predictions. I implemented the gradient descent algorithm, which corrects the coefficients of the curves based on the mean squared error of the predicted values. I was able to achieve 80-90% accuracy using these algorithms. :/
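
The real code was in Python with scipy, but just to illustrate the idea of fitting a line by gradient descent, here is a tiny sketch in Go. Everything below (the toy data, the learning rate, the variable names) is made up for illustration and is not the production model:

```go
package main

import "fmt"

// fitLine fits expiry ≈ a*x + b by gradient descent on the mean squared
// error, where x stands for something like the time left before the journey
// date and y for how long the cached fare survived before being invalidated.
func fitLine(xs, ys []float64, lr float64, epochs int) (a, b float64) {
	n := float64(len(xs))
	for i := 0; i < epochs; i++ {
		var gradA, gradB float64
		for j := range xs {
			diff := a*xs[j] + b - ys[j]
			gradA += diff * xs[j]
			gradB += diff
		}
		// Step opposite to the gradient of the mean squared error.
		a -= lr * 2 * gradA / n
		b -= lr * 2 * gradB / n
	}
	return a, b
}

func main() {
	// Toy data: the further away the journey date, the longer the fare stays valid.
	daysLeft := []float64{1, 5, 10, 20, 30}
	expiryMinutes := []float64{4, 16, 31, 59, 92}
	a, b := fitLine(daysLeft, expiryMinutes, 0.001, 200000)
	fmt.Printf("expiry ≈ %.2f*daysLeft + %.2f minutes\n", a, b)
}
```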

Week 3:

Still not satisfied with the performance, I decided to dive into neural network concepts and see if I could improve it further. The first few days of the week went into understanding multi-layer neural networks and implementing them. Basically, neural networks are used to find a pattern between the attributes and the results when we are not sure about the exact relation between them. In this case we have several factors, like demand, which in turn depends on factors like the time before the journey date, whether it is a weekday or a weekend, etc., so neural networks seemed to be the best approach to this problem. I implemented single-layer and multi-layer neural networks with gradient descent for error correction. I tried varying the number of neurons and other constants like the learning rate, but in vain; the best I could achieve was 80% accuracy. The performance also kept changing from run to run since the weights between the neurons were initialised randomly. 😦

A few reasons why I chose not to go forward with the neural network are as follows:

1. Since the neural network is entirely responsible for generating a pattern from the past data, and the past data included sales data which couldn’t be filtered out, the neural network generated a wrong pattern most of the time.

2. It is difficult to correct the generated pattern based on the errors, because error correction requires a large amount of training data and takes a lot of time (around 30 minutes).

3. The neural network took too much time to learn, and it required storing 360 coefficients for each route-airline combination (we currently have 40,000 such combinations), which seemed to be an unnecessary waste of memory.

4. If the number of nodes in each layer is not optimal, errors can arise due to over-fitting or under-fitting of the curve. It is not feasible to figure out an optimal number of nodes in each layer for 40,000 curves which may differ from one another for various reasons.

5. The pattern generated from past data may differ from the current pattern due to reasons like a sudden change in demand, sales, etc. The neural network generates a curve without considering the fact that the time left before the journey date is directly proportional to the time it takes for new data to expire, and it may happen that the generated pattern does not follow this logic because of the errors introduced by sales.

Week 4:

With the help of a fellow employee, Akansha Verma, I started cross-checking these results with some machine learning tools which have all the standard algorithms built in (just in case there was some fault in my implementation of the machine learning algorithms) and found that there was not much difference between the results.

I discussed this with Neeraj Sir, and we finally came to the conclusion that we would use polynomial regression to learn from past data and divide the data into 3 or 4 categories to improve accuracy. We also devised an algorithm that handles sales based on current scenario analysis, independent of the machine learning part. Basically, we determine true_expiry (the time for which the data is to be stored in the cache) from the polynomial regression curves generated from past data, while logical expiry is determined from the time it took for the most recent data to expire. The expiry in the Redis cache is set with respect to true_expiry.

Whenever an invalidation occurs, we check whether the time taken for the most recent data to get invalidated is way less than the calculated value. If yes, the logical expiry is set to half the time it took the recent data to get invalidated; otherwise, the logical expiry is the same as true_expiry. A logical expiry that differs from true_expiry generally marks the beginning of a sale or a sudden increase in demand, so this ensures that sales are handled properly. When a cached entry is logically expired, we hit the API to get the new price. If the price is the same as the logically expired data in the cache, the logical expiry is reset to the new true_expiry calculated from the current time left before the journey date. So when a sale ends, the bot goes back to using what it has learnt from past data.

With this, we were able to achieve an accuracy of 95% and above, with no negative values of expiry time. If by any chance the above algorithm were to generate a negative expiry time, it recalculates using default values for the coefficients of the polynomial regression curves. Thus an intelligent CacheBot was built which considers past data analysis as well as current scenario analysis and intelligently determines the time for which new data is to be cached. 🙂
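
To make the true-expiry / logical-expiry dance a bit more concrete, here is a rough Go sketch of the decision logic as described above. All names, the “way less than” threshold and the default coefficients are made up for illustration; the production logic differs in detail:

```go
package cachebot

import "time"

// Observation captures what we know when a fresh price arrives.
type Observation struct {
	DaysToJourney    float64       // time left before the journey date
	LastInvalidation time.Duration // how long the previous cached fare survived
	PriceUnchanged   bool          // API price equals the logically expired cached price
}

// defaultCoeffs are assumed fallback coefficients for the sketch only.
var defaultCoeffs = []float64{5, 3}

// trueExpiry is the TTL predicted from past data, e.g. via the polynomial
// regression curves; coeffs are the learnt coefficients for one route-airline.
func trueExpiry(daysToJourney float64, coeffs []float64) time.Duration {
	minutes, pow := 0.0, 1.0
	for _, c := range coeffs {
		minutes += c * pow
		pow *= daysToJourney
	}
	if minutes <= 0 {
		// Negative prediction: recalculate with the default coefficients.
		return trueExpiry(daysToJourney, defaultCoeffs)
	}
	return time.Duration(minutes * float64(time.Minute))
}

// logicalExpiry blends the learnt TTL with the current scenario: if the most
// recent data died much sooner than predicted (a sale or a demand spike),
// cache for only half of that observed lifetime; once prices match the cached
// value again, trust the learnt TTL.
func logicalExpiry(obs Observation, coeffs []float64) time.Duration {
	learnt := trueExpiry(obs.DaysToJourney, coeffs)
	if obs.PriceUnchanged {
		return learnt // sale seems to be over; go back to past-data behaviour
	}
	// "way less than the calculated value" is approximated here as < half.
	if obs.LastInvalidation < learnt/2 {
		return obs.LastInvalidation / 2 // likely a sale or sudden change in demand
	}
	return learnt
}
```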

Week 5:

The whole week was spent testing the performance of the algorithm devised above and fixing bugs in the code. 🙂