Hi everyone, this is a continuation of my previous post. In this post, I will give you a brief overview of my second project, which was written in Go!
The goal of this project was to write a distributed cache library in Go that stores data as key-value pairs in the memory of each node, thereby reducing the latency generally encountered with Memcached or Redis. We chose Go mainly for its excellent concurrency features and its channels, which are very useful for reducing delays in networked code. I was mentored by Mr Jyotiswarup Raiturkar, the Head Architect of GoIbibo, along with Neeraj Kaur and Harshad Patil.
I started by learning Go and networking in Go from various resources. I was then assigned my first task: implementing a basic peer-to-peer network with the following functions.
1. A Set function, which stores a key-value pair in the node’s own database and also sends the pair to its peers.
2. A Get function, which retrieves the value of the specified key if it exists in the node’s database.
3. A Delete function, which deletes a key from the node’s database and tells its peers to delete the same key.
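The three functions above can be sketched roughly as follows. This is only an illustration under simplifying assumptions: the store is a plain in-memory map, peers live in the same process so "sending" is a direct call, and the names `Node`, `Message`, `Op`, `AddPeer`, `apply` and `broadcast` are hypothetical rather than taken from the actual library.

```go
package main

import "sync"

// Op identifies the kind of change being replicated to peers.
type Op int

const (
	OpSet Op = iota
	OpDelete
)

// Message is a hypothetical replication message; the real wire format may differ.
type Message struct {
	Op    Op
	Key   string
	Value string
}

// Node holds a local key-value store and the peers it replicates to.
type Node struct {
	mu    sync.RWMutex
	store map[string]string
	peers []*Node
}

// NewNode returns a node with an empty store and no peers.
func NewNode() *Node {
	return &Node{store: make(map[string]string)}
}

// AddPeer registers a peer that will receive this node's writes and deletes.
func (n *Node) AddPeer(p *Node) {
	n.mu.Lock()
	defer n.mu.Unlock()
	n.peers = append(n.peers, p)
}

// Set stores the pair locally, then forwards it to every peer.
func (n *Node) Set(key, value string) {
	m := Message{Op: OpSet, Key: key, Value: value}
	n.apply(m)
	n.broadcast(m)
}

// Get reads only the local store; it never contacts peers.
func (n *Node) Get(key string) (string, bool) {
	n.mu.RLock()
	defer n.mu.RUnlock()
	v, ok := n.store[key]
	return v, ok
}

// Delete removes the key locally, then tells every peer to do the same.
func (n *Node) Delete(key string) {
	m := Message{Op: OpDelete, Key: key}
	n.apply(m)
	n.broadcast(m)
}

// apply executes a message against the local store without re-broadcasting,
// so a replicated change does not bounce between peers forever.
func (n *Node) apply(m Message) {
	n.mu.Lock()
	defer n.mu.Unlock()
	switch m.Op {
	case OpSet:
		n.store[m.Key] = m.Value
	case OpDelete:
		delete(n.store, m.Key)
	}
}

// broadcast delivers the message to each peer. In this sketch delivery is a
// direct in-process call; in a real network it would be a write to a socket.
func (n *Node) broadcast(m Message) {
	n.mu.RLock()
	peers := append([]*Node(nil), n.peers...)
	n.mu.RUnlock()
	for _, p := range peers {
		p.apply(m)
	}
}
```

With two nodes registered as each other's peers, a `Set` on one becomes visible to a `Get` on the other, and a `Delete` on either removes the key from both.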
I implemented the peer-to-peer network with the above functionality. Akansha Gupta wrote an API for a Ledisdb-Goleveldb cache, which I used as the storage for each node.
I then implemented a BusSocket for each node, which can listen for peer nodes, dial peer nodes, receive messages from peers and evaluate them one at a time, and send messages to peers concurrently.
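To make the idea concrete, here is a much-simplified sketch of such a socket using Go's standard `net` package. It is not the project's actual BusSocket: the newline-delimited string messages, the `Listen`, `Dial`, `Send` and `Addr` names, and the single `inbox` channel are all assumptions made for illustration. Incoming messages from every connection are funnelled into one channel and handled by a single goroutine (one at a time), while outgoing messages are written to all dialled peers concurrently.

```go
package main

import (
	"bufio"
	"net"
	"sync"
)

// BusSocket is a hypothetical, simplified stand-in for the socket described above.
type BusSocket struct {
	ln    net.Listener
	inbox chan string // every incoming message is funnelled through here

	mu    sync.Mutex
	peers []net.Conn // outgoing connections created by Dial
}

// Listen starts accepting peer connections on addr. The handle function is
// called from a single goroutine, so messages are evaluated one at a time.
func Listen(addr string, handle func(msg string)) (*BusSocket, error) {
	ln, err := net.Listen("tcp", addr)
	if err != nil {
		return nil, err
	}
	b := &BusSocket{ln: ln, inbox: make(chan string, 1024)}
	go b.acceptLoop()
	go func() {
		for msg := range b.inbox {
			handle(msg)
		}
	}()
	return b, nil
}

// Addr reports the address the socket is listening on.
func (b *BusSocket) Addr() string { return b.ln.Addr().String() }

// acceptLoop starts one reader goroutine per incoming connection.
func (b *BusSocket) acceptLoop() {
	for {
		conn, err := b.ln.Accept()
		if err != nil {
			return // listener was closed
		}
		go b.readLoop(conn)
	}
}

// readLoop reads newline-delimited messages until the peer disconnects.
func (b *BusSocket) readLoop(conn net.Conn) {
	defer conn.Close()
	sc := bufio.NewScanner(conn)
	for sc.Scan() {
		b.inbox <- sc.Text()
	}
}

// Dial opens an outgoing connection to a peer and remembers it for Send.
func (b *BusSocket) Dial(addr string) error {
	conn, err := net.Dial("tcp", addr)
	if err != nil {
		return err
	}
	b.mu.Lock()
	b.peers = append(b.peers, conn)
	b.mu.Unlock()
	return nil
}

// Send writes msg to every dialled peer concurrently and waits for the
// writes to finish. Write errors are ignored in this sketch.
func (b *BusSocket) Send(msg string) {
	b.mu.Lock()
	peers := append([]net.Conn(nil), b.peers...)
	b.mu.Unlock()

	var wg sync.WaitGroup
	for _, c := range peers {
		wg.Add(1)
		go func(c net.Conn) {
			defer wg.Done()
			c.Write([]byte(msg + "\n"))
		}(c)
	}
	wg.Wait()
}
```

Funnelling all readers into one channel keeps the evaluation step free of locks, at the cost of making a slow handler a bottleneck for every peer.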
I then optimized and encapsulated the code. I implemented automatic redialling: when a connection is lost and EOF is received on it, the node keeps redialling until the connection is re-established. Each node sends an acknowledgement to its peer after evaluating a received message. If a node does not receive an acknowledgement for a message it sent within the timeout period, it breaks the connection and redials the peer. On testing the code, we found that the network could simultaneously send and receive 1 million messages to and from its peers in 3-4 minutes. 🙂 Harshad implemented a stale-data error acknowledgement, which notifies the sender node that the data it sent is stale and updates the sender node’s database with the most recent data.
After returning to my college:
I handled the case where a node goes down for a while and its database may be outdated by the time it becomes active again. My mentor Harshad had implemented mqueues, which store the latest m messages to be sent to a peer (the value of m can be configured by the user) in a queue unique to that peer while the peer is down. When the peer is up again, the node dials it and starts sending it the messages from the queue. One major limitation of the mqueue was its size: older messages were lost once the size of the queue exceeded m, its maximum size. I eliminated this limitation by maintaining a cache for each peer of the node. Each time the queue exceeds its maximum size, the older data is flushed into the cache maintained for that particular peer. We also set an expiry time on the messages flushed to the cache, so that messages more than 1 week old automatically expire. We used goleveldb as the storage because it allows concurrent reads and writes, which makes it very fast. When the peer comes up again and dials the node, the node iterates over all the messages in the cache (the messages the peer has missed) and sends them to the peer. It then sends the messages from the queue, if any, to the peer.
My mentor Harshad has done a great job reworking the code, optimizing it and encapsulating it. The code has now been made open source and pushed to the GoIbibo GitHub. Anyone and everyone is welcome to contribute to it. 🙂
Link to Harshad’s GoCache Library: https://github.com/goibibo/stash
Link to my library which is still under development: https://github.com/elitalobo/DistributedCache