How WhatsApp Scaled to 50 Billion Messages a Day With Just 32 Engineers?
Series: System Design Stories (Season 1, Episode 2)

12 October, 2023

WhatsApp is one of the most popular messaging apps in the world, with over 2 billion active users. It's hard to imagine that this massive platform was once run by just 32 engineers.

Key Engineering Techniques

So how did WhatsApp scale to handle 50 billion messages a day with such a small team? Here are a few of the key engineering techniques they used:

  • Single Responsibility Principle: WhatsApp applied this idea at the product level by focusing on one core feature, messaging. They did not build an advertising network or a social media platform, so all of their resources went into building the best messaging app possible.
  • Erlang Programming Language: Erlang was designed for highly concurrent, fault-tolerant systems. It was created at Ericsson for telecom switches and has since been used in other large-scale distributed systems, reportedly including parts of Amazon Web Services.
  • Open Source and Third-Party Services: WhatsApp leveraged open source and third-party services whenever possible. For example, they built their messaging platform on top of the Erlang-based XMPP server ejabberd, and they relied on Google's push service to deliver push notifications.
  • Horizontal Scaling: WhatsApp scaled horizontally by adding more servers to their pool. This allowed them to distribute the load across multiple machines.
  • Vertical Scaling: WhatsApp also scaled vertically by increasing the capacity of their existing servers. This included adding more CPU, memory, and storage.
  • Diagonal Scaling: WhatsApp combined horizontal and vertical scaling, a hybrid sometimes called "diagonal scaling": grow each server as far as is cost-effective, then add more servers. This allowed them to scale more efficiently and keep costs down.
  • Load Testing: WhatsApp regularly performed load testing to identify and fix any performance bottlenecks.
  • Monitoring: WhatsApp monitored their system closely to identify and resolve any problems quickly.
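
To make the horizontal scaling idea concrete, here is a minimal sketch of one common way to spread users across a pool of servers: hash each user ID and take the result modulo the pool size. This is an illustration under stated assumptions, not WhatsApp's actual routing logic. The names `pick_server` and `load_per_server`, the `chat-N` server names, and the user IDs are all hypothetical.

```python
import hashlib
from collections import Counter


def pick_server(user_id: str, servers: list[str]) -> str:
    """Map a user to one server using a stable hash of the user ID."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


def load_per_server(user_ids: list[str], servers: list[str]) -> Counter:
    """Count how many users land on each server."""
    return Counter(pick_server(uid, servers) for uid in user_ids)


# Hypothetical user IDs and server names, for illustration only.
users = [f"user-{n}" for n in range(10_000)]
small_pool = [f"chat-{n}" for n in range(4)]
large_pool = [f"chat-{n}" for n in range(8)]

small_load = load_per_server(users, small_pool)
large_load = load_per_server(users, large_pool)

# Doubling the pool roughly halves the busiest server's share.
print(max(small_load.values()), max(large_load.values()))
```

One caveat of plain modulo hashing is that most users move to a different server whenever the pool size changes. Consistent hashing is a common way to reduce that reshuffling.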

Other Engineering Techniques

In addition to the above techniques, WhatsApp also used a number of other engineering techniques to scale their system, including:

  • Asynchronous Programming: WhatsApp used asynchronous programming so that a task waiting on the network or a disk does not block other work. While one operation waits, the server keeps making progress on others, which raises the overall throughput of the system.
  • Caching: WhatsApp used caching to keep frequently accessed data in memory. This reduced the number of database queries that had to be made and lowered response times.
  • Compression: WhatsApp compressed data before sending it over the network. This reduced the bandwidth required to send and receive messages, which matters most on slow mobile connections.
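
The three techniques in this list can be sketched together in a few lines of Python using only the standard library. This is a toy illustration, not WhatsApp's code: `load_profile`, `pack`, `unpack`, `deliver`, and `deliver_all` are hypothetical names, the `asyncio.sleep` call stands in for real network I/O, and the cached function stands in for a database read.

```python
import asyncio
import zlib
from functools import lru_cache


@lru_cache(maxsize=1024)
def load_profile(user_id: str) -> str:
    """Stand-in for a database read; lru_cache keeps recent results in memory."""
    return f"profile-of-{user_id}"


def pack(text: str) -> bytes:
    """Compress a message body before it goes on the wire."""
    return zlib.compress(text.encode("utf-8"))


def unpack(payload: bytes) -> str:
    """Reverse of pack()."""
    return zlib.decompress(payload).decode("utf-8")


async def deliver(recipient: str, text: str) -> int:
    """Simulate one send and return the number of bytes transmitted."""
    payload = pack(text)
    load_profile(recipient)      # served from the cache on repeat lookups
    await asyncio.sleep(0.01)    # placeholder for waiting on the network
    return len(payload)


async def deliver_all(recipients: list[str], text: str) -> list[int]:
    """Start all sends together instead of waiting for each one in turn."""
    return await asyncio.gather(*(deliver(r, text) for r in recipients))


message = "hello " * 200
sizes = asyncio.run(deliver_all(["alice", "bob", "alice"], message))
print(sizes, load_profile.cache_info())
```

Because the three sends overlap while they wait, the batch takes about as long as one send rather than three. The second lookup for "alice" is a cache hit, and the repetitive message compresses to a small fraction of its original size.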

Engineering Culture

In addition to the technical factors mentioned above, WhatsApp's strong engineering culture also played a role in their ability to scale. WhatsApp engineers were focused on simplicity, reliability, and scalability. They also used continuous integration and continuous delivery to improve their software development process.

Simplicity: WhatsApp engineers emphasized simplicity in their design decisions. They believed that simple systems are more reliable and easier to scale. For example, WhatsApp's messaging protocol is deliberately simple, which makes it easy to implement and maintain.

Reliability: WhatsApp engineers were also focused on reliability. They knew that their system needed to be able to handle billions of messages a day without fail. To achieve this, they used a number of techniques, such as redundancy and load balancing.
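
As a rough illustration of how redundancy and load balancing work together, the sketch below rotates requests across several replicas and skips any replica that has been marked unhealthy. The `RoundRobinBalancer` class and the `node-a`, `node-b`, `node-c` names are hypothetical and are not taken from WhatsApp's infrastructure.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Rotate requests across replicas, skipping any marked unhealthy."""

    def __init__(self, replicas: list[str]) -> None:
        self.replicas = list(replicas)
        self.healthy = set(replicas)
        self._rotation = cycle(self.replicas)

    def mark_down(self, replica: str) -> None:
        """Take a replica out of rotation, for example after a failed health check."""
        self.healthy.discard(replica)

    def mark_up(self, replica: str) -> None:
        """Return a known replica to rotation."""
        if replica in self.replicas:
            self.healthy.add(replica)

    def next_replica(self) -> str:
        # Try each replica at most once per call.
        for _ in range(len(self.replicas)):
            candidate = next(self._rotation)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy replicas available")


# Hypothetical replica names, for illustration only.
balancer = RoundRobinBalancer(["node-a", "node-b", "node-c"])
first_round = [balancer.next_replica() for _ in range(3)]
balancer.mark_down("node-b")
after_failure = [balancer.next_replica() for _ in range(4)]
print(first_round, after_failure)
```

With three replicas, losing one still leaves two to share the traffic, which is the point of redundancy: a single failure reduces capacity without interrupting service.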

Scalability: WhatsApp engineers designed their system to be scalable from the start. They used a number of techniques, such as horizontal scaling and diagonal scaling, to allow their system to handle increased load.
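
A back-of-the-envelope calculation shows what these scaling choices mean in practice. 50 billion messages a day works out to roughly 579,000 messages per second on average. The sketch below estimates how many servers that would take under two made-up per-server capacities. The `servers_needed` helper, the `headroom` multiplier, and both capacity figures are assumptions chosen for illustration, not WhatsApp's real numbers.

```python
import math

MESSAGES_PER_DAY = 50_000_000_000  # figure from the article
SECONDS_PER_DAY = 24 * 60 * 60

average_rate = MESSAGES_PER_DAY / SECONDS_PER_DAY  # messages per second


def servers_needed(rate: float, per_server_capacity: float, headroom: float = 2.0) -> int:
    """Servers required to carry `rate`, with a safety multiplier for peaks."""
    return math.ceil(rate * headroom / per_server_capacity)


# Per-server capacities below are invented values, for illustration only.
small_boxes = servers_needed(average_rate, per_server_capacity=5_000)
large_boxes = servers_needed(average_rate, per_server_capacity=20_000)
print(round(average_rate), small_boxes, large_boxes)
```

Under these assumptions, making each server four times more capable cuts the fleet from 232 machines to 58. That is the trade-off diagonal scaling exploits: fewer, larger servers are easier for a small team to operate, as long as the larger hardware stays cost-effective.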

Continuous Integration and Continuous Delivery: WhatsApp engineers used continuous integration and continuous delivery (CI/CD) to improve their software development process. CI/CD is a set of practices that automates the building, testing, and deployment of software. This allows engineers to release new features and bug fixes more quickly and reliably.
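
The core idea of a CI/CD pipeline can be sketched as a list of stages that run in order, where a failure in any stage stops everything after it. This is a simplified model, not a description of WhatsApp's tooling. The `run_pipeline` function and the stage functions are hypothetical stand-ins for real build, test, and deploy commands.

```python
from typing import Callable

Stage = tuple[str, Callable[[], bool]]


def run_pipeline(stages: list[Stage]) -> list[str]:
    """Run stages in order and stop at the first failure.

    Returns the names of the stages that passed.
    """
    passed = []
    for name, step in stages:
        if not step():
            break
        passed.append(name)
    return passed


# Hypothetical stand-ins for real build, test, and deploy commands.
def build() -> bool:
    return True


def failing_tests() -> bool:
    return False


def passing_tests() -> bool:
    return True


def deploy() -> bool:
    return True


blocked = run_pipeline([("build", build), ("test", failing_tests), ("deploy", deploy)])
released = run_pipeline([("build", build), ("test", passing_tests), ("deploy", deploy)])
print(blocked, released)
```

In the first run the failing tests stop the pipeline before deployment, so broken code never reaches users. In the second run all three stages pass and the change ships.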

Conclusion

WhatsApp's ability to scale to 50 billion messages a day with just 32 engineers is an impressive achievement. It is a testament to the company's strong engineering culture and its focus on simplicity, reliability, and scalability.

Additional Lessons

Here are some additional lessons that can be learned from WhatsApp's engineering practices:

  • Focus on a single core feature. WhatsApp's success shows that it is better to focus on a single core feature and do it well than to try to build a feature-rich platform that is difficult to maintain and scale.
  • Use the right programming language. Erlang is well suited to building scalable, concurrent systems. If you are building a system that needs to handle a high volume of traffic, consider using Erlang or another language that is designed for concurrency.
  • Leverage open source and third-party services. Don't reinvent the wheel. There are many open source and third-party services available that can help you build and scale your system.