Lean like lichess

When you think about scaling an application, it's easy to overcomplicate things and build 20 services.

But there are examples of well-crafted monoliths that serve millions of users, with tens of thousands active at any given moment.

Lichess (an open source competitor to chess.com) is mostly written by just one person, Thibault Duplessis (50k commits).

The core, called lila, is a single application that handles game logic and user interactions.

It also runs on just one beefy server.

Until a few years ago, that one server also held every open websocket connection for those millions of users, until the CPU cost of handling so many connections became a bottleneck.

Now those websockets are spread out over many nginx servers that communicate with the lila engine through redis.
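To make that split concrete, here is a minimal sketch of the pattern, not Lichess's actual code. The in-memory `Broker` stands in for Redis pub/sub, and the class names and the channel names `to-core` and `from-core` are made up for illustration. The front-ends only hold connections and relay messages, while one core keeps all the game state.

```python
from collections import defaultdict


class Broker:
    """In-memory stand-in for Redis pub/sub (assumption: the real setup uses Redis)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, channel, callback):
        self._subscribers[channel].append(callback)

    def publish(self, channel, message):
        for callback in list(self._subscribers[channel]):
            callback(message)


class GameCore:
    """The single monolith: owns all game state and logic."""

    def __init__(self, broker):
        self.broker = broker
        self.moves = defaultdict(list)  # game id -> moves played so far
        broker.subscribe("to-core", self.on_move)  # hypothetical channel name

    def on_move(self, msg):
        game = msg["game"]
        self.moves[game].append(msg["move"])
        # Broadcast a copy of the new state to every front-end.
        self.broker.publish("from-core", {"game": game, "moves": list(self.moves[game])})


class SocketFrontend:
    """One of many front-end servers: holds connections, knows no game rules."""

    def __init__(self, name, broker):
        self.name = name
        self.broker = broker
        self.clients = defaultdict(list)  # game id -> inboxes of connected clients
        broker.subscribe("from-core", self.on_update)

    def connect(self, game):
        inbox = []  # stands in for an open websocket
        self.clients[game].append(inbox)
        return inbox

    def client_move(self, game, move):
        # Relay to the core instead of handling the move locally.
        self.broker.publish("to-core", {"game": game, "move": move})

    def on_update(self, msg):
        for inbox in self.clients.get(msg["game"], []):
            inbox.append(msg["moves"])


# Two players connected to different front-ends still share one game.
broker = Broker()
core = GameCore(broker)
ws1 = SocketFrontend("ws-1", broker)
ws2 = SocketFrontend("ws-2", broker)
white = ws1.connect("g1")
black = ws2.connect("g1")
ws1.client_move("g1", "e4")
ws2.client_move("g1", "e5")
```

The point of the design is that you can add more front-ends when connection handling gets expensive, without touching the core or splitting up its state.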

And even though caching, load balancing, database and security, in addition to the single game server, now add up to about 25 servers, they didn't scale to that for the first 10,000 users.

It took almost a decade, and each optimization was made only once a bottleneck appeared.

For me, stories like this are an inspiration to build apps that are simple, elegant and fast, and to scale them gradually.

Scale when you have to, not before. That's easy to say, but harder to do.

The inspiration for this came from a talk I watched:

Lichess @ Big Techday 22: Serving 5 Million Chess Games a Day with 125 Volunteers and €5 Donations

Yours,

Taj