So when a web service is getting too much traffic it starts returning the 503 status code. Well written services also return the Retry-After header hinting the client when it should come back again. Good behaving clients then respect that or will back-off by themselves to make sure the server is not getting too much traffic. However this is not enough if there are bad behaving clients in the mix. And how do you identify the bad behaving clients?
Well historically people typically try to identify bad behaving clients based on IP or even identity of the caller. And doing it based on identity might be expensive for the server. There is however a technique that has been around for at least 15 years but is rarely used; client puzzles.
I client puzzle is a type of proof-of-work where the server gives the client a puzzle and refuses to serve the client before the puzzle is solved. The way it works in practice is that under normal operation the server will not require a client puzzle to be solved. But once the server detects a suspected DoS attack it starts returning 503s to the clients with a client puzzle. A client puzzle is typically something that is CPU heavy and takes a significant amount of time to solve but that is easy to verify for the server.
The simplest form of client puzzle is a hash inversion. The server picks a secret S of length K bits and hashes the secret together with random salt. The hash, salt and the length of the secret is given to the client as a challenge. The client then must find what secret produces the given hash which results in on average 2^(K-1) hash calculations. Once the client finds the secret it is presented to the server.
Any bad behaving clients can easily be identified since they are either not presenting the solution in a timely manner nor are they presenting the correct answer. So even if a malicious client is trying to guess the secret the server should be able to easily and cheaply refuse access. Especially if each failed attempt results in a new puzzle being required and potentially an even harder puzzle (a larger K).
Naturally there is an interesting twist if we are considering the service to be a distributed service. Since this is a counter measure for DoS attacks we don't want to keep connections open while the puzzle is being solved so we have no guarantee that solution is reaching the same server as the one initiating the request. I think there are two practical ways to solve this.
Either you keep solutions for issued requests in a distributed in-memory cache. Yes it means each guess needs to be verified most likely with a network hop, but it might be worth it. The other option is that a number of secrets are distributed to all servers and they each pick a secret from this collection randomly. This is a variant of the one-time-pad and will ensure each secret is only verified one per server at the expense of some clients having to redo the puzzle because the server they talk to considers the secret used. This can however be minimized by having a large enough collection of secrets that is replaced often enough that it is replaced when it is far from depleted.
Maybe it is the problems involved in distributed systems that never made client puzzles more popular... What do you think?