Rate Limiting

A rate limit is the maximum rate at which a requester can make requests of a service, i.e., how many requests can be made per unit of time. Requests rejected because of a rate limit get a 429 (Too Many Requests) response.
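One common way to enforce a limit like this is a token bucket. The sketch below is only an illustration, not the only approach: the names `TokenBucket` and `handle_request` are made up for this note, and in practice you would likely keep one bucket per requester (keyed by API key or IP).

```python
import time


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: int, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill in proportion to elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def handle_request(bucket: TokenBucket) -> int:
    """Return an HTTP status code: 200 if admitted, 429 if rate limited."""
    return 200 if bucket.allow() else 429
```

With `rate=1.0` and `capacity=2`, a requester can burst two requests, then gets 429 until roughly a second has passed.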

Note

DDoS attacks send requests from numerous clients, which rules out simple fixes like blocking the IP of the offending system. Given enough requests, even invalid ones, an attacker can overwhelm a service.

In general, if every request to your service costs you, you need a way of controlling that cost.

Sequential IDs

Avoid sequential IDs for things like posts, for security reasons. Someone could enumerate the IDs and download a whole range of content.
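A minimal sketch of the alternative, using random identifiers from the Python standard library. The function names `new_post_id` and `new_share_token` are hypothetical. Random IDs only make guessing impractical; they are not a replacement for checking that the requester is allowed to see the resource.

```python
import secrets
import uuid


def new_post_id() -> str:
    """Random (version 4) UUID: knowing one ID reveals nothing about its neighbours."""
    return str(uuid.uuid4())


def new_share_token(nbytes: int = 16) -> str:
    """URL-safe random token built from `nbytes` bytes of OS-provided randomness."""
    return secrets.token_urlsafe(nbytes)
```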

Dealing with Limits

If a critical part of your system (e.g., a 3rd party API) is being rate limited, we need a way to deal with or work around the limit.

Do less work: Reduce the number of API calls we need to make. There may be redundant calls you can eliminate.
Caching: Cache responses from an expensive API call and reuse them instead of making another call, as long as the response is still valid. Domain knowledge is likely needed to decide when to invalidate.
Group Up: Group requests into one larger request. Instead of 5 separate requests getting different things, get them all in one request if possible. This is easier for updates. We might also be able to group unrelated requests, or keep a buffer of requests ready to go. Grouping makes error handling harder, though. What if there is an issue with one element in the request buffer?
Patience: Wait for the rate limit to reset. Distribute your requests over time, which ultimately means delaying some of them, to stay under the rate limit.
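As a rough illustration of the caching option above, here is a small time-based cache wrapped around an expensive call. `TTLCache` and its `fetch` argument are hypothetical names, and a fixed TTL is only the simplest invalidation rule; real invalidation usually depends on domain knowledge.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Reuse a fetched value for `ttl_seconds` before calling `fetch` again."""

    def __init__(self, fetch: Callable[[Hashable], Any], ttl_seconds: float,
                 clock=time.monotonic):
        self.fetch = fetch  # the expensive, rate-limited call (assumed to take one key)
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get(self, key: Hashable) -> Any:
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]  # still fresh: no API call is made
        value = self.fetch(key)
        self._entries[key] = (now, value)
        return value
```

Every cache hit is one call that does not count against the rate limit, at the cost of possibly serving a value that is up to `ttl_seconds` old.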

Outside of a computing context, we deal with rate limits the same way we deal with anything where demand exceeds capacity: we queue, or line up. If all requests go through a queue, the queue is a natural place to control flow and the rate of outgoing requests. Enqueueing a request is not always optimal, because it turns synchronous flows into asynchronous ones, which may not suit some requests. If requests are not urgent, we can always reschedule them for a less busy time.
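A minimal sketch of the queue idea, assuming the limit can be approximated as "at most one call every `min_interval` seconds". `ThrottledQueue` is a made-up name, and a real system would more likely use a background worker or a job queue than a blocking `drain` loop.

```python
import time
from collections import deque
from typing import Callable, Deque, List, Optional


class ThrottledQueue:
    """Hold outgoing calls and release them no faster than one per `min_interval` seconds."""

    def __init__(self, min_interval: float, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self.clock = clock
        self.sleep = sleep
        self._pending: Deque[Callable[[], object]] = deque()
        self._last_sent: Optional[float] = None

    def enqueue(self, call: Callable[[], object]) -> None:
        # The caller gets no result here: the flow has become asynchronous.
        self._pending.append(call)

    def drain(self) -> List[object]:
        """Run all pending calls in FIFO order, spacing them out over time."""
        results = []
        while self._pending:
            if self._last_sent is not None:
                wait = self.min_interval - (self.clock() - self._last_sent)
                if wait > 0:
                    self.sleep(wait)
            call = self._pending.popleft()
            self._last_sent = self.clock()
            results.append(call())
        return results
```

Because `enqueue` returns immediately and results only appear after `drain`, this shows the trade-off described above: the outgoing rate is easy to control, but callers can no longer treat the request as synchronous.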

Roll Persuasion: Get the provider to increase the rate limit.

Exponential backoff also works here: after each rate-limited response, wait progressively longer before retrying.
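A sketch of exponential backoff with jitter. The `RateLimited` exception is hypothetical and stands in for however your client reports a 429; the default `retries`, `base`, and `cap` values are arbitrary illustrations, not recommendations from any provider.

```python
import random
import time
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical error: raise this when the API answers with a 429."""


def backoff_delays(base: float, cap: float, attempts: int) -> Iterator[float]:
    """Yield the delay ceiling per retry: base, 2*base, 4*base, ... capped at `cap`."""
    for attempt in range(attempts):
        yield min(cap, base * (2 ** attempt))


def call_with_backoff(call: Callable[[], T], retries: int = 5, base: float = 0.5,
                      cap: float = 30.0, sleep=time.sleep, rng=random.random) -> T:
    for ceiling in backoff_delays(base, cap, retries):
        try:
            return call()
        except RateLimited:
            # Jitter: sleep a random fraction of the ceiling so that many
            # clients do not all retry at the same instant.
            sleep(rng() * ceiling)
    return call()  # final attempt; a RateLimited error here propagates to the caller
```

If the provider sends a Retry-After header with its 429, honoring that value is usually a better choice than a computed delay.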

🤖 Dan Huynh
