Pattern: Rate Limit

How can the API provider prevent API clients from excessive API usage?


The final version of this pattern is featured in our book Patterns for API Design: Simplifying Integration with Loosely Coupled Message Exchanges.

a.k.a. Quota, Usage Limitation

Context

An API endpoint and the API contract defining operations, messages, and data representations have been established. If required, an API Description has been defined that specifies message exchange patterns and protocols. Clients of the API might have signed up with the provider and, if required, have agreed to the terms and conditions that govern the usage of the endpoint and its operations. Alternatively, the offering might not require any contractual relationship, e.g., when offered as an open government data service or during a trial period.

Problem

How can the API provider prevent API clients from excessive API usage? [1]

Forces

When preventing excessive API usage that may harm provider operations or other clients, solutions to the following design issues have to be found:

  • Economic aspects
  • Performance
  • Reliability
  • Impact and severity of risks of API abuse
  • Client awareness

Pattern forces are explained in depth in the book.

Solution

Introduce and enforce a Rate Limit to safeguard against API clients that overuse the API.

Sketch

A solution sketch for this pattern from pre-book times is:

Figure 1: Rate Limit: Once the client exceeds the allowed number of requests per time period, all further requests are declined.
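The behavior described in the figure caption can be sketched as a fixed-window counter. This is a minimal illustration, not the implementation of any particular provider: the class name `FixedWindowRateLimit`, its parameters, and the injectable `clock` are assumptions made for this example, and a real service would also need per-client state and thread safety.

```python
import time


class FixedWindowRateLimit:
    """Allow at most `limit` requests per window of `window_seconds`."""

    def __init__(self, limit, window_seconds, clock=time.time):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the sketch can be tested
        self.window_start = clock()
        self.count = 0

    def allow(self):
        """Return True if the request is accepted, False if it is declined."""
        now = self.clock()
        if now - self.window_start >= self.window_seconds:
            # A new time period has begun: reset the counter.
            self.window_start = now
            self.count = 0
        if self.count >= self.limit:
            # Limit reached: decline (an HTTP API might answer with 429).
            return False
        self.count += 1
        return True


limiter = FixedWindowRateLimit(limit=3, window_seconds=60)
decisions = [limiter.allow() for _ in range(5)]
print(decisions)  # [True, True, True, False, False]
```

A fixed window is the simplest variant; it admits short bursts around window boundaries, which is one reason some providers prefer sliding windows or bucket-based schemes.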

Example

GitHub uses this pattern to control access to its RESTful HTTP API: once a Rate Limit is exceeded, subsequent requests are answered with HTTP status code 429 Too Many Requests. To inform clients about the current state of each Rate Limit and to help them manage their allowance of tokens, custom HTTP headers are sent with each rate-limited response.

The following code listing shows an excerpt of such a rate-limited response from the GitHub API. The API has a limit of 60 requests per hour, of which 59 remain:

GET https://api.github.com/users/misto       
HTTP/1.1 200 OK
...
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 59
X-RateLimit-Reset: 1498811560

The X-RateLimit-Reset header indicates, as a Unix timestamp [2], the time at which the limit will be reset.
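A client can use these headers to decide when to pause. The following sketch, under the assumption that the headers are available as a plain dictionary with exactly the names shown in the listing, converts the reset timestamp to a UTC time and computes how long to wait. The function name `read_rate_limit` and the returned keys are hypothetical.

```python
from datetime import datetime, timezone


def read_rate_limit(headers, now=None):
    """Extract the rate limit state from a mapping of response headers."""
    limit = int(headers["X-RateLimit-Limit"])
    remaining = int(headers["X-RateLimit-Remaining"])
    # The reset value is a Unix timestamp (seconds since 1970-01-01 UTC).
    reset_at = datetime.fromtimestamp(
        int(headers["X-RateLimit-Reset"]), tz=timezone.utc
    )
    now = now or datetime.now(timezone.utc)
    # Never report a negative wait if the reset time has already passed.
    seconds_until_reset = max(0.0, (reset_at - now).total_seconds())
    return {
        "limit": limit,
        "remaining": remaining,
        "reset_at": reset_at,
        "seconds_until_reset": seconds_until_reset,
    }


# Header values taken from the listing above.
state = read_rate_limit({
    "X-RateLimit-Limit": "60",
    "X-RateLimit-Remaining": "59",
    "X-RateLimit-Reset": "1498811560",
})
print(state["remaining"], state["reset_at"].isoformat())
```

A client that finds `remaining` at zero could sleep for `seconds_until_reset` before sending its next request instead of provoking declined requests.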

Are you missing implementation hints? Our publications provide them (for selected patterns).

Consequences

The resolution of pattern forces and other consequences are discussed in our book.

Known Uses

Rate Limits are implemented in many public Web APIs:

  • The GitHub API v3 limits authenticated requests to 5000 per hour per user. Clients can also make unauthenticated requests, but these are limited to just 60 requests per hour (as can be seen in the example above). In the newer GraphQL-based GitHub v4 API, the Rate Limit has become more sophisticated and takes into account the number of queried nodes.
  • Open Weather Map calls its Rate Limits "access limitation" and restricts clients to a certain number of calls per minute, depending on the subscription.
  • Rate Limits in Quandl depend on the subscription level; Quandl also limits the number of concurrent requests.
  • The Twitter REST API only allows authenticated clients and has Rate Limits divided into 15-minute intervals.
  • The Swiss Federal Administration’s registry of companies (“UID-Register”) has a public webservice API. The API is free to use but is limited to 20 requests per minute. If the limit is exceeded, a Request_limit_exceeded error is returned.
  • The Opendata.ch Transport API and the timetable.search.ch API (https://timetable.search.ch/api/help) use the pattern too.
  • Many API Gateways, such as MuleSoft API Manager, allow developers to introduce Rate Limits. API gateways often also support throttling to further protect the exposed APIs.
  • The open Certificate Authority (CA) Let’s Encrypt limits the weekly number of certificates issued per registered domain, but also provides a renewal exemption. Its Automatic Certificate Management Environment (ACME) API also limits the number of accounts that can be registered by a given IP address every hour.

Some Web frameworks provide Rate Limit as an optional feature. For example, the Play-Guard library for the Java/Scala Play Framework provides a basic implementation.

More Information

Related Patterns

The details of a Rate Limit can be part of a Service Level Agreement. A Rate Limit can be dependent on the client’s subscription level, which is further described in the Pricing Plan pattern. In such cases the Rate Limit is used to enforce different billing levels of the Pricing Plan.

To observe individual clients and manage their allowances, the service provider needs to identify the client making a request. Therefore, clients need to present some form of identification (e.g., an API Key, an IP address, or another authentication practice) so that the API provider can do the bookkeeping.
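The bookkeeping mentioned above might look as follows. This is a sketch under stated assumptions: the header name `X-API-Key`, the function `client_id`, and the class `ClientLedger` are hypothetical, and resetting the counters at the end of each time period is omitted for brevity.

```python
from collections import defaultdict


def client_id(headers, remote_addr):
    """Prefer an API key; fall back to the IP address for anonymous clients."""
    api_key = headers.get("X-API-Key")  # hypothetical header name
    return f"key:{api_key}" if api_key else f"ip:{remote_addr}"


class ClientLedger:
    """Counts requests per identified client within one time period."""

    def __init__(self, limit):
        self.limit = limit
        self.used = defaultdict(int)  # client id -> requests used so far

    def register(self, client):
        """Record a request; return True if it is within the client's limit."""
        if self.used[client] >= self.limit:
            return False
        self.used[client] += 1
        return True

    def remaining(self, client):
        """Return how many requests the client may still send."""
        return self.limit - self.used[client]


ledger = ClientLedger(limit=2)
alice = client_id({"X-API-Key": "abc123"}, "203.0.113.7")
anonymous = client_id({}, "203.0.113.7")

print(ledger.register(alice), ledger.register(alice), ledger.register(alice))
print(ledger.register(anonymous), ledger.remaining(anonymous))
```

Because the two callers are identified differently, the anonymous client keeps its own allowance even though both requests arrive from the same IP address.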

A Wish List and a Wish Template can help to ensure that data-bound Rate Limits are not violated.

The current state of the Rate Limit, e.g., how many requests remain in the current billing period, can be communicated via a Context Representation.

The systems management patterns published by Hohpe and Woolf (2003) can help to implement metering and can thus also be used as enforcement points. For example, a Control Bus can be used to increase or decrease certain limits dynamically at runtime.

Leaky Bucket Counter (Hanmer 2007) offers a possible implementation variant for Rate Limit.
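One common reading of the leaky bucket idea is sketched below: every accepted request raises a counter, the counter drains at a constant rate, and requests are declined while the bucket is full. This is an illustrative interpretation rather than Hanmer's exact formulation; the class name, the parameters `capacity` and `leak_per_second`, and the injectable `clock` are assumptions made for this example.

```python
import time


class LeakyBucketCounter:
    """Decline requests while the bucket level would exceed `capacity`."""

    def __init__(self, capacity, leak_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.leak_per_second = leak_per_second
        self.clock = clock  # injectable so the sketch can be tested
        self.level = 0.0
        self.last = clock()

    def allow(self):
        """Return True if the request is accepted, False if it is declined."""
        now = self.clock()
        # Drain the bucket for the time elapsed since the previous call.
        leaked = (now - self.last) * self.leak_per_second
        self.level = max(0.0, self.level - leaked)
        self.last = now
        if self.level + 1 > self.capacity:
            return False
        self.level += 1
        return True


# Simulated clock so the behavior does not depend on real time.
fake_time = [0.0]
bucket = LeakyBucketCounter(
    capacity=2, leak_per_second=1.0, clock=lambda: fake_time[0]
)
burst = [bucket.allow() for _ in range(3)]
fake_time[0] = 1.0  # one second later, one unit has leaked out
after_pause = bucket.allow()
print(burst, after_pause)  # [True, True, False] True
```

Compared with a fixed window, this variant smooths the allowed rate over time: a client that pauses regains capacity gradually instead of all at once at the window boundary.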

References

Hanmer, Robert. 2007. Patterns for Fault Tolerant Software. Wiley.
Hohpe, Gregor, and Bobby Woolf. 2003. Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions. Addison-Wesley.
Mirkovic, Jelena, and Peter Reiher. 2004. “A Taxonomy of DDoS Attack and DDoS Defense Mechanisms.” ACM SIGCOMM Computer Communication Review 34 (2): 39–53.


  1. What exactly is deemed excessive needs to be defined by the API provider. A flat-rate subscription typically imposes different limitations than a free billing plan. See the Pricing Plan pattern for a detailed discussion of the trade-offs of different subscription models.

  2. Unix timestamps count the number of seconds since January 1, 1970, 00:00:00 UTC.