How to avoid openAI rate limit errors, Retrying with exponential backoff

One easy way to avoid rate limit errors when using openai chatGPT or GPT4 api calls, or any API clals, is to automatically retry requests with a random exponential backoff. Retrying with exponential backoff means performing a short sleep when a rate limit error is hit, then retrying the unsuccessful request. If the request is still unsuccessful, the sleep length is increased and the process is repeated. This continues until the request is successful or until a maximum number of retries is reached.

This approach has many benefits:

Automatic retries means you can recover from rate limit errors without crashes or missing data
Exponential backoff means that your first retries can be tried quickly, while still benefiting from longer delays if your first few retries fail
Adding random jitter to the delay helps retries from all hitting at the same time
Note that unsuccessful requests contribute to your per-minute limit, so continuously resending a request won’t work.

Below are a few example solutions.

Example #1: Using the Tenacity library

Tenacity is an Apache 2.0 licensed general-purpose retrying library, written in Python, to simplify the task of adding retry behavior to just about anything.

To add exponential backoff to your requests, you can use the tenacity.retry decorator. The following example uses the tenacity.wait_random_exponential function to add random exponential backoff to a request.

import openai  # for OpenAI API calls
from tenacity import (
) # for exponential backoff

@retry(wait=wait_random_exponential(min=1, max=60), stop=stop_after_attempt(6))
def completion_with_backoff(**kwargs):
return openai.Completion.create(**kwargs)

completion_with_backoff(model="text-davinci-002", prompt="Once upon a time,")

Example #2: Using the backoff library

Another library that provides function decorators for backoff and retry is backoff.

import backoff  # for exponential backoff
import openai # for OpenAI API calls

@backoff.on_exception(backoff.expo, openai.error.RateLimitError)
def completions_with_backoff(**kwargs):
return openai.Completion.create(**kwargs)

completions_with_backoff(model="text-davinci-002", prompt="Once upon a time,")

more reference with OpenAI

openai example

Author: robot learner
Reprint policy: All articles in this blog are used except for special statements CC BY 4.0 reprint policy. If reproduced, please indicate source robot learner !