Add wait time between embedding API call and completion API call to avoid hitting rate limit

k31thchan · August 22, 2024, 10:15am

After going through the setup steps in the tutorial, I tried out the chat with “rasa inspect”. However, I can’t get the response I expect in the web UI; instead, the following log appears in the terminal:

2024-08-22 18:07:45 INFO     openai  - message='OpenAI API response' path=https://my-id.azure.com/openai/deployments/text-embedding-ada-002/embeddings?api-version=2024-02-15-preview processing_ms=None request_id= response_code=200
2024-08-22 18:07:45 INFO     openai  - message='OpenAI API response' path=my-id.openai.azure.com/openai/deployments/gpt-4o/chat/completions?api-version=2024-02-15-preview processing_ms=None request_id=None response_code=429
2024-08-22 18:07:45 INFO     openai  - error_code=429 error_message='Requests to the ChatCompletions_Create Operation under Azure OpenAI API version 2024-02-15-preview have exceeded token rate limit of your current OpenAI S0 pricing tier. Please retry after 86400 seconds. Please go here: https://aka.ms/oai/quotaincrease if you would like to further increase the default rate limit.' error_param=None error_type=None message='OpenAI API error received' stream_error=False

Has anyone else hit this issue? Is it expected? Do you have a solution? I suspect the time between the embedding call and the completion API call is too short. Is there any way to add a wait between these calls so that the system does not trigger this error?
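
To make the idea concrete, here is a rough sketch of the kind of wait-and-retry behaviour I have in mind. It is plain Python, not Rasa or OpenAI client code: `RateLimitError`, `call_with_wait`, and all of their parameters are hypothetical names made up for illustration, and I do not know whether Rasa exposes a hook where something like this could be plugged in. Note also that the log above says "retry after 86400 seconds", so a short pause between the two calls may not be enough on its own; the sketch gives up when the suggested wait is longer than `max_wait`.

```python
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for a 429 error raised by an API client."""

    def __init__(self, retry_after=None):
        super().__init__("rate limit exceeded")
        # Seconds the server asked us to wait, if it said so.
        self.retry_after = retry_after


def call_with_wait(func, *, max_retries=3, base_delay=1.0, max_wait=60.0,
                   sleep=time.sleep):
    """Call func(), pausing and retrying when it raises RateLimitError.

    Uses the server-suggested wait when available, otherwise exponential
    backoff (base_delay, 2 * base_delay, 4 * base_delay, ...). Re-raises
    when retries run out or the required wait exceeds max_wait.
    """
    for attempt in range(max_retries + 1):
        try:
            return func()
        except RateLimitError as err:
            if err.retry_after is not None:
                delay = err.retry_after
            else:
                delay = base_delay * 2 ** attempt
            # Waiting a whole day (86400 s) inside a chat turn is pointless,
            # so surface the error instead of sleeping.
            if attempt == max_retries or delay > max_wait:
                raise
            sleep(delay)


if __name__ == "__main__":
    attempts = []

    def fake_completion():
        # Simulated completion call: rate-limited twice, then succeeds.
        attempts.append(1)
        if len(attempts) < 3:
            raise RateLimitError()
        return "ok"

    print(call_with_wait(fake_completion, base_delay=0.1))
```

With a wrapper like this, the completion call would be retried after a pause instead of failing immediately, while a quota error such as the 86400-second one above would still be raised right away.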
