Re: DataBricks Foundational model rate limiting approach

llmnerd · 11-12-2024

Hi there,

Is this the correct approach to stay within the rate limits of the Foundation Model APIs?

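from langchain_core.rate_limiters import InMemoryRateLimiter
# Assumption: ChatDatabricks comes from the databricks-langchain package;
# older setups import it from langchain_databricks instead.
from databricks_langchain import ChatDatabricks

# Token-bucket limiter: about 2 requests per second on average.
rate_limiter = InMemoryRateLimiter(
    requests_per_second=2.0,
    check_every_n_seconds=0.5,  # how often a blocked call re-checks for a free token
    max_bucket_size=10,  # maximum burst size
)

# model, temperature and max_tokens are defined elsewhere in the calling code.
chat_model = ChatDatabricks(
    endpoint=model,
    temperature=temperature,
    max_tokens=max_tokens,
    rate_limiter=rate_limiter,
)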

Alberto_Umana · 11-12-2024

Hello @llmnerd,

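Yes, the approach you have outlined is correct. InMemoryRateLimiter from langchain_core is a token-bucket limiter, and passing it as rate_limiter makes the chat model wait for a free token before each request. With your settings, calls average at most 2 requests per second, bursts are capped at 10 requests, and a blocked call re-checks for a token every 0.5 seconds. Two caveats: the limiter counts requests only, not tokens or payload size, and it lives in memory, so it does not coordinate across separate processes. Keep requests_per_second at or below the query limit of your endpoint. If you have any specific requirements or encounter any issues, please let us know.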
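
To make the token-bucket idea concrete, here is a minimal standard-library sketch. It is not LangChain's actual implementation: the TokenBucket class, its try_acquire/acquire methods, and the injectable clock and sleep arguments are all hypothetical names chosen for illustration, and starting with an empty bucket is an assumption of this sketch. Only the parameter names are borrowed from the snippet above.

```python
import time


class TokenBucket:
    """Illustrative token bucket (hypothetical class, not the LangChain one)."""

    def __init__(self, requests_per_second, max_bucket_size, clock=time.monotonic):
        self.rate = requests_per_second   # tokens added per second
        self.capacity = max_bucket_size   # upper bound on stored tokens (burst size)
        self.clock = clock                # injectable so the demo is deterministic
        self.tokens = 0.0                 # assumption: the bucket starts empty
        self.last = None                  # time of the previous refill

    def try_acquire(self):
        """Refill based on elapsed time, then take one token if available."""
        now = self.clock()
        if self.last is None:
            self.last = now
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

    def acquire(self, check_every_n_seconds=0.5, sleep=time.sleep):
        """Block until a token is available, polling at a fixed interval."""
        while not self.try_acquire():
            sleep(check_every_n_seconds)


# Demo with a fake clock, so no real waiting happens.
fake_now = [0.0]
bucket = TokenBucket(2.0, 10, clock=lambda: fake_now[0])

first = bucket.try_acquire()        # bucket is still empty at t = 0
fake_now[0] = 0.5
second = bucket.try_acquire()       # 0.5 s * 2 tokens/s = 1 token
fake_now[0] = 100.0                 # long idle period: refill is capped at 10
burst = sum(bucket.try_acquire() for _ in range(20))
```

In the demo, a long idle period does not let more than max_bucket_size requests through at once, which is why the burst cap matters as much as the per-second rate when sizing the limiter against an endpoint limit.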
