Databricks Foundational model rate limiting approach

llmnerd · 11-12-2024

Hi there,

Is this the correct way to stay within the rate limits of the Databricks Foundation Model APIs?

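from langchain_core.rate_limiters import InMemoryRateLimiter
# Assumption: ChatDatabricks comes from the databricks-langchain package;
# older setups may import it from langchain_databricks or langchain_community.
from databricks_langchain import ChatDatabricks

# Client-side token-bucket limiter: throttles how often requests are started.
rate_limiter = InMemoryRateLimiter(
    requests_per_second=2.0,     # sustained rate: one request every 0.5 s
    check_every_n_seconds=0.5,   # how often a blocked call re-checks the bucket
    max_bucket_size=10,          # allows bursts of up to 10 requests
)

# model, temperature, and max_tokens are assumed to be defined earlier.
chat_model = ChatDatabricks(
    endpoint=model,
    temperature=temperature,
    max_tokens=max_tokens,
    rate_limiter=rate_limiter,
)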
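
To make the effect of these parameters concrete, here is a minimal sketch of a token bucket configured with the same values (2 requests per second, bucket size 10). It is a simplified stand-in written for illustration, not LangChain's implementation: `TokenBucket`, `FakeClock`, and `burst_after_idle` are hypothetical names, and the bucket is assumed to start empty. A manually advanced clock keeps the demo instant and deterministic.

```python
import time
from typing import Callable


class TokenBucket:
    """Simplified token bucket; an illustration, not LangChain's code."""

    def __init__(
        self,
        requests_per_second: float,
        max_bucket_size: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = requests_per_second
        self.capacity = max_bucket_size
        self.clock = clock
        self.tokens = 0.0  # assumption: the bucket starts empty
        self.last = clock()

    def try_consume(self) -> bool:
        """Refill from elapsed time, then take one token if available."""
        now = self.clock()
        refill = (now - self.last) * self.rate
        self.tokens = min(self.capacity, self.tokens + refill)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class FakeClock:
    """Manually advanced clock so the demo needs no real waiting."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now

    def advance(self, seconds: float) -> None:
        self.now += seconds


def burst_after_idle(idle_seconds: float) -> int:
    """Count how many back-to-back requests pass after an idle period."""
    clock = FakeClock()
    bucket = TokenBucket(requests_per_second=2.0, max_bucket_size=10, clock=clock)
    clock.advance(idle_seconds)
    passed = 0
    while bucket.try_consume():
        passed += 1
    return passed


print(burst_after_idle(0.5))  # 1
print(burst_after_idle(2.0))  # 4
print(burst_after_idle(60))   # 10, capped by max_bucket_size
```

With these settings the long-run average is 2 requests per second, but after an idle period up to 10 requests can go out back to back. Whether that burst is acceptable depends on the endpoint's actual limits, which are not stated here. Note also that a limiter of this kind counts requests only, not tokens, and lives in a single process, so several workers would each get their own budget.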
