Hi there,
is this the correct approach to stay within the rate limits of the Foundation Model APIs?
# or: from langchain_community.chat_models import ChatDatabricks
from databricks_langchain import ChatDatabricks
from langchain_core.rate_limiters import InMemoryRateLimiter

rate_limiter = InMemoryRateLimiter(
    requests_per_second=2.0,    # sustained rate: at most 2 requests per second
    check_every_n_seconds=0.5,  # how often a waiting call re-checks for an available token
    max_bucket_size=10,         # burst capacity: up to 10 requests can fire back-to-back
)

chat_model = ChatDatabricks(
    endpoint=model,             # model, temperature, max_tokens defined elsewhere
    temperature=temperature,
    max_tokens=max_tokens,
    rate_limiter=rate_limiter,
)
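For context on what those parameters mean, here is a pure-Python token-bucket sketch that mirrors the semantics of `InMemoryRateLimiter` (`requests_per_second` is the refill rate, `max_bucket_size` caps the burst). This is a hypothetical illustration for intuition, not the library's actual implementation:

```python
import time

class TokenBucket:
    """Minimal token-bucket sketch (hypothetical, for illustration only)."""

    def __init__(self, requests_per_second: float, max_bucket_size: float):
        self.rate = requests_per_second    # tokens added per second
        self.capacity = max_bucket_size    # maximum stored tokens (burst size)
        self.tokens = 0.0
        self.last = time.monotonic()

    def _refill(self) -> None:
        # Add tokens proportional to elapsed time, capped at capacity.
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def try_acquire(self) -> bool:
        # Consume one token if available; otherwise the caller must wait
        # (InMemoryRateLimiter re-checks every check_every_n_seconds).
        self._refill()
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

So with `requests_per_second=2.0` and `max_bucket_size=10`, an idle limiter allows a burst of up to 10 calls, then throttles to a steady 2 calls per second.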