Databricks Foundation Model API rate limiting approach
11-12-2024 07:51 AM
Hi there,
Is this the correct approach for staying within the rate limits of the Foundation Model API?
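from langchain_core.rate_limiters import InMemoryRateLimiter
# Assumed import (omitted in the original post); ChatDatabricks is also published in langchain_databricks.
from databricks_langchain import ChatDatabricks

# Client-side limiter: about 2 requests per second, with bursts of up to 10.
rate_limiter = InMemoryRateLimiter(
    requests_per_second=2.0,      # sustained request rate
    check_every_n_seconds=0.5,    # how often a waiting call re-checks for a free slot
    max_bucket_size=10,           # upper bound on burst size
)

# model, temperature, and max_tokens are defined elsewhere in my code.
chat_model = ChatDatabricks(
    endpoint=model,
    temperature=temperature,
    max_tokens=max_tokens,
    rate_limiter=rate_limiter,
)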
11-12-2024 02:00 PM
Hello @llmnerd,
Yes, the approach you outlined appears to be correct: passing an InMemoryRateLimiter from langchain_core as the rate_limiter argument of ChatDatabricks throttles the requests your client sends to the Foundation Model API. If you have any specific requirements or run into any issues, please let us know.
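To make the settings in the question easier to reason about, here is a minimal sketch of a token bucket, the general technique this kind of limiter is based on. It is a simplified illustration and not LangChain's actual implementation: the TokenBucket and FakeClock classes and the burst_after_idle helper are hypothetical names, and starting with an empty bucket is an assumption of this sketch.

```python
import time


class TokenBucket:
    """Simplified token bucket (illustration only, not LangChain's code)."""

    def __init__(self, requests_per_second, max_bucket_size, clock=time.monotonic):
        self.rate = requests_per_second    # tokens added per second
        self.capacity = max_bucket_size    # cap on stored tokens, i.e. burst size
        self.clock = clock
        self.tokens = 0.0                  # assumption: the bucket starts empty
        self.last = clock()

    def try_acquire(self):
        """Refill from elapsed time, then take one token if one is available."""
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class FakeClock:
    """Hypothetical manual clock so the example runs without sleeping."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


def burst_after_idle(idle_seconds, requests_per_second=2.0, max_bucket_size=10):
    """Count how many back-to-back requests pass after an idle period."""
    clock = FakeClock()
    bucket = TokenBucket(requests_per_second, max_bucket_size, clock)
    clock.advance(idle_seconds)
    granted = 0
    while bucket.try_acquire():
        granted += 1
    return granted


print(burst_after_idle(1.0))   # idle for 1 second at 2 requests per second
print(burst_after_idle(60.0))  # idle for a minute: capped by max_bucket_size
```

With the values from the question, this model allows a sustained rate of about 2 requests per second and a burst of at most 10 requests after a quiet period. Two things are worth keeping in mind: a limiter of this kind counts requests rather than tokens, and because it lives in memory it only sees the requests made by one process, so several workers each holding their own limiter can together exceed the endpoint's limit.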