08-11-2022 06:31 PM
I have a number of jobs that use code in GitHub as their source.
Everything worked fine until yesterday, when I noticed that all the jobs using GitHub as the source were failing with the following error:
```
Run result unavailable: job failed with error message
Checkout remote repository: INTERNAL_ERROR: Failed to checkout internal repo. This workspace already has 9253 repos which exceeds the max limit of 5000 repos
```
However, I checked the Repos folder in my workspace, and there are fewer than 100 repos there. I have no idea why Databricks claims that I have over 9k repos.
The number is now above 10k, and I didn't create hundreds of new repos in the last 24 hours.
I believe this is a Databricks issue. What should I do to resolve it?
Thanks,
FYI, I have now changed the notebook source to the local repo, and my jobs are running again.
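For anyone hitting the same thing, the change is roughly the sketch below (simplified, with hypothetical task names and paths; the exact Jobs API fields may differ in your setup): the notebook task now points at a workspace path instead of being checked out from Git on every run.
```python
# Rough sketch of the change (hypothetical task/paths, Jobs 2.1-style fields assumed).
# Before: the notebook is checked out from the job's git_source on every run,
# which creates an internal repo under /Repos/.internal for each checkout.
task_from_git = {
    "task_key": "my_scheduled_task",
    "notebook_task": {
        "notebook_path": "notebooks/etl",  # path relative to the repo root
        "source": "GIT",
    },
}

# After: the notebook runs from a repo already cloned in the workspace, so no
# per-run checkout (and no internal repo) is needed.
task_from_workspace = {
    "task_key": "my_scheduled_task",
    "notebook_task": {
        "notebook_path": "/Repos/me@example.com/my-repo/notebooks/etl",
        "source": "WORKSPACE",
    },
}
```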
08-16-2022 02:06 AM
Hi, @Kit Yam Tse -- indeed, internally we count the number of repos in the workspace, and 9253 repos seems high. Can you use the Repos API to get the actual number? (You may need to use `next_page_token`.)
As an example, I use the following Python helper (a method on a small API-client class) to count the number of repos in my workspace. You can adapt it to your needs.
```python
import time
import requests


# Method of a small API-client class that stores the workspace host (self.api_host),
# auth headers (self.api_headers), and an authenticated requests.Session (self.session).
def call_endpoint(self, endpoint, response_key, params=None, pagination_key=None):
    url = f"https://{self.api_host}/{endpoint}"
    response_length = 0
    start_time = time.time()
    if pagination_key:
        if pagination_key == 'next_page_token':
            # Token-based pagination (e.g. the Repos API): follow next_page_token
            # until it no longer appears in the response.
            try:
                response = self.session.get(url, headers=self.api_headers, params=params)
                response_length = len(response.json()[response_key])
                while 'next_page_token' in response.json():
                    params = {'next_page_token': response.json()['next_page_token']}
                    response = self.session.get(url, headers=self.api_headers, params=params)
                    response_length += len(response.json()[response_key])
            except requests.exceptions.RequestException:
                pass
        elif pagination_key == 'offset':
            # Offset-based pagination: the caller must supply an initial 'offset' in params.
            try:
                response = self.session.get(url, headers=self.api_headers, params=params)
                response_length = len(response.json()[response_key])
                while response.json()['has_more']:
                    params['offset'] += 25
                    response = self.session.get(url, headers=self.api_headers, params=params)
                    response_length += len(response.json()[response_key])
            except requests.exceptions.RequestException:
                pass
    else:
        # No pagination: the response key is either a list or already a count.
        try:
            response = self.session.get(url, headers=self.api_headers, params=params)
            body = response.json()
            response_length = len(body[response_key]) if isinstance(body[response_key], list) else body[response_key]
        except requests.exceptions.RequestException:
            pass
    end_time = time.time()
    return {
        'endpoint': endpoint,
        'response_length': response_length,
        'response_time': end_time - start_time,
    }
```
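For example, you could call it roughly like this. `ReposClient` and the host/token values below are placeholders I'm using for illustration, not part of any official SDK:
```python
# Hypothetical usage: assumes the method above lives on a small client class
# (here called ReposClient) that stores api_host, api_headers, and a requests.Session.
client = ReposClient(
    api_host="adb-1234567890123456.7.azuredatabricks.net",  # placeholder workspace host
    token="dapiXXXXXXXXXXXXXXXX",                            # placeholder personal access token
)
result = client.call_endpoint(
    endpoint="api/2.0/repos",          # Repos list endpoint
    response_key="repos",              # key that holds the list in each response page
    pagination_key="next_page_token",  # page through all repos, not just the first batch
)
print(result["response_length"])       # total repo count, internal repos included
```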
If the count is lower than the limit, and if you have a support contract, please file a support case so we can look further, as we may need more information from you.
08-16-2022 03:04 AM
Thanks Ian,
I only got the first page of the repos list. I can only recognise a few of them; the rest are under internal paths.
```
"repos": [
{
"id": {{ id }},
"path": "/Repos/.internal/.alias/f/{{ some_values }}/{{ some_values }}",
"url": {{ url }},
"provider": "{{ provider }}",
"head_commit_id": "{{ head_commit_id }}"
},
{
"id": {{ id }},
"path": "/Repos/{{ email }}/{{ repo_name }}",
"url": "{{ url }}",
"provider": "{{ provider }}",
"branch": "{{ branch }}",
"head_commit_id": "{{ head_commit_id }}"
},
{
"id": {{ id }},
"path": "/Repos/.internal/{{ some_values }}_commits/{{ head_commit_id }}",
"url": "{{ url }}",
"provider": "{{ provider }}",
"head_commit_id": "{{ head_commit_id }}"
},
```
I am using a Git repo as the source of some scheduled jobs (which run every minute). Perhaps these internal repos are created by the scheduled jobs.
Unfortunately, I don't have a support contract yet.
Is there any way I can get help without a contract?
08-16-2022 03:22 AM
Thanks, @Kit Yam Tse -- do you have the actual count (including the /Repos/.internal ones, which, you're correct, are for the workflows)?
08-16-2022 04:28 AM
Just to clarify: we count both internal repos (created for workflows, among other things) and workspace repos toward the 5K limit. For workflows, when the repo count is exceeded, execution is initially blocked for 10 minutes until the count is reduced. There is also a cleanup process for finished tasks.
Does the job eventually get executed, or does it fail completely?
This is why it's important to get a complete repos count so we can check if this is the behaviour that you are seeing.
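If it helps, here's a minimal sketch (not an official tool, just plain `requests`) that pages through the Repos API and splits the count into internal and workspace repos; the environment-variable names are placeholders:
```python
# Minimal sketch: page through /api/2.0/repos and split the total count into
# workspace repos and the internal ones created for workflow checkouts.
import os
import requests

host = os.environ["DATABRICKS_HOST"]    # e.g. "adb-....azuredatabricks.net" (placeholder env var)
token = os.environ["DATABRICKS_TOKEN"]  # personal access token (placeholder env var)

url = f"https://{host}/api/2.0/repos"
headers = {"Authorization": f"Bearer {token}"}

internal, user = 0, 0
params = {}
while True:
    resp = requests.get(url, headers=headers, params=params)
    resp.raise_for_status()
    body = resp.json()
    for repo in body.get("repos", []):
        if repo["path"].startswith("/Repos/.internal"):
            internal += 1
        else:
            user += 1
    if "next_page_token" not in body:
        break
    params = {"next_page_token": body["next_page_token"]}

print(f"internal repos: {internal}, workspace repos: {user}, total: {internal + user}")
```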
08-16-2022 02:20 AM
It seems that I have a similar query. Did you get a solution for this?
08-16-2022 05:08 AM
Hi, @Priscilla Maynard -- can you please send an email to help@databricks.com with more details? Thanks.
@Kit Yam Tse -- we are checking this internally, and will keep you posted. Thanks for reporting this.
08-17-2022 12:39 AM
Just an update, to round this out.
We investigated further internally and found that, although we have a cleanup process in place to remove the internal repos checked out for workflows, it could not keep up with the sheer volume of jobs that were continuously failing at the repo checkout step (because of an invalid path).
This caused the limit to be breached, which cascaded down and prevented valid jobs from launching.
We've worked with Kit to identify the errant job(s), and we are now closely monitoring internal metrics, which currently show significant improvements.