<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Github workflow integration error in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/github-workflow-integration-error/m-p/33544#M24523</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;We have a &lt;U&gt;working&lt;/U&gt; GitHub integration in place for our production workspace, which runs 14 different jobs scheduled at different intervals throughout the entire day.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Over the past 3-4 weeks, we have consistently run into the same issue once a week, around the &lt;U&gt;weekend&lt;/U&gt;: for typically &lt;U&gt;1-3 hours&lt;/U&gt;, our jobs fail with these two errors:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;Failed to checkout Git repository: PROJECTS_OPERATION_TIMEOUT: Timed out while performing operation. This may be due to a remote repo that is too large or a slow network. We do not recommend having more than 10000 notebooks in a repo.&lt;/CODE&gt;&lt;/PRE&gt;&lt;PRE&gt;&lt;CODE&gt;Failed to checkout Git repository: PERMISSION_DENIED: Could not connect to git server. Make sure the git server is accessible from Databricks. Connecting to a private git server requires additional setup. Please contact your Databricks representative for details.&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;We can't really find anything about these errors online or in the forum here. We've tried increasing the cluster variable &lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;spark.storage.blockManagerTimeoutIntervalMs&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt; but it didn't solve the issue.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Note! &lt;/P&gt;&lt;OL&gt;&lt;LI&gt;Before and after the error period mentioned above, everything works perfectly. The errors affect all our scheduled jobs under Workflows.&lt;/LI&gt;&lt;LI&gt;Our repository is fairly small, so size isn't the issue, and we have the correct permissions, as shown by the working state throughout the rest of the week.&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks for any help in solving our issue!&lt;/P&gt;</description>
    <pubDate>Thu, 25 Aug 2022 06:50:39 GMT</pubDate>
    <dc:creator>Philip_Budbee</dc:creator>
    <dc:date>2022-08-25T06:50:39Z</dc:date>
    <item>
      <title>Github workflow integration error</title>
      <link>https://community.databricks.com/t5/data-engineering/github-workflow-integration-error/m-p/33544#M24523</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;We have a &lt;U&gt;working&lt;/U&gt; GitHub integration in place for our production workspace, which runs 14 different jobs scheduled at different intervals throughout the entire day.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Over the past 3-4 weeks, we have consistently run into the same issue once a week, around the &lt;U&gt;weekend&lt;/U&gt;: for typically &lt;U&gt;1-3 hours&lt;/U&gt;, our jobs fail with these two errors:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;Failed to checkout Git repository: PROJECTS_OPERATION_TIMEOUT: Timed out while performing operation. This may be due to a remote repo that is too large or a slow network. We do not recommend having more than 10000 notebooks in a repo.&lt;/CODE&gt;&lt;/PRE&gt;&lt;PRE&gt;&lt;CODE&gt;Failed to checkout Git repository: PERMISSION_DENIED: Could not connect to git server. Make sure the git server is accessible from Databricks. Connecting to a private git server requires additional setup. Please contact your Databricks representative for details.&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;We can't really find anything about these errors online or in the forum here. We've tried increasing the cluster variable &lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;spark.storage.blockManagerTimeoutIntervalMs&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt; but it didn't solve the issue.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Note! &lt;/P&gt;&lt;OL&gt;&lt;LI&gt;Before and after the error period mentioned above, everything works perfectly. The errors affect all our scheduled jobs under Workflows.&lt;/LI&gt;&lt;LI&gt;Our repository is fairly small, so size isn't the issue, and we have the correct permissions, as shown by the working state throughout the rest of the week.&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks for any help in solving our issue!&lt;/P&gt;</description>
      <pubDate>Thu, 25 Aug 2022 06:50:39 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/github-workflow-integration-error/m-p/33544#M24523</guid>
      <dc:creator>Philip_Budbee</dc:creator>
      <dc:date>2022-08-25T06:50:39Z</dc:date>
    </item>
  </channel>
</rss>

