<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Failed to fetch archive.ubuntu in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/failed-to-fetch-archive-ubuntu/m-p/26743#M18759</link>
    <description>&lt;P&gt;Hi @Dagart Allison, I've created a new version of the manual for running Selenium on Databricks. Please see it here: &lt;A href="https://community.databricks.com/s/feed/0D58Y00009SWgVuSAL" alt="https://community.databricks.com/s/feed/0D58Y00009SWgVuSAL" target="_blank"&gt;https://community.databricks.com/s/feed/0D58Y00009SWgVuSAL&lt;/A&gt;&lt;/P&gt;</description>
    <pubDate>Wed, 09 Nov 2022 14:26:28 GMT</pubDate>
    <dc:creator>Hubert-Dudek</dc:creator>
    <dc:date>2022-11-09T14:26:28Z</dc:date>
    <item>
      <title>Failed to fetch archive.ubuntu</title>
      <link>https://community.databricks.com/t5/data-engineering/failed-to-fetch-archive-ubuntu/m-p/26736#M18752</link>
      <description>&lt;P&gt;I am trying to use Selenium WebDriver for a scraping project in Databricks. The notebook used to run properly, but it now fails at this step:&lt;/P&gt;&lt;P&gt;Get:1 &lt;A href="http://archive.ubuntu.com/ubuntu" target="test_blank"&gt;http://archive.ubuntu.com/ubuntu&lt;/A&gt; focal/main amd64 fonts-liberation all 1:1.07.4-11 [822 kB]&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;In the cells before it, I run the following commands:&lt;/P&gt;&lt;P&gt;apt-get clean &amp;amp;&amp;amp; sudo apt-get -y upgrade&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;sudo apt-get install -y&lt;/P&gt;&lt;P&gt;apt install libnss -y&lt;/P&gt;&lt;P&gt;apt install libnss3-dev libgdk-pixbuf2.0-dev libgtk-3-dev libxss-dev -y&lt;/P&gt;&lt;P&gt;sudo apt-get update &amp;amp;&amp;amp; sudo apt-get install -y gconf-service libasound2 libatk1.0-0 libc6 libcairo2 libcups2 libdbus-1-3 libexpat1 libfontconfig1 libgcc1 libgconf-2-4 libgdk-pixbuf2.0-0 libglib2.0-0 libgtk-3-0 libnspr4 libpango-1.0-0 libpangocairo-1.0-0 libstdc++6 libx11-6 libx11-xcb1 libxcb1 libxcomposite1 libxcursor1 libxdamage1 libxext6 libxfixes3 libxi6 libxrandr2 libxrender1 libxss1 libxtst6 ca-certificates fonts-liberation libnss3 lsb-release xdg-utils wget ca-certificates google-chrome-stable libgbm1 libu2f-udev libwayland-server0 udev&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I attached the cell that fails and the error message. If you have any suggestions, please let me know.&lt;/P&gt;</description>
      <pubDate>Tue, 18 Oct 2022 16:46:10 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/failed-to-fetch-archive-ubuntu/m-p/26736#M18752</guid>
      <dc:creator>Tripalink</dc:creator>
      <dc:date>2022-10-18T16:46:10Z</dc:date>
    </item>
    <item>
      <title>Re: Failed to fetch archive.ubuntu</title>
      <link>https://community.databricks.com/t5/data-engineering/failed-to-fetch-archive-ubuntu/m-p/26737#M18753</link>
      <description>&lt;P&gt;Maybe my manual on how to run Selenium on Databricks will help:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;In the cluster's Libraries tab, please install the PyPI library &lt;B&gt;chromedriver-binary==83.0&lt;/B&gt; (or higher; the version in the script can probably be updated as well).&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Please run the script below from a notebook to create the "/databricks/scripts/selenium-install.sh" file.&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;    dbutils.fs.mkdirs("dbfs:/databricks/scripts/")
    # Write the init script; the shebang has to start the file to take effect.
    dbutils.fs.put("/databricks/scripts/selenium-install.sh","""#!/bin/bash
    apt-get update
    apt-get install chromium-browser=91.0.4472.101-0ubuntu0.18.04.1 --yes
    wget https://chromedriver.storage.googleapis.com/91.0.4472.101/chromedriver_linux64.zip -O /tmp/chromedriver.zip
    mkdir /tmp/chromedriver
    unzip /tmp/chromedriver.zip -d /tmp/chromedriver/
    """, True)
    display(dbutils.fs.ls("dbfs:/databricks/scripts/"))&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Please add "/databricks/scripts/selenium-install.sh" as an init (startup) script in the cluster configuration.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Later in the notebook, you can use Chrome as in the example below.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;    from selenium import webdriver
    chrome_driver = '/tmp/chromedriver/chromedriver'
    chrome_options = webdriver.ChromeOptions()
    chrome_options.add_argument('--no-sandbox')
    chrome_options.add_argument('--headless')
    # chrome_options.add_argument('--disable-dev-shm-usage') 
    chrome_options.add_argument('--homedir=/dbfs/tmp')
    chrome_options.add_argument('--user-data-dir=/dbfs/selenium')
    # prefs = {"download.default_directory":"/dbfs/tmp",
    #          "download.prompt_for_download":False
    # }
    # chrome_options.add_experimental_option("prefs",prefs)
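    # Note: Selenium 4 deprecates executable_path (removed in 4.10). On newer versions,
    # pass service=Service(chrome_driver) from selenium.webdriver.chrome.service instead.
    driver = webdriver.Chrome(executable_path=chrome_driver, options=chrome_options)&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&lt;/P&gt;</description>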
      <pubDate>Tue, 18 Oct 2022 18:06:32 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/failed-to-fetch-archive-ubuntu/m-p/26737#M18753</guid>
      <dc:creator>Hubert-Dudek</dc:creator>
      <dc:date>2022-10-18T18:06:32Z</dc:date>
    </item>
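A possible pre-flight check for the manual in the reply above, as a minimal sketch only: before creating the driver, confirm that the init script actually left an executable chromedriver where the notebook expects it. The helper name `chromedriver_ready` is hypothetical and not part of the original manual; only the path /tmp/chromedriver/chromedriver comes from the post.

```python
import os


def chromedriver_ready(path: str = "/tmp/chromedriver/chromedriver") -> bool:
    """Return True if `path` is an existing regular file that is executable.

    Hypothetical helper, not part of the original manual. The default path
    is the one the init script in the post unzips chromedriver into.
    """
    return os.path.isfile(path) and os.access(path, os.X_OK)


if __name__ == "__main__":
    if chromedriver_ready():
        print("chromedriver found; the init script appears to have run")
    else:
        print("chromedriver missing or not executable; check the init script logs")
```

If the check fails, the init script probably stopped before the unzip step, which would match the install error reported later in this thread.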
    <item>
      <title>Re: Failed to fetch archive.ubuntu</title>
      <link>https://community.databricks.com/t5/data-engineering/failed-to-fetch-archive-ubuntu/m-p/26738#M18754</link>
      <description>&lt;P&gt;I got an error from the second line of the install script.&lt;/P&gt;</description>
      <pubDate>Tue, 18 Oct 2022 22:57:13 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/failed-to-fetch-archive-ubuntu/m-p/26738#M18754</guid>
      <dc:creator>Tripalink</dc:creator>
      <dc:date>2022-10-18T22:57:13Z</dc:date>
    </item>
    <item>
      <title>Re: Failed to fetch archive.ubuntu</title>
      <link>https://community.databricks.com/t5/data-engineering/failed-to-fetch-archive-ubuntu/m-p/26741#M18757</link>
      <description>&lt;P&gt;Hi, I still get the same error I posted earlier: chromium-browser is not found for that version.&lt;/P&gt;</description>
      <pubDate>Wed, 19 Oct 2022 17:40:46 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/failed-to-fetch-archive-ubuntu/m-p/26741#M18757</guid>
      <dc:creator>Tripalink</dc:creator>
      <dc:date>2022-10-19T17:40:46Z</dc:date>
    </item>
    <item>
      <title>Re: Failed to fetch archive.ubuntu</title>
      <link>https://community.databricks.com/t5/data-engineering/failed-to-fetch-archive-ubuntu/m-p/26742#M18758</link>
      <description>&lt;P&gt;Here is what was added to the notebook to get it to run properly:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="to get google-chrome and the ubuntu version to properly install"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/1344iCB92A5D83ECF82F5/image-size/large?v=v2&amp;amp;px=999" role="button" title="to get google-chrome and the ubuntu version to properly install" alt="to get google-chrome and the ubuntu version to properly install" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 24 Oct 2022 17:29:42 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/failed-to-fetch-archive-ubuntu/m-p/26742#M18758</guid>
      <dc:creator>Tripalink</dc:creator>
      <dc:date>2022-10-24T17:29:42Z</dc:date>
    </item>
    <item>
      <title>Re: Failed to fetch archive.ubuntu</title>
      <link>https://community.databricks.com/t5/data-engineering/failed-to-fetch-archive-ubuntu/m-p/26743#M18759</link>
      <description>&lt;P&gt;Hi @Dagart Allison, I've created a new version of the manual for running Selenium on Databricks. Please see it here: &lt;A href="https://community.databricks.com/s/feed/0D58Y00009SWgVuSAL" alt="https://community.databricks.com/s/feed/0D58Y00009SWgVuSAL" target="_blank"&gt;https://community.databricks.com/s/feed/0D58Y00009SWgVuSAL&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 09 Nov 2022 14:26:28 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/failed-to-fetch-archive-ubuntu/m-p/26743#M18759</guid>
      <dc:creator>Hubert-Dudek</dc:creator>
      <dc:date>2022-11-09T14:26:28Z</dc:date>
    </item>
    <item>
      <title>Re: Failed to fetch archive.ubuntu</title>
      <link>https://community.databricks.com/t5/data-engineering/failed-to-fetch-archive-ubuntu/m-p/26739#M18755</link>
      <description>&lt;P&gt;Hi @Dagart Allison, since you are running apt-get upgrade, could you please run apt-get update in the cell before it?&lt;/P&gt;&lt;P&gt;You can also try apt-get install (package-name) --fix-missing.&lt;/P&gt;</description>
      <pubDate>Wed, 19 Oct 2022 07:31:04 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/failed-to-fetch-archive-ubuntu/m-p/26739#M18755</guid>
      <dc:creator>Debayan</dc:creator>
      <dc:date>2022-10-19T07:31:04Z</dc:date>
    </item>
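The ordering suggested in the reply above (refresh the package index first, then install with --fix-missing) could be scripted from a notebook cell roughly as follows. This is a sketch under assumptions: the helper names `build_apt_commands` and `run_apt_commands` are hypothetical, and the package passed in the example is just the one from the failing line in the question.

```python
import subprocess
from typing import List, Sequence


def build_apt_commands(packages: Sequence[str]) -> List[List[str]]:
    """Return the apt-get invocations in the order suggested in the reply.

    Hypothetical helper: first refresh the package index, then install the
    requested packages with the --fix-missing flag mentioned in the reply.
    """
    return [
        ["apt-get", "update"],
        ["apt-get", "install", "-y", "--fix-missing", *packages],
    ]


def run_apt_commands(packages: Sequence[str]) -> None:
    """Run each command in order and stop at the first one that fails."""
    for cmd in build_apt_commands(packages):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Only prints the plan; call run_apt_commands(...) on the cluster itself.
    for cmd in build_apt_commands(["fonts-liberation"]):
        print(" ".join(cmd))
```

Keeping the command list separate from the code that runs it makes it easy to inspect what will be executed before touching the cluster.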
  </channel>
</rss>

