04-07-2022 11:01 PM
I am working on a Webscraping logic and need to install Chrome driver. How can I install it in the Databricks workbook ?
06-21-2023 08:29 AM
What worked for me was downloading the chrome driver and ensuring its the latest version with shell scripts in the same notebook I used for web scraping. You can see all the details here: https://stackoverflow.com/questions/69192050/using-selenium-within-databricks-chrome-not-reachable/7...
04-11-2022 01:47 PM
@Abhishek Jain , In the clusters library tab, please install PyPi chromedriver-binary==83.0 (or higher, probably version in the script can also be updated)
To cluster init script, please add /databricks/scripts/selenium-install.sh, which you create using the below code.
Then, in the databricks notebook code, please use something similar to the second code below.
dbutils.fs.mkdirs("dbfs:/databricks/scripts/")
dbutils.fs.put("/databricks/scripts/selenium-install.sh","""
#!/bin/bash
apt-get update
apt-get install chromium-browser=91.0.4472.101-0ubuntu0.18.04.1 --yes
wget https://chromedriver.storage.googleapis.com/91.0.4472.101/chromedriver_linux64.zip -O /tmp/chromedriver.zip
mkdir /tmp/chromedriver
unzip /tmp/chromedriver.zip -d /tmp/chromedriver/
""", True)
display(dbutils.fs.ls("dbfs:/databricks/scripts/"))
from selenium import webdriver
chrome_driver = '/tmp/chromedriver/chromedriver'
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--headless')
# chrome_options.add_argument('--disable-dev-shm-usage')
chrome_options.add_argument('--homedir=/dbfs/tmp')
chrome_options.add_argument('--user-data-dir=/dbfs/selenium')
# prefs = {"download.default_directory":"/dbfs/tmp",
# "download.prompt_for_download":False
# }
# chrome_options.add_experimental_option("prefs",prefs)
driver = webdriver.Chrome(executable_path=chrome_driver, options=chrome_options)
06-20-2022 05:02 PM
11-09-2022 06:27 AM
Hi, @Gustavo Queiroz I've created a new version of the selenium with the databricks manual. Please look here https://community.databricks.com/s/feed/0D58Y00009SWgVuSAL
06-21-2023 08:29 AM
What worked for me was downloading the chrome driver and ensuring its the latest version with shell scripts in the same notebook I used for web scraping. You can see all the details here: https://stackoverflow.com/questions/69192050/using-selenium-within-databricks-chrome-not-reachable/7...
Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.
If there isn’t a group near you, start one and help create a community that brings people together.
Request a New Group