07-28-2022 10:46 PM
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options
drivers = webdriver.Chrome(ChromeDriverManager().install())
drivers.get("https://www.google.co.in/")
drivers.find_element(By.NAME, "q").send_keys("Prabhakar Kumar Jha")
Error message
WebDriverException: Message: Service /root/.wdm/drivers/chromedriver/linux64/103.0.5060/chromedriver unexpectedly exited. Status code was: 127
07-29-2022 01:31 AM
Hi @Prabhakar Jha
In the cluster's Libraries tab, please install the PyPI package chromedriver-binary==83.0 (or higher; the version pinned in the script below can probably be updated as well).
Then add /databricks/scripts/selenium-install.sh as a cluster init script. You can create it with the first code block below.
Finally, in the Databricks notebook, use something similar to the second code block below.
dbutils.fs.mkdirs("dbfs:/databricks/scripts/")
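# The shebang has to be the very first line of the script, so it follows the opening quotes directly
dbutils.fs.put("/databricks/scripts/selenium-install.sh","""#!/bin/bash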
apt-get update
apt-get install chromium-browser=91.0.4472.101-0ubuntu0.18.04.1 --yes
wget https://chromedriver.storage.googleapis.com/91.0.4472.101/chromedriver_linux64.zip -O /tmp/chromedriver.zip
mkdir /tmp/chromedriver
unzip /tmp/chromedriver.zip -d /tmp/chromedriver/
""", True)
display(dbutils.fs.ls("dbfs:/databricks/scripts/"))
from selenium import webdriver
chrome_driver = '/tmp/chromedriver/chromedriver'
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--headless')
# chrome_options.add_argument('--disable-dev-shm-usage')
chrome_options.add_argument('--homedir=/dbfs/tmp')
chrome_options.add_argument('--user-data-dir=/dbfs/selenium')
# prefs = {"download.default_directory":"/dbfs/tmp",
# "download.prompt_for_download":False
# }
# chrome_options.add_experimental_option("prefs",prefs)
driver = webdriver.Chrome(executable_path=chrome_driver, options=chrome_options)
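Before creating the driver, it may help to confirm that the init script actually left a driver and a browser on the node. The following is only a minimal sketch: the function name find_missing and the candidate browser names are assumptions for illustration, and the default path is the one used in the init script above.

```python
import os
import shutil


def find_missing(driver_path="/tmp/chromedriver/chromedriver",
                 browser_names=("chromium-browser", "google-chrome")):
    """Return a list of problems found; an empty list means the basics look fine."""
    problems = []
    if not os.path.isfile(driver_path):
        problems.append(f"chromedriver not found at {driver_path}")
    elif not os.access(driver_path, os.X_OK):
        problems.append(f"chromedriver at {driver_path} is not executable")
    # shutil.which returns None when the name is not on PATH
    if not any(shutil.which(name) for name in browser_names):
        problems.append("no browser binary on PATH (tried: " + ", ".join(browser_names) + ")")
    return problems


# Print whatever is missing on this node; prints nothing if both checks pass
for problem in find_missing():
    print(problem)
```

If this prints anything, the init script probably did not run or failed partway, and the cluster's init script logs are the next place to look.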
08-16-2022 04:31 AM
Hi Hubert,
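That resolved the issue, but I got a new one after this:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager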
chrome_driver = '/tmp/chromedriver/chromedriver'
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--headless')
chrome_options.add_argument('--disable-dev-shm-usage')
chrome_options.add_argument('--homedir=/dbfs/tmp')
chrome_options.add_argument('--user-data-dir=/dbfs/selenium')
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
driver.get('https://www.google.com/')
But the error is:
Message: unknown error: Chrome failed to start: exited abnormally.
(unknown error: DevToolsActivePort file doesn't exist)
(The process started from chrome location /usr/bin/google-chrome is no longer running, so ChromeDriver is assuming that Chrome has crashed.)
09-18-2022 09:47 AM
Hi Prabhakar and @Hubert Dudek,
Did you find a solution to your new issue, "Message: unknown error: Chrome failed to start: exited abnormally."?
I am stuck with the same error. I tried to get Selenium working on Databricks, but without success.
11-09-2022 06:26 AM
Hi @Prabhakar Jha, I've created a new version of the Selenium with Databricks manual. Please look here: https://community.databricks.com/s/feed/0D58Y00009SWgVuSAL
06-21-2023 08:44 AM
I also got that error. What worked for me was downloading the Chrome driver with shell scripts, and making sure it is the latest version, in the same notebook I used for web scraping. I could not use the web driver manager. You can see all the details here: https://stackoverflow.com/questions/69192050/using-selenium-within-databricks-chrome-not-reachable/7...
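For anyone comparing against the 08-16-2022 snippet: chrome_options is built there but never passed to webdriver.Chrome, so Chrome would be started without --headless and --no-sandbox. Whether that is the whole cause of the "DevToolsActivePort" error is not confirmed in this thread, but it is worth ruling out. A minimal sketch using an explicitly downloaded driver instead of the web driver manager, assuming Selenium 4 and the paths used earlier in the thread (the helper names headless_chrome_args and make_driver are hypothetical):

```python
def headless_chrome_args(user_data_dir="/dbfs/selenium"):
    """Chrome flags taken from the snippets earlier in this thread."""
    return [
        "--headless",
        "--no-sandbox",
        "--disable-dev-shm-usage",
        f"--user-data-dir={user_data_dir}",
    ]


def make_driver(driver_path="/tmp/chromedriver/chromedriver", args=None):
    """Start Chrome with an explicit driver path and explicit options."""
    # Imported here so the helper above can be used without Selenium installed
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    options = webdriver.ChromeOptions()
    for arg in (args if args is not None else headless_chrome_args()):
        options.add_argument(arg)
    # Both the service and the options are passed; leaving options out
    # would launch Chrome with its defaults
    return webdriver.Chrome(service=Service(executable_path=driver_path), options=options)


# Example use on a cluster where the driver and browser are installed:
# driver = make_driver()
# driver.get("https://www.google.com/")
# driver.quit()
```

The driver and browser versions still need to match, which is the point made above about keeping the downloaded driver up to date.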