07-28-2022 10:46 PM
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options
drivers = webdriver.Chrome(ChromeDriverManager().install())
drivers.get("https://www.google.co.in/")
drivers.find_element(By.NAME, "q").send_keys("Prabhakar Kumar Jha")
Error message
WebDriverException: Message: Service /root/.wdm/drivers/chromedriver/linux64/103.0.5060/chromedriver unexpectedly exited. Status code was: 127
07-29-2022 01:31 AM
Hi @Prabhakar Jha
In the clusters library tab, please install PyPi chromedriver-binary==83.0 (or higher, probably version in the script can also be updated)
To cluster init script, please add /databricks/scripts/selenium-install.sh, which you create using the below code.
Then, in the databricks notebook code, please use something similar to the second code below.
dbutils.fs.mkdirs("dbfs:/databricks/scripts/")
dbutils.fs.put("/databricks/scripts/selenium-install.sh","""
#!/bin/bash
apt-get update
apt-get install chromium-browser=91.0.4472.101-0ubuntu0.18.04.1 --yes
wget https://chromedriver.storage.googleapis.com/91.0.4472.101/chromedriver_linux64.zip -O /tmp/chromedriver.zip
mkdir /tmp/chromedriver
unzip /tmp/chromedriver.zip -d /tmp/chromedriver/
""", True)
display(dbutils.fs.ls("dbfs:/databricks/scripts/"))
from selenium import webdriver
chrome_driver = '/tmp/chromedriver/chromedriver'
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--headless')
# chrome_options.add_argument('--disable-dev-shm-usage')
chrome_options.add_argument('--homedir=/dbfs/tmp')
chrome_options.add_argument('--user-data-dir=/dbfs/selenium')
# prefs = {"download.default_directory":"/dbfs/tmp",
# "download.prompt_for_download":False
# }
# chrome_options.add_experimental_option("prefs",prefs)
driver = webdriver.Chrome(executable_path=chrome_driver, options=chrome_options)
08-16-2022 04:31 AM
Hi Hubert,
Resolve that issue, but I got a new issue after this:
from selenium import webdriver
chrome_driver = '/tmp/chromedriver/chromedriver'
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--headless')
chrome_options.add_argument('--disable-dev-shm-usage')
chrome_options.add_argument('--homedir=/dbfs/tmp')
chrome_options.add_argument('--user-data-dir=/dbfs/selenium')
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
driver.get('https://www.google.com/')
But the error is:
Message: unknown error: Chrome failed to start: exited abnormally.
(unknown error: DevToolsActivePort file doesn't exist)
(The process started from chrome location /usr/bin/google-chrome is no longer running, so ChromeDriver is assuming that Chrome has crashed.)
09-18-2022 09:47 AM
Hi Prabhakar and @Hubert Dudek,
Did you find out a solution to your new issue "Message: unknown error: Chrome failed to start: exited abnormally." ?
I am stuck with the same error. I tried to have selenium works with Databricks but without success.
11-09-2022 06:26 AM
Hi, @Prabhakar Jha I've created a new version of the selenium with the databricks manual. Please look here https://community.databricks.com/s/feed/0D58Y00009SWgVuSAL
06-21-2023 08:44 AM
I also got that error. What worked for me was downloading the chrome driver and ensuring its the latest version with shell scripts in the same notebook I used for web scraping. I could not use the web driver manager. You can see all the details here: https://stackoverflow.com/questions/69192050/using-selenium-within-databricks-chrome-not-reachable/7...
Passionate about hosting events and connecting people? Help us grow a vibrant local community—sign up today to get started!
Sign Up Now