05-18-2023 08:57 PM
Hello,
I'm trying to execute a Databricks notebook from Python source code, but I'm getting an error.
source code below
------------------
from databricks_api import DatabricksAPI
# Create a Databricks API client
api = DatabricksAPI(host='databrick_host', token='access token')
# Define the notebook path
notebook_path = '/Users/xyz@abc.com/data_restoration'
print("A")
# Create a new run
run = api.jobs.run_now(notebook_path)
print("B")
# Wait for the run to complete
run.wait_for_completion()
# Get the run output
output = api.jobs.get_run_output(run.run_id)
# Print the output
for entry in output:
    print(entry['data'])
Output
------
A
Traceback (most recent call last):
  File "C:\Users\sshiv\AppData\Local\Programs\Python\Python39\lib\site-packages\urllib3\connection.py", line 174, in _new_conn
    conn = connection.create_connection(
  File "C:\Users\sshiv\AppData\Local\Programs\Python\Python39\lib\site-packages\urllib3\util\connection.py", line 72, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
  File "C:\Users\sshiv\AppData\Local\Programs\Python\Python39\lib\socket.py", line 954, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno 11001] getaddrinfo failed

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\sshiv\AppData\Local\Programs\Python\Python39\lib\site-packages\urllib3\connectionpool.py", line 703, in urlopen
    httplib_response = self._make_request(
  File "C:\Users\sshiv\AppData\Local\Programs\Python\Python39\lib\site-packages\urllib3\connectionpool.py", line 386, in _make_request
    self._validate_conn(conn)
  File "C:\Users\sshiv\AppData\Local\Programs\Python\Python39\lib\site-packages\urllib3\connectionpool.py", line 1042, in _validate_conn
    conn.connect()
  File "C:\Users\sshiv\AppData\Local\Programs\Python\Python39\lib\site-packages\urllib3\connection.py", line 363, in connect
    self.sock = conn = self._new_conn()
  File "C:\Users\sshiv\AppData\Local\Programs\Python\Python39\lib\site-packages\urllib3\connection.py", line 186, in _new_conn
    raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x000001F2B2A244C0>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed
Process finished with exit code 1
05-25-2023 03:10 AM
Is your Databricks workspace reachable from your Python environment? (Think firewall, VNet, etc.)
05-31-2023 05:50 AM
This issue is usually caused by an incorrect hostname or issues with network connectivity.
To resolve this issue, please follow these steps:
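1. Double-check your Databricks host URL. Make sure it is spelled correctly and follows the proper format, such as `https://<region>.azuredatabricks.net`, with `<region>` replaced by your Databricks workspace region. The safest source is the address shown in your browser when the workspace is open. If `host='databrick_host'` in the posted code is the literal value being used rather than a redaction, that string is not a resolvable hostname and would produce exactly this error.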
2. Verify that you have a stable internet connection and that there are no firewalls or proxies blocking your access to the Databricks host URL.
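3. Ensure that your access token is valid and has the necessary permissions to execute the notebook. A bad token would not cause this particular error, which occurs before any request is sent, but it will matter once the host resolves.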
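The first two steps can be checked from the same Python environment before involving the Databricks client at all. The following is a minimal sketch using only the standard library; `extract_hostname` and `resolves` are hypothetical helper names, and the example host string is the placeholder from the original post.

```python
import socket
from urllib.parse import urlparse


def extract_hostname(host):
    """Return the bare hostname from a value such as 'https://example.net/'."""
    # urlparse only recognises the network location when a scheme or '//' is present.
    if "://" not in host:
        host = "//" + host
    return urlparse(host).hostname


def resolves(host, port=443):
    """Return True if the hostname can be resolved, False otherwise."""
    hostname = extract_hostname(host)
    if not hostname:
        return False
    try:
        socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
        return True
    except socket.gaierror:
        # Same failure as in the traceback: the name cannot be resolved.
        return False


if __name__ == "__main__":
    host = "databrick_host"  # placeholder from the post; substitute your workspace URL
    print(extract_hostname(host), "resolves:", resolves(host))
```

If `resolves` returns False for your real workspace URL, the problem lies in the hostname or in DNS/network access, not in the Databricks client or the token.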
05-31-2023 10:45 AM
I found this YouTube video helpful for setting up my environment with VS Code.
11-09-2023 11:54 AM
The error you are encountering indicates that there is an issue with establishing a connection to the Databricks host specified in your code. Specifically, the error message "getaddrinfo failed" suggests that the hostname or IP address you provided for the Databricks host cannot be resolved.
Here are a few things you can check and address:
Host Name: Ensure that you have provided the correct Databricks host name in the DatabricksAPI initialization. On AWS it typically has the form `https://<your-databricks-instance>.cloud.databricks.com`; Azure workspaces use an `azuredatabricks.net` address instead. Make sure there are no typos in the host name.
Network Connectivity: Ensure that your Python environment can access the Databricks host. Check if your system can reach the Databricks host over the network. You can test this by opening a web browser and trying to access the Databricks host's URL.
Firewalls and Network Restrictions: If you are running the code on a corporate network or behind a firewall, there may be network restrictions that prevent your Python environment from connecting to external hosts. Check if any firewall or proxy settings are blocking the connection.
Proxy Settings: If your network requires a proxy to access external resources, make sure your Python environment is configured with the correct proxy settings. You can configure proxy settings in Python using the HTTP_PROXY and HTTPS_PROXY environment variables.
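Token: Ensure that you have provided a valid Databricks access token. An incorrect or expired token does not produce a "getaddrinfo failed" error, since name resolution happens before any credentials are sent, but it will cause authentication errors once the connection succeeds.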
Databricks Instance Status: Verify that your Databricks instance is up and running without any issues. Sometimes, temporary outages or maintenance can affect the availability of Databricks services.
API Version: Make sure you are using a compatible version of the databricks_api library with your Databricks instance. Check for library updates if necessary.
By addressing these points, you should be able to resolve the issue and successfully establish a connection to your Databricks host from your Python code.
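To see which proxy settings Python will actually pick up, the standard library can report them. This is a minimal sketch; `proxy_settings` is a hypothetical helper name, and it only reports the detected configuration rather than testing whether the proxy itself works.

```python
import urllib.request


def proxy_settings():
    """Return the proxies Python detects, keyed by scheme (e.g. 'http', 'https')."""
    # getproxies() reads variables such as HTTP_PROXY and HTTPS_PROXY and,
    # on some platforms, falls back to system settings when none are set.
    return urllib.request.getproxies()


if __name__ == "__main__":
    settings = proxy_settings()
    if settings:
        for scheme, url in settings.items():
            print(f"{scheme}: {url}")
    else:
        print("No proxy configuration detected.")
```

An empty result on a network that requires a proxy suggests the environment variables are not set in the shell or IDE that launches the script.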