Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
I installed the newest version, "databricks-connect==13.0.0". Now I get the error: Command C:\Users\Y\AppData\Local\pypoetry\Cache\virtualenvs\X-py3.9\Lib\site-packages\pyspark\bin\spark-class2.cmd"" could not be found. Traceback...
On latest DB-Connect==9.1.3 and dbr == 9.1, retrieving data from mongo using Maven coordinate of Mongo Spark Connector: org.mongodb.spark:mongo-spark-connector_2.12:3.0.1 - https://docs.mongodb.com/spark-connector/current/ - working fine previously t...
Hi everyone, the solution for me was to replace spark.read.format("mongo") with spark.read.format("mongodb"). My Spark version is 3.3.2 and my MongoDB version is 6.0.6.
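A minimal sketch of the rename described above. The post gives only the Spark and MongoDB versions, so tying the format name to the MongoDB Spark Connector major version (10.x and later register "mongodb", earlier releases such as 3.0.1 use "mongo") is an assumption here. The helper name `mongo_format_for` is hypothetical and not part of any library.

```python
def mongo_format_for(connector_version: str) -> str:
    """Return the Spark data source name for a MongoDB Spark Connector version.

    Assumption: connector 10.x and later use "mongodb"; earlier releases use "mongo".
    """
    major = int(connector_version.split(".")[0])
    return "mongodb" if major >= 10 else "mongo"


# Hypothetical usage with an existing SparkSession named `spark`:
# df = spark.read.format(mongo_format_for("10.1.1")).load()
```

Picking the name from the connector version keeps the same job code working across clusters that still carry the older connector.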
Hello all,
As described in the title, here's my problem:
1. I'm using databricks-connect in order to send jobs to a databricks cluster
2. The "local" environment is an AWS EC2
3. I want to read a CSV file that is in DBFS (databricks) with pd.read_cs...
Please, I need your help. I still have the same issue after reading all your comments. I am using databricks-connect (version 13.1) in PyCharm and trying to load files that are on DBFS storage. spark = DatabricksSession.builder.remote( host=host...
Hello, I'm using databricks-connect 9.1 and since last week I have been having issues in all functions that use "collect()". Everything was working before: myList = df1.select("id").rdd.flatMap(lambda x: x).collect() Here is the error: py4j.protocol.P...
Hi @Julien Larcher, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answer...
I am using databricks-connect to access a remote cluster. Everything works as expected and I can set breakpoints and interrogate the results, same for when it tries to execute the following code: val testDF = spark.createDataFrame(spark.sparkContext .e...
Hi @James Metcalf, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers...
Last week, around the 21st of March, we started having issues with databricks-connect (DBR 9.1 LTS). "databricks-connect test" works, but the following code snippet:
from pyspark.sql import SparkSession
spark = SparkSession.builder.getOrCreate()
s...
Hi @Jordi Dekker, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers ...
I installed databricks-connect and configured it with a service principal token. I am able to start the cluster when I use the command spark = SparkSession.builder.getOrCreate(). But when trying to retrieve S3 bucket data to my local machine, or even when I run the test command ex...
Hi @divya08, hope all is well! Just wanted to check in to see if you were able to resolve your issue, and whether you would be happy to share the solution or mark an answer as best. Otherwise, please let us know if you need more help. We'd love to hear from you. Thanks!
I am trying to set up databricks-connect on my Windows machine. While running databricks-connect test I get the error below, complaining that a Java certificate is not found. ''Caused by: sun.security.validator.ValidatorException: PKIX path building fail...
I have been facing an issue since the `databricks-connect` runtime on our cluster was updated to 10.4. Since then, I cannot load the jars for spark-avro anymore. By running the following code:
from pyspark.sql import SparkSession
spark = SparkSe...
I'm facing an issue when I want to show a dataframe with JSON content. All this happens when the script runs in databricks-connect from VS Code. Basically, I would like any help or guidance to get this to run as it should. Thanks in advance. This is how...
The code works fine on a Databricks cluster, but this code is part of a unit test in the local environment, then submitted to a branch -> PR -> merged into the master branch. Thanks for the advice on using DBX. I will give DBX a try again even though I've already tried. I'l...
Previously, our databricks-connect was using 7.3.34 and the builds in pipenv were successful. As of today the builds are failing with an error that version 7.3.34 no longer exists. Is there a reason this version is no longer supported....
Hello @Vikas B, here are the release notes: https://docs.databricks.com/release-notes/dbconnect/index.html Also, only the following Databricks Runtime versions are supported: Databricks Runtime 10.4 LTS ML, Databricks Runtime 10.4 LTS, Databricks Runtime 9....
I want to use Databricks inside VS Code, and I therefore need databricks-connect. I configure my settings using databricks-connect configure as follows: Databricks Host [https://adb-1409757184094616.16.azuredatabricks.net] Databricks Token [<my token>] Cl...
In case it helps anyone, I ran into this issue and had to remove the trailing / from the host name. It used to work fine with the trailing / so something must have changed.
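The workaround above can be sketched as a small normalization step applied before the host is passed to the configuration. This is only an illustration: `normalize_host` is a hypothetical helper and the URL is a placeholder, neither of which comes from databricks-connect itself.

```python
def normalize_host(host: str) -> str:
    """Strip surrounding whitespace and any trailing slashes from a workspace URL."""
    return host.strip().rstrip("/")


# Placeholder workspace URL, for illustration only.
print(normalize_host("https://example.cloud.databricks.com/"))
```

Normalizing the value once, at the point where it is read from configuration or the environment, avoids depending on how a particular client version treats the trailing slash.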
I'm using Databricks Connect to talk to a cluster on Azure. When doing a count on a dataframe I sometimes get this error message. After I've gotten it once, I don't seem to be able to get rid of it, even if I restart my dev environment. ----------------...
Hi @Johan Rex, we checked with the Databricks Connect team. This issue can happen when the library is too large to upload. Databricks recommends that you use dbx by Databricks Labs for local development instead of Databricks Connect. Databricks plans no ...
Currently I am learning how to use databricks-connect to develop Scala code locally using an IDE (VS Code). The set-up of databricks-connect as described here https://docs.microsoft.com/en-us/azure/databricks/dev-tools/databricks-connect was succues...
I have installed Databricks-Connect (9.1 LTS). I am able to send queries to the cluster. However, when the query includes a call to the 'table_changes' function that is a part of Change Data Feed, I get the following error: AnalysisException("could ...
Hi @Kaniz Fatma, the table_changes function is an internal Databricks function used in Change Data Feed (CDF). Please refer to the article below. It discusses the table_changes function. https://docs.databricks.com/delta/delta-change-data-feed.html