Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Lazloo
by New Contributor III
  • 15625 Views
  • 6 replies
  • 4 kudos

databricks-connect version 13: spark-class2.cmd not found

I installed the newest version, "databricks-connect==13.0.0". Now I get the following error: Command "C:\Users\Y\AppData\Local\pypoetry\Cache\virtualenvs\X-py3.9\Lib\site-packages\pyspark\bin\spark-class2.cmd" could not be found. Traceback...

Latest Reply
Susumu_Asaga
New Contributor II
  • 4 kudos

Use this code:
from databricks.connect import DatabricksSession
spark = DatabricksSession.builder.getOrCreate()
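As a minimal sketch of the reply above: databricks-connect 13 obtains its session from `DatabricksSession` instead of a locally launched `SparkSession`, which appears to be what goes looking for `spark-class2.cmd`. The helper name `get_remote_spark` is hypothetical, and the sketch assumes the workspace host, token and cluster are already resolvable from the local Databricks configuration (a profile or environment variables).

```python
def get_remote_spark():
    """Return a Spark session backed by a remote Databricks cluster.

    The import is deferred so that merely importing this module does not
    require databricks-connect to be installed.
    """
    # Available in databricks-connect 13.0 and later.
    from databricks.connect import DatabricksSession

    # Assumption: connection details come from the local Databricks
    # configuration; nothing is passed explicitly here.
    return DatabricksSession.builder.getOrCreate()
```

A typical call would be `spark = get_remote_spark()` followed by something small such as `spark.range(5).show()` to confirm that the query runs on the cluster.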

5 More Replies
Shadowsong27
by New Contributor III
  • 12794 Views
  • 11 replies
  • 4 kudos

Resolved! Mongo Spark Connector 3.0.1 seems not working with Databricks-Connect, but works fine in Databricks Cloud

On latest DB-Connect==9.1.3 and dbr == 9.1, retrieving data from mongo using Maven coordinate of Mongo Spark Connector: org.mongodb.spark:mongo-spark-connector_2.12:3.0.1 - https://docs.mongodb.com/spark-connector/current/ - working fine previously t...

Latest Reply
mehdi3x
New Contributor II
  • 4 kudos

Hi everyone, the solution for me was to replace spark.read.format("mongo") with spark.read.format("mongodb"). My Spark version is 3.3.2 and my MongoDB version is 6.0.6.
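The rename in this reply can be captured in a small helper. This is only a sketch under an assumption the thread does not state: that the `mongodb` short name belongs to the 10.x line of the MongoDB Spark Connector, while 3.x and earlier register `mongo`. The function name `mongo_source_name` is hypothetical.

```python
def mongo_source_name(connector_version: str) -> str:
    """Pick the data source short name for a MongoDB Spark Connector version.

    Assumption: connector 10.x and later use "mongodb"; older releases
    (for example 3.0.1) use "mongo".
    """
    major = int(connector_version.split(".")[0])
    return "mongodb" if major >= 10 else "mongo"
```

With a session in hand, the result would be passed to `spark.read.format(...)`, for example `spark.read.format(mongo_source_name("10.1.1"))`.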

10 More Replies
hamzatazib96
by New Contributor III
  • 63473 Views
  • 21 replies
  • 12 kudos

Resolved! Read file from dbfs with pd.read_csv() using databricks-connect

Hello all,
As described in the title, here's my problem:
1. I'm using databricks-connect in order to send jobs to a Databricks cluster
2. The "local" environment is an AWS EC2
3. I want to read a CSV file that is in DBFS (Databricks) with pd.read_cs...

Latest Reply
so16
New Contributor II
  • 12 kudos

Please, I need your help: I still have the same issue after reading all your comments. I am using Databricks-connect (version 13.1) in PyCharm and trying to load files that are in DBFS storage. spark = DatabricksSession.builder.remote( host=host...
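For context on why this keeps coming up: `pd.read_csv()` runs in the local Python process, which has no view of DBFS. One common workaround, sketched here and not taken from the thread's accepted answer, is to let Spark read the file on the cluster and convert the result to pandas. The helper name `read_dbfs_csv_as_pandas` and the example path are hypothetical, and the sketch assumes the file is small enough to fit in local memory.

```python
def read_dbfs_csv_as_pandas(spark, dbfs_path, header=True):
    """Read a CSV from DBFS through Spark and return it as a pandas DataFrame.

    `spark` is an existing session connected to the cluster (for example
    one created via databricks-connect). The file is read remotely, then
    collected to the local machine by toPandas().
    """
    remote_df = spark.read.csv(dbfs_path, header=header, inferSchema=True)
    return remote_df.toPandas()
```

Usage would look like `pdf = read_dbfs_csv_as_pandas(spark, "dbfs:/tmp/example.csv")`, where the path is a placeholder.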

20 More Replies
JLCDA
by New Contributor
  • 2268 Views
  • 2 replies
  • 0 kudos

databricks-connect 9.1 : StreamCorruptedException: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe

Hello, I'm using databricks-connect 9.1 and since last week I have been having issues in all functions that call "collect()". Everything was working before: myList = df1.select("id").rdd.flatMap(lambda x: x).collect() Here is the error: py4j.protocol.P...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Julien Larcher, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answer...

1 More Replies
DBJmet
by New Contributor
  • 1995 Views
  • 2 replies
  • 0 kudos

Databricks-Connect Error occurred while running *** java.io.StreamCorruptedException: invalid type code: 00

I am using databricks-connect to access a remote cluster. Everything works as expected and I can set breakpoints and interrogate the results, same for when it tries to execute the following code: val testDF = spark.createDataFrame(spark.sparkContext .e...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @James Metcalf, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers...

1 More Replies
JordiDekker
by New Contributor III
  • 2978 Views
  • 5 replies
  • 6 kudos

StreamCorruptedException, databricks-connect 9.1

Last week, around the 21st of March, we started having issues with databricks-connect (DBR 9.1 LTS). "databricks-connect test" works, but the following code snippet: from pyspark.sql import SparkSession spark = SparkSession.builder.getOrCreate() s...

Latest Reply
Anonymous
Not applicable
  • 6 kudos

Hi @Jordi Dekker, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers ...

4 More Replies
Anonymous
by Not applicable
  • 5594 Views
  • 1 replies
  • 1 kudos

Databricks-connect configured with service principal token but unable to retrieve information to local machine

I installed databricks-connect and configured it with a service principal token. I am able to start the cluster when I use the command spark = SparkSession.builder.getOrCreate(). But when trying to retrieve S3 bucket data to my local machine, or even when I run a test command ex...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @divya08, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thanks!

ramravi
by Contributor II
  • 2190 Views
  • 1 replies
  • 0 kudos

Unable to connect to databricks cluster from Windows using databricks-connect

I am trying to set up databricks-connect on my Windows machine. While running databricks-connect test, I get the error below, complaining that a Java certificate is not found. "Caused by: sun.security.validator.ValidatorException: PKIX path building fail...

Latest Reply
ramravi
Contributor II
  • 0 kudos

Adding the certificate from the root level worked for me. This problem is solved.

Lazloo
by New Contributor III
  • 1007 Views
  • 0 replies
  • 2 kudos

Cannot load spark-avro jars with databricksversion 10.4

Currently, I am facing an issue since the `databricks-connect` runtime on our cluster was updated to 10.4. Since then, I cannot load the jars for spark-avro anymore. When running the following code: from pyspark.sql import SparkSession spark = SparkSe...

KarimSegura
by New Contributor III
  • 3013 Views
  • 2 replies
  • 4 kudos

databricks-connect throws an exception when showing a dataframe with json content

I'm facing an issue when I want to show a dataframe with JSON content. All this happens when the script runs in databricks-connect from VS Code. Basically, I would like any help or guidance to get this to run as it should. Thanks in advance. This is how...

Latest Reply
KarimSegura
New Contributor III
  • 4 kudos

The code works fine on a Databricks cluster, but this code is part of a unit test in the local environment, which is then submitted to a branch -> PR -> merged into the master branch. Thanks for the advice on using DBX. I will give DBX another try, even though I've already tried it. I'l...

1 More Replies
vk217
by Contributor
  • 4596 Views
  • 3 replies
  • 1 kudos

Resolved! ERROR: No matching distribution found for databricks-connect==7.3.34

Previously, our builds used databricks-connect 7.3.34 with pipenv and were successful. As of today, the builds are failing with an error that version 7.3.34 no longer exists. Is there a reason this version is no longer supported...

Latest Reply
Atanu
Databricks Employee
  • 1 kudos

Hello @Vikas B, this is the release note: https://docs.databricks.com/release-notes/dbconnect/index.html Also, only the following Databricks Runtime versions are supported: Databricks Runtime 10.4 LTS ML, Databricks Runtime 10.4 LTS, Databricks Runtime 9....

2 More Replies
korilium
by New Contributor III
  • 10172 Views
  • 9 replies
  • 3 kudos

Databricks-connect invalid shard address

I want to use Databricks inside VS Code, so I need Databricks-connect. I configure my settings using databricks-connect configure as follows:
Databricks Host [https://adb-1409757184094616.16.azuredatabricks.net]
Databricks Token [<my token>]
Cl...

Latest Reply
Justin09
New Contributor II
  • 3 kudos

In case it helps anyone, I ran into this issue and had to remove the trailing / from the host name. It used to work fine with the trailing / so something must have changed.
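The fix described in this reply is easy to apply defensively before the host is handed to any configuration step. A minimal sketch; the helper name `normalize_host` is hypothetical and not part of databricks-connect.

```python
def normalize_host(host: str) -> str:
    """Strip surrounding whitespace and any trailing slashes from a workspace URL."""
    return host.strip().rstrip("/")
```

For the host shown in the question, `normalize_host("https://adb-1409757184094616.16.azuredatabricks.net/")` yields the same URL without the final `/`.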

8 More Replies
JohanRex
by New Contributor II
  • 5427 Views
  • 3 replies
  • 5 kudos

Resolved! IllegalArgumentException: requirement failed: Result for RPC Some(e100cace-3836-4461-8902-80b3744fcb6b) lost, please retry your request.

I'm using databricks connect to talk to a cluster on Azure. When doing a count on a dataframe I sometimes get this error message. Once I've gotten it once I don't seem to be able to get rid of it even if I restart my dev environment. ----------------...

Latest Reply
Anonymous
Not applicable
  • 5 kudos

Hi @Johan Rex, we checked with the Databricks Connect team. This issue can happen when the library is too large to upload. Databricks recommends that you use dbx by Databricks Labs for local development instead of Databricks Connect. Databricks plans no ...

2 More Replies
Databach
by New Contributor
  • 3578 Views
  • 0 replies
  • 0 kudos

How to resolve "java.lang.ClassNotFoundException: com.databricks.spark.util.RegexBasedAWSSecretKeyRedactor" when running Scala Spark project using databricks-connect ?

Currently I am learning how to use databricks-connect to develop Scala code locally using an IDE (VS Code). The set-up of databricks-connect as described here https://docs.microsoft.com/en-us/azure/databricks/dev-tools/databricks-connect was succues...

Ian
by New Contributor III
  • 7558 Views
  • 4 replies
  • 0 kudos

Resolved! Databricks-Connect and Change Data Feed query error

I have installed Databricks-Connect (9.1 LTS). I am able to send queries to the cluster. However, when the query includes a call to the 'table_changes' function that is part of Change Data Feed, I get the following error: AnalysisException("could ...

Latest Reply
Ian
New Contributor III
  • 0 kudos

Hi @Kaniz Fatma, the table_changes function is an internal Databricks function used in Change Data Feed (CDF). Please refer to the article below, which discusses the table_changes function: https://docs.databricks.com/delta/delta-change-data-feed.html
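For readers unfamiliar with the function, the kind of query this thread is about can be sketched as a small string builder. The helper name `table_changes_query` and the table name `events` are hypothetical. Whether such a query resolves through Databricks Connect 9.1 is exactly what the thread calls into question, so treat this as an illustration of the syntax only.

```python
from typing import Optional


def table_changes_query(table: str, start_version: int,
                        end_version: Optional[int] = None) -> str:
    """Build a SQL query that reads a Delta table's change data feed.

    table_changes takes the table name, a starting version and an
    optional ending version.
    """
    args = [f"'{table}'", str(int(start_version))]
    if end_version is not None:
        args.append(str(int(end_version)))
    return f"SELECT * FROM table_changes({', '.join(args)})"
```

On a cluster or notebook the result would be passed to `spark.sql(...)`, for example `spark.sql(table_changes_query("events", 0))`.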

3 More Replies