
[DATA_SOURCE_NOT_FOUND] Failed to find data source

PabloCSD
Contributor

Context:

Hello, I was using a workflow for a periodic process. My team and I were running it on Job Compute, but the libraries were not working, even though we had PIP_EXTRA_INDEX_URL defined in the cluster's environment variables. As a workaround, we created a cluster and manually installed each library in the cluster's Libraries section.

Problem:

 

Py4JJavaError: An error occurred while calling o1160.save.
: org.apache.spark.SparkClassNotFoundException: [DATA_SOURCE_NOT_FOUND] Failed to find data source: com.microsoft.sqlserver.jdbc.spark. Please find packages at `https://spark.apache.org/third-party-projects.html`. SQLSTATE: 42K02

 

Also, when I check the website from the error message, I find nothing about this data source. What do you recommend?

 

 

1 REPLY

PabloCSD
Contributor

I installed this library on the cluster:

spark_mssql_connector_2_12_1_4_0_BETA.jar

A colleague passed me this .jar file. It seems it can be obtained from here: https://github.com/microsoft/sql-spark-connector/releases.

With the JAR installed, the task now finishes successfully, so this is one way to fix the error.
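
For reference, here is a minimal sketch of the kind of write call that raises this error when the connector JAR is missing. The format name comes from the error message above, and the `url`, `dbtable`, `user`, and `password` options follow the connector's documented usage. The helper names (`build_mssql_options`, `write_to_sql_server`) and all server, database, and table values are hypothetical placeholders, not the code from the original job.

```python
# Data source name provided by the connector JAR (taken from the error message).
MSSQL_FORMAT = "com.microsoft.sqlserver.jdbc.spark"


def build_mssql_options(server, database, table, user, password, port=1433):
    """Build the option map for the connector.

    All argument values are placeholders; in a real job, read the
    credentials from a secret scope rather than hard-coding them.
    """
    return {
        "url": f"jdbc:sqlserver://{server}:{port};databaseName={database}",
        "dbtable": table,
        "user": user,
        "password": password,
    }


def write_to_sql_server(df, options, mode="append"):
    """Write a Spark DataFrame to SQL Server through the connector.

    If the connector JAR is not installed on the cluster, the final
    .save() call is where [DATA_SOURCE_NOT_FOUND] is raised.
    """
    (
        df.write
        .format(MSSQL_FORMAT)
        .mode(mode)
        .options(**options)
        .save()
    )


# Example call inside a job task (hypothetical names):
# opts = build_mssql_options("myserver.database.windows.net", "mydb",
#                            "dbo.my_table", "my_user", "my_password")
# write_to_sql_server(df, opts)
```

Note that the connector is a JVM library, so a pip setting such as PIP_EXTRA_INDEX_URL would not be expected to provide it. It has to be attached to the cluster as a JAR, as described above.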
