09-30-2024 08:37 AM
Context:
Hello, I was using a workflow for a periodic process. With my team we were using a Job Compute, but the libraries were not working (even though we had PIP_EXTRA_INDEX_URL defined in the cluster's environment variables), so we now use a workaround: we created a cluster and manually installed each library in the cluster's Libraries section.
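For reference, the environment variable we had set on the Job Compute looked like this (the index URL is a placeholder for our private package index):

```
PIP_EXTRA_INDEX_URL=https://<private-index-host>/simple
```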
Problem:
Py4JJavaError: An error occurred while calling o1160.save.
: org.apache.spark.SparkClassNotFoundException: [DATA_SOURCE_NOT_FOUND] Failed to find data source: com.microsoft.sqlserver.jdbc.spark. Please find packages at `https://spark.apache.org/third-party-projects.html`. SQLSTATE: 42K02
Also, when I check that website I find nothing relevant. What do you recommend?
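For context, the write that fails looks roughly like this. It's a minimal sketch with placeholder connection details, assuming a DataFrame `df`; the `.save()` at the end is the `o1160.save` call from the stack trace:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df = spark.read.table("<catalog>.<schema>.<source_table>")  # placeholder source

# This .save() raises DATA_SOURCE_NOT_FOUND when the
# com.microsoft.sqlserver.jdbc.spark connector is not on the cluster.
(df.write
    .format("com.microsoft.sqlserver.jdbc.spark")
    .mode("overwrite")
    .option("url", "jdbc:sqlserver://<server>:1433;databaseName=<database>")
    .option("dbtable", "dbo.<target_table>")
    .option("user", "<user>")
    .option("password", "<password>")
    .save())
```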
Accepted Solutions
09-30-2024 10:42 AM
I installed this library on the cluster:
spark-mssql-connector_2.12-1.4.0-BETA.jar
A colleague passed me this .jar file. It seems it can be obtained from here: https://github.com/microsoft/sql-spark-connector/releases.
This allows the task to finish successfully, so it is one way to fix this error.
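If you prefer not to upload the jar by hand, the same connector can also be attached by its Maven coordinate (com.microsoft.azure:spark-mssql-connector_2.12:1.4.0-BETA, published on Maven Central). A sketch using the Databricks Libraries REST API; the workspace host, token, and cluster id are placeholders:

```python
import requests

# Install the SQL Server Spark connector on a cluster by Maven coordinate
# instead of uploading the jar file manually.
resp = requests.post(
    "https://<workspace-host>/api/2.0/libraries/install",
    headers={"Authorization": "Bearer <token>"},
    json={
        "cluster_id": "<cluster-id>",
        "libraries": [
            {"maven": {"coordinates":
                "com.microsoft.azure:spark-mssql-connector_2.12:1.4.0-BETA"}}
        ],
    },
)
resp.raise_for_status()
```

The same coordinate can also be pasted into the Maven option of the cluster's Libraries tab in the UI.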