Install Maven package on serverless cluster
08-19-2024 07:54 AM
My task is to export data from CSV/SQL to Excel format with minimal latency. To achieve this, I am using a serverless cluster.
Since PySpark has no built-in support for writing XLSX, I need the Maven package spark-excel_2.12. However, serverless clusters do not allow installing additional libraries the way regular clusters do, so I tried to install it through the REST API:
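import requests

# HOST, TOKEN and CLUSTER_ID are defined elsewhere (workspace URL, access token, target cluster).
headers = {
    'Authorization': f'Bearer {TOKEN}',
}
data = {
    "cluster_id": CLUSTER_ID,
    "libraries": [
        {
            "maven": {
                # Note: this coordinate uses the _2.13 artifact, while the text above refers to spark-excel_2.12.
                "coordinates": "com.crealytics:spark-excel_2.13:3.4.1_0.19.0"
            }
        }
    ]
}
response = requests.post(f'{HOST}/api/2.0/libraries/install', headers=headers, json=data)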
But when I try to save the file in Excel format, I get the following error:
[DATA_SOURCE_NOT_FOUND] Failed to find the data source: com.crealytics.spark.excel. Make sure the provider name is correct and the package is properly registered and compatible with your Spark version. SQLSTATE: 42K02
How can this issue be resolved? Are there any other ways to export an Excel file ASAP without waiting for the cluster to start up?
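One thing that may help narrow this down: the Libraries API install call only schedules the installation, so a successful response does not by itself mean the package ended up on the cluster. Below is a minimal sketch for checking the reported status through the cluster-status endpoint. It assumes the same HOST, TOKEN and CLUSTER_ID as in the request above; the helper names fetch_library_statuses and summarize_statuses are hypothetical, and whether this endpoint reports anything useful for serverless compute is an assumption that has not been verified here. It uses only the standard library so it runs without extra packages.

```python
import json
import urllib.request


def fetch_library_statuses(host, token, cluster_id):
    """Call the Libraries API cluster-status endpoint and return the parsed JSON."""
    req = urllib.request.Request(
        f"{host}/api/2.0/libraries/cluster-status?cluster_id={cluster_id}",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def summarize_statuses(status_json):
    """Map each library (Maven coordinate when present) to its reported status."""
    summary = {}
    for entry in status_json.get("library_statuses", []):
        library = entry.get("library", {})
        maven = library.get("maven")
        # Fall back to the raw library spec for non-Maven entries.
        key = maven["coordinates"] if maven else json.dumps(library, sort_keys=True)
        summary[key] = entry.get("status", "UNKNOWN")
    return summary
```

Calling summarize_statuses(fetch_library_statuses(HOST, TOKEN, CLUSTER_ID)) would then show whether the spark-excel coordinate is listed at all and, if so, which status it reports.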
10-31-2024 12:30 PM
I have a similar issue: how do I install a Maven package in a notebook that runs on a serverless cluster?
I need to install com.crealytics:spark-excel_2.12:3.4.2_0.20.3 from within the notebook, the same way PyPI libraries are installed there, e.g. %pip install package_name.
I don't want to use the Environment sidebar and its dependencies list. First, adding the Maven package under dependencies did not work (I am guessing because it is not a PyPI library). Second, I will be running the notebook in a workflow via Git, so even if adding the library through the dependencies tab had worked, a run from Git would not know about it and would still fail.
3 weeks ago
I have the exact same question and have not found any way to do it.
2 weeks ago
I also have this question and would like to know what the options are.

