01-10-2025 06:33 AM
This might be a stupid question but there's just no mention of what to do here. I'm looking at the blog (https://www.databricks.com/blog/simplify-data-ingestion-new-python-data-source-api) and documentation (https://learn.microsoft.com/en-us/azure/databricks/pyspark/datasources) for the Python Data Source API, and I don't see how to deploy the custom library. Do we need to create a wheel file and upload it? Do we use regular .py files in our workspace and %run them? Any guidance would be appreciated.
Accepted Solutions
01-10-2025 06:51 AM
Hi @Rjdudley,
Thanks for your question! You can create regular .py files in your workspace and use the %run magic command to include them in your notebooks. This method is straightforward and works well for development and testing.
%run /path/to/your/custom_datasource_file
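If you go that route, here is a minimal sketch of what the file you %run could contain. This assumes a runtime that supports the Python Data Source API (DBR 15.2+ / Spark 4.0), and the names MyCustomDataSource and my_custom_source are placeholders, not anything from the docs:

    # Contents of the module included via %run (all names are placeholders)
    from pyspark.sql.datasource import DataSource, DataSourceReader

    class MyCustomDataSource(DataSource):
        @classmethod
        def name(cls):
            # Short name used later with spark.read.format(...)
            return "my_custom_source"

        def schema(self):
            return "id int, value string"

        def reader(self, schema):
            return MyCustomReader(schema, self.options)

    class MyCustomReader(DataSourceReader):
        def __init__(self, schema, options):
            self.schema = schema
            self.options = options

        def read(self, partition):
            # Yield tuples that match the declared schema
            yield (1, "hello")
            yield (2, "world")

Then, in the calling notebook, register the class and read from it:

    spark.dataSource.register(MyCustomDataSource)
    spark.read.format("my_custom_source").load().show()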
For a more production-ready approach, you can create a wheel file of your custom data source implementation and upload it to your cluster or workspace. This method is preferred for sharing across multiple notebooks or jobs:
- Package your code into a wheel file.
- Upload the wheel file to your Databricks workspace or an accessible location (e.g., DBFS).
- Install the wheel file on your cluster using init scripts or pip install commands.
You can also package your custom data source as a library and install it directly on your cluster.
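For the wheel route, here is a rough sketch of the packaging side. The project name, version, and install path below are placeholders, not an official template; a minimal setup.py might look like:

    # setup.py -- hypothetical packaging script for the custom data source
    from setuptools import setup, find_packages

    setup(
        name="my_custom_datasource",   # placeholder package name
        version="0.1.0",
        packages=find_packages(),      # the package containing your DataSource subclass
        install_requires=[],           # pyspark is already provided by the Databricks runtime
    )

Build the wheel locally (e.g., python -m build or python setup.py bdist_wheel), upload the resulting .whl to your workspace or another accessible location, and install it from a notebook cell or as a cluster library:

    %pip install /Workspace/Shared/libs/my_custom_datasource-0.1.0-py3-none-any.whl

Then import and register it the same way as with %run:

    from my_custom_datasource import MyCustomDataSource
    spark.dataSource.register(MyCustomDataSource)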
01-10-2025 10:46 AM
@Alberto_Umana Brilliant, thank you!
01-10-2025 10:52 AM
You're very welcome!

