Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
I have a job running multiple tasks: Task 1 runs a machine learning pipeline from git repo 1. Task 2 runs an ETL pipeline from git repo 1. Task 2 is actually a generic pipeline and should not be checked in repo 1, and will be made available in another re...
The way to go about this is to create Databricks repos in the workspace and then reference them when defining the tasks. This way, different tasks can refer to different repos.
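As a rough illustration of that suggestion, here is a minimal sketch that builds a Jobs API 2.1 style payload in which each task points at a notebook under a different workspace repo. The repo paths, task keys, job name, and cluster ID are all hypothetical placeholders, and the sketch only builds and prints the JSON; it does not call the workspace.

```python
import json


def build_job_payload(ml_notebook, etl_notebook, cluster_id):
    """Build a job definition whose two tasks use notebooks from different workspace repos."""

    def notebook_task(task_key, notebook_path):
        # "source": "WORKSPACE" tells the job to read the notebook from the
        # workspace path (here, a folder under /Repos) instead of a remote git ref.
        return {
            "task_key": task_key,
            "existing_cluster_id": cluster_id,
            "notebook_task": {
                "notebook_path": notebook_path,
                "source": "WORKSPACE",
            },
        }

    return {
        "name": "multi-repo-job",  # hypothetical job name
        "tasks": [
            notebook_task("ml_pipeline", ml_notebook),
            notebook_task("etl_pipeline", etl_notebook),
        ],
    }


# Hypothetical paths: each notebook lives in a separate repo checked out under /Repos.
payload = build_job_payload(
    "/Repos/team/repo-1/pipelines/train",
    "/Repos/team/repo-2/pipelines/etl",
    "0000-000000-example",
)
print(json.dumps(payload, indent=2))
```

The printed JSON could then be submitted as the body of a job-creation request. Whether the tasks should share a cluster or depend on each other is left out here because the original post does not say.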
I'm using Databricks' support for GitHub repos. When I switch from one branch to another while a notebook is open, it messes up my notebook. Specifically, every notebook cell appears twice after switching branches.
I'm looking to automate the creation of top-level repositories in Databricks. However, this isn't possible using the CLI or API if the repo is a private repository (Azure DevOps Repository), because it requires setting up the token in the user settings. databricks repos create \...
We have an API available for repos (https://docs.databricks.com/dev-tools/api/latest/repos.html#operation/get-repos), and we currently support service principals (SP). Step 1: As an admin, create a service principal. Use this API: SCIM API 2.0 (ServicePrincipals) | Databric...
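For orientation only, here is a minimal sketch of what a repo-creation call against the Repos API might look like for an Azure DevOps repository. It is not the truncated procedure from the reply above: the host, token, repository URL, and workspace path are hypothetical placeholders, and it assumes the calling identity (for example a service principal) already has Git credentials configured. The request is built but not sent.

```python
import json
import urllib.request


def build_create_repo_request(host, token, repo_url, workspace_path):
    """Build (but do not send) a POST request to the Repos API endpoint."""
    body = {
        "url": repo_url,
        "provider": "azureDevOpsServices",  # provider value for Azure DevOps
        "path": workspace_path,
    }
    return urllib.request.Request(
        url=f"{host.rstrip('/')}/api/2.0/repos",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# All values below are hypothetical placeholders.
request = build_create_repo_request(
    host="https://adb-0000000000000000.0.azuredatabricks.net",
    token="<token-for-the-calling-identity>",
    repo_url="https://dev.azure.com/example-org/example-project/_git/example-repo",
    workspace_path="/Repos/example-folder/example-repo",
)
print(request.get_method(), request.full_url)
# Calling urllib.request.urlopen(request) would send it; omitted so the sketch has no side effects.
```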
We use VS Code for development and create a wheel package for shipment. We put this wheel package in Azure Data Lake Storage, and an Azure Databricks (ADB) notebook accesses the wheel package and installs it on the cluster. It is working fine. But instead of keeping th...