Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
I have a Repo in Databricks connected to Azure DevOps Repositories. The repo has been working fine for almost a month, until last week. Now when I try to open the Git settings in Databricks, I am getting "Invalid Git Credentials". Nothing has change...
Hello Experts, we have Databricks on Azure. We need to provide a user interface so that end users can enter customizing table entries, which in turn are saved in a Delta table. Is there any feature in Databricks or tools that w...
I achieved this using a textbox widget and asked the user to enter values and run the dashboard; the Python code validates and inserts the data in Databricks. A second method: I created an Excel VBA application that uses an ODBC connection to insert, delete, and update data in the Delta tab...
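A minimal sketch of the first approach, assuming a notebook with text widgets and an existing Delta table named config_entries (the widget and table names are placeholders, not from the original post):

# Minimal sketch: capture user input via notebook widgets and append it to a Delta table.
# Assumes a Delta table `config_entries` with columns (key STRING, value STRING) already exists.
dbutils.widgets.text("config_key", "", "Configuration key")
dbutils.widgets.text("config_value", "", "Configuration value")

key = dbutils.widgets.get("config_key").strip()
value = dbutils.widgets.get("config_value").strip()

# Basic validation before writing
if not key:
    raise ValueError("Configuration key must not be empty")

spark.createDataFrame([(key, value)], "key STRING, value STRING") \
    .write.format("delta").mode("append").saveAsTable("config_entries")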
You can use the Kusto Spark connector for that: https://github.com/Azure/azure-kusto-spark/blob/master/docs/KustoSource.md#source-read-command
It heavily depends on how you access the data; there could be a need to use an ADX cluster for it: https://learn.mi...
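For reference, a minimal read sketch with the Kusto Spark connector, assuming service-principal authentication; the cluster, database, query, and credential values are placeholders rather than anything from the thread:

# Minimal sketch: read from Azure Data Explorer (Kusto) with the Kusto Spark connector.
# Cluster, database, app id/secret, and tenant below are placeholders.
df = (spark.read
      .format("com.microsoft.kusto.spark.datasource")
      .option("kustoCluster", "https://<cluster>.kusto.windows.net")
      .option("kustoDatabase", "<database>")
      .option("kustoQuery", "MyTable | where ingestion_time() > ago(1d)")
      .option("kustoAadAppId", "<app-id>")
      .option("kustoAadAppSecret", dbutils.secrets.get("kv-scope", "kusto-app-secret"))
      .option("kustoAadAuthorityID", "<tenant-id>")
      .load())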
We are trying to connect Cognos 11.1.7 to Azure Databricks, but with no success. Can you please help or guide us on how to connect Cognos 11.1.7 to Azure Databricks? This is very critical to our user community. Can you please help or guide us how to connect Co...
I have a simple job scheduled every 5 min. Basically it listens to cloudFiles on a storage account and writes them into a Delta table, extremely simple. The code is something like this:
df = (spark
.readStream
.format("cloudFiles")
.option('cloudFil...
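Since the snippet above is cut off, here is a minimal Auto Loader sketch along the same lines; the file format, paths, trigger, and target table name are placeholders I've assumed, not the poster's actual values:

# Minimal sketch: stream new files from cloud storage into a Delta table with Auto Loader.
# Paths and table name are placeholders.
df = (spark.readStream
      .format("cloudFiles")
      .option("cloudFiles.format", "json")
      .option("cloudFiles.schemaLocation", "/mnt/landing/_schema")
      .load("/mnt/landing/events"))

(df.writeStream
   .option("checkpointLocation", "/mnt/landing/_checkpoint")
   .trigger(availableNow=True)   # process available files, then stop
   .toTable("bronze.events"))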
When I run a job enabled to use spot instances, I would like to know how many workers are using spot instances and how many are using on-demand instances for a given job run. In order to identify the spot instances we got for any...
How to connect your Azure Data Lake Storage to Azure Databricks (Standard Workspace, Private Link): In your storage account, go to "Networking" -> "Private endpoint connections" and click Add Private Endpoint. It is important to add private links in ...
Hi amazing community folks, feel free to share your experience or knowledge regarding the questions below: 1.) Can we pass a CTE SQL statement into Spark JDBC? I tried to do it and couldn't, but I can pass normal SQL (SELECT * FROM ...) and it works. I heard th...
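As a point of reference, a minimal sketch of the plain JDBC read that does work, using Spark's documented query option; the connection details are placeholders. Spark wraps the value of query in a subquery when pushing it to the database, which is why a statement beginning with WITH (a CTE) typically fails, and flattening the CTE into a subquery is a common workaround:

# Minimal sketch: read from a JDBC source with the `query` option.
# Spark rewrites this as SELECT * FROM (<query>) alias, so a leading WITH clause
# is usually rejected by the database. Connection values are placeholders.
df = (spark.read
      .format("jdbc")
      .option("url", "jdbc:sqlserver://<host>:1433;databaseName=<db>")
      .option("query", "SELECT id, amount FROM sales WHERE amount > 100")
      .option("user", "<user>")
      .option("password", dbutils.secrets.get("kv-scope", "jdbc-password"))
      .load())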
I have set up authentication using this page https://docs.databricks.com/sql/api/authentication.html and run curl -n -X GET https://<databricks-instance>.cloud.databricks.com/api/2.0/sql/history/queries to get the history of all SQL endpoint queries, but I...
Here's how to query with databricks-sdk-py (working code). I had a frustrating time doing it with vanilla Python + requests/urllib and couldn't figure it out.
import datetime
import os
from databricks.sdk import WorkspaceClient
from databricks.sdk.se...
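The snippet above is cut off, so here is a hedged, self-contained sketch of the same idea using the SDK's query-history client; the filter classes (QueryFilter, TimeRange) and the 24-hour window are my assumptions, not the original poster's code:

# Hedged sketch: list recent SQL warehouse queries via databricks-sdk-py.
# Assumes DATABRICKS_HOST / DATABRICKS_TOKEN are set in the environment.
import datetime

from databricks.sdk import WorkspaceClient
from databricks.sdk.service.sql import QueryFilter, TimeRange  # assumed module path

w = WorkspaceClient()

# Look at the last 24 hours of query history (times are epoch milliseconds).
now = datetime.datetime.now(datetime.timezone.utc)
start_ms = int((now - datetime.timedelta(days=1)).timestamp() * 1000)
end_ms = int(now.timestamp() * 1000)

filter_by = QueryFilter(
    query_start_time_range=TimeRange(start_time_ms=start_ms, end_time_ms=end_ms)
)

for q in w.query_history.list(filter_by=filter_by, max_results=100):
    print(q.query_id, q.status, (q.query_text or "")[:80])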
The documentation states for "DROP TABLE": Deletes the table and removes the directory associated with the table from the file system if the table is not an EXTERNAL table. An exception is thrown if the table does not exist. In the case of an external table...
Hi, is there a way to force-delete files after dropping the table and not wait 30 days to see the size in S3 decrease? The tables I dropped relate to dev and staging, and I don't want to keep their files for 30 days.
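One common pattern, sketched below rather than a definitive answer, is to run VACUUM with a zero retention window before dropping the table; this requires disabling the retention-duration safety check and is only safe when nothing else reads or time-travels on the table. The catalog/schema/table name is a placeholder:

# Hedged sketch: purge old Delta files before dropping a dev/staging table.
# Only safe when no other reader or writer needs the old files.
spark.conf.set("spark.databricks.delta.retentionDurationCheck.enabled", "false")

# Remove files no longer referenced by the current table version, with no retention window.
spark.sql("VACUUM dev.my_schema.my_table RETAIN 0 HOURS")

# Then drop the table itself.
spark.sql("DROP TABLE dev.my_schema.my_table")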
Hi all, I'm just reaching out to see if anyone has information or can point me in a useful direction. I need to connect to Snowflake from Azure Databricks using the connector: https://learn.microsoft.com/en-us/azure/databricks/external-data/snowflakeT...
We ended up using the device flow OAuth because, as noted above, it is not possible to launch a browser on the Databricks cluster from a notebook, so you cannot use the "externalbrowser" flow. It gives you a URL and a code, and you open the URL in a new tab an...
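For context, a hedged sketch of what the read side can look like once a token has been obtained via the device flow; the sfAuthenticator/sfToken options are my assumption about the Snowflake Spark connector's OAuth support, and all connection values are placeholders:

# Hedged sketch: read from Snowflake using an OAuth token obtained out-of-band
# (e.g. via the device flow). Connection values and the secret scope are placeholders.
oauth_token = dbutils.secrets.get("kv-scope", "snowflake-oauth-token")  # hypothetical secret

sf_options = {
    "sfUrl": "<account>.snowflakecomputing.com",
    "sfDatabase": "<database>",
    "sfSchema": "<schema>",
    "sfWarehouse": "<warehouse>",
    "sfUser": "<user>",
    "sfAuthenticator": "oauth",
    "sfToken": oauth_token,
}

df = (spark.read
      .format("snowflake")
      .options(**sf_options)
      .option("dbtable", "MY_TABLE")
      .load())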
Hello, since a week ago our notebooks have been stuck running on the first cells, which import Python modules from our GitHub repository cloned in Databricks. The cells stay in the running state, and when we try to manually cancel the jobs in Databric...
I am currently using two streams to monitor data in two different containers on an Azure storage account. Is there any way to configure an autoloader to read from two different locations? The schemas of the files are identical.
@Morten Stakkeland: Yes, it's possible to configure an autoloader to read from multiple locations. You can define multiple cloudFiles sources for the autoloader, each pointing to a different container in the same storage account. In your case, since ...
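A minimal sketch of that pattern, assuming identical schemas in both containers; the container names, paths, and target table below are placeholders:

# Minimal sketch: two Auto Loader sources over different containers, unioned into one stream.
# Assumes both locations share the same schema; all paths/names are placeholders.
def read_container(path, schema_location):
    return (spark.readStream
            .format("cloudFiles")
            .option("cloudFiles.format", "json")
            .option("cloudFiles.schemaLocation", schema_location)
            .load(path))

stream_a = read_container("abfss://container-a@myaccount.dfs.core.windows.net/events",
                          "/mnt/meta/container_a/_schema")
stream_b = read_container("abfss://container-b@myaccount.dfs.core.windows.net/events",
                          "/mnt/meta/container_b/_schema")

(stream_a.unionByName(stream_b)
 .writeStream
 .option("checkpointLocation", "/mnt/meta/combined/_checkpoint")
 .toTable("bronze.events"))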
I have created a workspace with a private endpoint in Azure following this guide: https://learn.microsoft.com/en-us/azure/databricks/administration-guide/cloud-configurations/azure/private-link Once I have created the private link of type browser_authent...
You don't need a CNAME record. Go to your private link resource in Azure and click on Settings > DNS Configuration. Make sure you have created private link A records for all the FQDNs listed under 'Custom DNS records'. You have most likely missed one ...
We already know that we can mount Azure Data Lake Gen2 with OAuth2 using this:
configs = {"fs.azure.account.auth.type": "OAuth",
"fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
...
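Since the snippet is cut off, here is a hedged, self-contained version of that mount pattern for reference; the service principal values, secret scope, container, account, and mount point are placeholders:

# Hedged sketch: mount ADLS Gen2 with OAuth2 (service principal) credentials.
# Client id/secret/tenant, container, account, and mount point are placeholders.
configs = {
    "fs.azure.account.auth.type": "OAuth",
    "fs.azure.account.oauth.provider.type":
        "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id": "<application-id>",
    "fs.azure.account.oauth2.client.secret": dbutils.secrets.get("kv-scope", "sp-client-secret"),
    "fs.azure.account.oauth2.client.endpoint":
        "https://login.microsoftonline.com/<tenant-id>/oauth2/token",
}

dbutils.fs.mount(
    source="abfss://<container>@<storage-account>.dfs.core.windows.net/",
    mount_point="/mnt/datalake",
    extra_configs=configs,
)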