Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Zachary_Higgins
by Contributor
  • 15945 Views
  • 9 replies
  • 13 kudos

'ignoreDeletes' option with Delta Live Table streaming source

We have a Delta streaming source in our Delta Live Tables pipelines that may have data deleted from time to time. The error message is pretty self-explanatory: ...from streaming source at version 191. This is currently not supported. If you'd like to i...

Latest Reply
IanB_Argento
New Contributor II
  • 13 kudos

I had this same issue whilst doing some POC work. I was able to overcome it as follows: Navigate to Workflows | Jobs & pipelines. Select your pipeline. Click the drop-down next to the Start button. Choose "Full refresh all". That resets it all and fixes t...

  • 13 kudos
8 More Replies
Pavankumar7
by New Contributor III
  • 3805 Views
  • 6 replies
  • 4 kudos

Resolved! Error in connecting serverless compute in free edition

I am unable to connect to serverless compute under the Free Edition of Databricks. Also, in the Compute tab I can see only three tabs (SQL warehouses, Vector search, Apps), and I am not able to create new compute as we used to in Community Edition.

Latest Reply
Thomas_W
New Contributor III
  • 4 kudos

@Pavankumar7 - are you experiencing this issue for existing/imported notebooks, or for brand new notebooks too? If it's the former, the notebook may be using an old serverless environment version. When Databricks updates the Serverless environment, ex...

  • 4 kudos
5 More Replies
pacman
by New Contributor
  • 18898 Views
  • 7 replies
  • 0 kudos

How to run a saved query from a Notebook (PySpark)

Hi Team! Noob to Databricks, so apologies if I ask a dumb question. I have created a relatively large series of queries that fetch and organize the data I want. I'm ready to drive all of these from a Notebook (likely PySpark). An example query is save...

Latest Reply
aethorimn_cgr
New Contributor II
  • 0 kudos

@uday_satapathy Hi Uday. Do you know if this method works for many users? In case I need to share the script so a teammate may use it.

  • 0 kudos
6 More Replies
Pratikmsbsvm
by Contributor
  • 2339 Views
  • 2 replies
  • 2 kudos

Resolved! Data Lakehouse architecture with Azure Databricks and Unity Catalog

I am creating a data lakehouse solution on Azure Databricks. Source: SAP, Salesforce, Adobe. Target: Hightouch (external application), Mad Mobile (external application). The data lakehouse also has transactional records which should be stored in ACID pro...

Latest Reply
KaranamS
Contributor III
  • 2 kudos

Hi @Pratikmsbsvm, from what I understand, you have a lakehouse on Azure Databricks and would like to share this data with another Databricks account or workspace. If Unity Catalog is enabled on your Azure Databricks account, you can leverage Delta S...

  • 2 kudos
1 More Replies
data_learner1
by New Contributor II
  • 1568 Views
  • 4 replies
  • 1 kudos

Need to track the schema changes/column renames/column drops in Databricks Unity Catalog

Hi Team, we are getting data from a third-party vendor into Databricks Unity Catalog. They make schema changes frequently and we would like to track them. I just wanted to know if I can do this using the audit table in the system catalog. As we only h...

Latest Reply
CURIOUS_DE
Valued Contributor
  • 1 kudos

@data_learner1 Unity Catalog logs all data access and metadata operations (including schema changes) into the audit logs, which are stored in the system catalog tables, such as: system.access.audit. You mentioned you only have read access, and likely...

  • 1 kudos
3 More Replies
NikosLoutas
by Databricks Partner
  • 2928 Views
  • 2 replies
  • 0 kudos

Resolved! Databricks Full Refresh of DLT Pipeline

Hello, I have a question regarding the full refresh of a DLT pipeline where the data source is an external table. When running the pipeline without a full refresh, the stream will pull the data that is currently present in the external source ...

Latest Reply
seeyesbee
New Contributor II
  • 0 kudos

Hi @paolajara, in your point 5 you mentioned using Delta Lake for tracking changes. Could you point me to any official docs or examples that walk through enabling CDC / row-tracking on a Delta table? I pull data from SharePoint via its REST endpoint,...

  • 0 kudos
1 More Replies
Pratikmsbsvm
by Contributor
  • 1979 Views
  • 2 replies
  • 0 kudos

How to build an architecture for batch as well as streaming data pipelines in Databricks

Hello, I am planning to create a data lakehouse using Azure and Databricks. Earlier I planned to do it with Azure alone, but the use cases look complex. Can someone please help me with suggestions? Source systems: SAP, Salesforce, SAP CAR, Adobe Clickstream. Consume...

Latest Reply
SP_6721
Honored Contributor II
  • 0 kudos

Hi @Pratikmsbsvm, the appropriate approach would be:
Data Ingestion: Ingest data from SAP, SAP CAR, and Salesforce using Azure Data Factory or third-party connectors. For near real-time updates, enable CDC-based ingestion.
Data Lakehouse Storage: Store a...

  • 0 kudos
1 More Replies
guizsantos
by New Contributor II
  • 4639 Views
  • 3 replies
  • 3 kudos

Resolved! How to obtain a query profile programmatically?

Hi everyone! Does anyone know if there is a way to obtain the data used to create the graph shown in the "Query profile" section? In particular, I am interested in the rows produced by the intermediate query operations. I can see there is "Download" ...

Latest Reply
artsheiko
Databricks Employee
  • 3 kudos

@guizsantos, the Query History list API provides metrics; see include_metrics. The definition of an executed query can be seen using the query history system table.
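As a rough illustration of the reply above, here is a minimal Python sketch that only builds (and does not send) a request to the Query History list endpoint with `include_metrics` switched on. The endpoint path `/api/2.0/sql/history/queries`, the `max_results` parameter, and the host and token placeholders are assumptions to check against the current REST API reference for your workspace; the helper name is hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_query_history_request(host: str, token: str, max_results: int = 10) -> Request:
    """Build a GET request for the query history list, asking for metrics.

    Assumption: the list endpoint lives at /api/2.0/sql/history/queries and
    accepts include_metrics and max_results as query parameters.
    """
    params = urlencode({"include_metrics": "true", "max_results": max_results})
    url = f"{host.rstrip('/')}/api/2.0/sql/history/queries?{params}"
    # The request object is only constructed here; nothing is sent.
    return Request(url, headers={"Authorization": f"Bearer {token}"}, method="GET")


# Placeholder host and token, for illustration only.
request = build_query_history_request("https://example.cloud.databricks.com", "TOKEN")
print(request.full_url)
```

Passing the result to `urllib.request.urlopen` would actually issue the call. Whether the returned metrics cover the per-operator row counts asked about in the question is not something this sketch confirms.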

  • 3 kudos
2 More Replies
seefoods
by Valued Contributor
  • 1851 Views
  • 1 reply
  • 1 kudos

Resolved! python task

Hello guys, I have defined an asset bundle which has a rule to run a Python task. This task has some parameters, so how can I interact with them using argparse? Cordially,

Latest Reply
SP_6721
Honored Contributor II
  • 1 kudos

Hi @seefoods, in your asset bundle YAML, define the parameters using the named_parameters field, for example like this:
tasks:
  - task_key: python_task
    python_wheel_task:
      entry_point: main
      named_parameters:
        input_path: "/data/input...
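On the Python side, a minimal argparse sketch for reading such a parameter could look like the following. It assumes the named parameters reach the entry point as `--key=value` command-line arguments, which is worth verifying against the Jobs documentation for your task type. `input_path` mirrors the key in the reply; `output_path`, the function name, and the demo path are hypothetical and only there for illustration.

```python
import argparse


def parse_task_args(argv=None):
    """Parse task parameters passed as --key=value command-line arguments."""
    parser = argparse.ArgumentParser(description="Example Python wheel task")
    # Mirrors the `input_path` key under named_parameters in the reply.
    parser.add_argument("--input_path", required=True)
    # Hypothetical optional parameter, not part of the original reply.
    parser.add_argument("--output_path", default=None)
    # parse_known_args tolerates extra arguments instead of exiting on them.
    args, _unknown = parser.parse_known_args(argv)
    return args


# Local dry run with a made-up path; inside a job, argv=None reads sys.argv.
demo_args = parse_task_args(["--input_path=/tmp/example/in"])
print(demo_args.input_path)
```

The wheel's `main` entry point would then call `parse_task_args()` with no arguments and read `args.input_path`.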

  • 1 kudos
mkwparth
by Databricks Partner
  • 2322 Views
  • 4 replies
  • 1 kudos

Spark Failed to start: Driver unresponsive

Hi everyone, I'm encountering an intermittent issue when launching a Databricks pipeline cluster. Error message: com.databricks.pipelines.common.errors.deployment.DeploymentException: Failed to launch pipeline cluster xxxx-xxxxxx-ofgxxxxx: Attempt to la...

Latest Reply
Gopichand_G
Databricks Partner
  • 1 kudos

I have personally witnessed these kinds of issues. As far as I have seen, these failures usually happen because the driver node is unavailable or unresponsive: you might have hit the maximum CPU or memory usage, or maybe your cache utili...

  • 1 kudos
3 More Replies
skooijman
by New Contributor II
  • 3820 Views
  • 4 replies
  • 7 kudos

dbt_project.yml won't load in databricks dbt job

We're running into issues with dbt jobs, which are not running anymore. The errors we receive suggest that the dbt_project.yml file cannot be found, while the profiles.yml can be found. We are running our dbt jobs with Databricks Workflows. We've tri...

Latest Reply
LokmenChouaya
New Contributor II
  • 7 kudos

Hello, are there any updates regarding this issue, please? I'm having the same problem in my prod environment.

  • 7 kudos
3 More Replies
Phani1
by Databricks MVP
  • 4309 Views
  • 1 reply
  • 0 kudos

Databricks AI (LLM) Functionalities: Data Privacy and Security

Hi Databricks Team, when leveraging Databricks' AI (LLM) functionalities, such as ai_query and ai_assistant, how does Databricks safeguard customer data and ensure privacy, safety, and security? Regards, Phani

Latest Reply
Vinay_M_R
Databricks Employee
  • 0 kudos

Hello @Phani1, Databricks employs a multi-layered security approach to protect customer data when using AI functionalities like ai_query and Databricks Assistant. I am sharing the official documentation below for your reference: https://learn.microsoft.co...

  • 0 kudos
Marvin_T
by Databricks Partner
  • 22805 Views
  • 3 replies
  • 2 kudos

Resolved! Disabling query caching for SQL Warehouse

Hello everybody, I am currently trying to run some performance tests on queries in Databricks on Azure. For my tests, I am using a Classic SQL Warehouse in the SQL Editor. I have created two views that contain the same data but have different structur...

Latest Reply
Marvin_T
Databricks Partner
  • 2 kudos

They are probably executing the same query plan, now that you mention it. And yes, restarting the warehouse does work in theory, but it isn't a nice solution. I guess I will do some restarting and build averages to have a good comparison for now.

  • 2 kudos
2 More Replies
KristiLogos
by Contributor
  • 1934 Views
  • 2 replies
  • 0 kudos

Netsuite error - The driver could not open a JDBC connection. Check the URL

I'm trying to connect to Netsuite2 with the JDBC driver I added to my cluster. I'm testing this in my Sandbox Netsuite and I have the below code, but it keeps saying: requirement failed: The driver could not open a JDBC connection. Check the URL: jdbc:...

Latest Reply
TheOC
Databricks Partner
  • 0 kudos

Hey @KristiLogos, I had a little search online and found this, which may be useful: https://stackoverflow.com/questions/79236996/pyspark-jdbc-connection-to-netsuite2-com-fails-with-failed-to-login-using-tba In short, it seems that a token based connection...

  • 0 kudos
1 More Replies
seapen
by New Contributor II
  • 1456 Views
  • 1 reply
  • 0 kudos

[Question]: Get permissions for a schema containing backticks via the API

I am unsure if this is specific to the Java SDK, but I am having issues checking effective permissions on the following schema: databricks_dev.test_schema` In Scala I have the following example test: test("attempting to access schema with backtick") ...

Latest Reply
seapen
New Contributor II
  • 0 kudos

Update: Interestingly, if I URL encode _twice_ it appears to work, e.g.: test("attempting to access schema with backtick") { val client = new WorkspaceClient() client.config().setHost("redacted").setToken("redacted") val name = "databricks...
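To make the double-encoding idea concrete, here is a small Python sketch (not the Scala/Java SDK code from the post) showing what encoding the schema name once and twice produces. The helper name `double_encode` is hypothetical, and the sketch only demonstrates the string transformation; it says nothing about why the API accepts the doubly encoded form.

```python
from urllib.parse import quote


def double_encode(name: str) -> str:
    """Percent-encode a name twice, so '%' from the first pass becomes '%25'."""
    once = quote(name, safe="")
    return quote(once, safe="")


# Schema name from the post, including the trailing backtick.
schema = "databricks_dev.test_schema`"
print(quote(schema, safe=""))  # single pass: backtick becomes %60
print(double_encode(schema))   # second pass: %60 becomes %2560
```

If the client library already encodes path segments once, encoding the name by hand before passing it in would yield the doubly encoded form on the wire, which may be what the workaround relies on.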

  • 0 kudos