Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

MauricioS
by New Contributor III
  • 979 Views
  • 1 reply
  • 0 kudos

Need advice for a big source table DLT Pipeline

Hi all, I was hoping to get advice from someone experienced with DLT pipelines. I want to apologize in advance if this is a noob question; I'm really new to DLT, materialized views, and streaming tables. I have the following scenario: my source is a big sales delt...

Latest Reply
lingareddy_Alva
Honored Contributor III
  • 0 kudos

Hi @MauricioS Absolutely not a noob question — you're touching on a common and important challenge in DLT pipelines, especially when dealing with large shared Delta tables and incremental ingestion from Unity Catalog sources. Let's break it down so it'...

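A minimal sketch of the incremental pattern the reply points toward, with hypothetical table and column names: define the big source as a DLT streaming table so each pipeline update processes only newly appended rows instead of rescanning the whole sales table.

import dlt
from pyspark.sql import functions as F

@dlt.table(
    name="sales_bronze",
    comment="Incremental ingest of the large sales source table"
)
def sales_bronze():
    # Streaming read from a Unity Catalog Delta table: each pipeline update
    # picks up only new rows rather than re-reading the full source.
    return (
        spark.readStream
        .table("main.sales.sales_source")  # hypothetical source table
        .withColumn("_ingested_at", F.current_timestamp())
    )
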
Prashant777
by New Contributor II
  • 8314 Views
  • 6 replies
  • 0 kudos

Error in SQL statement: UnsupportedOperationException: Cannot perform Merge as multiple source rows matched and attempted to modify the same

My code:
CREATE OR REPLACE TEMPORARY VIEW preprocessed_source AS
SELECT Key_ID, Distributor_ID, Customer_ID, Customer_Name, Channel
FROM integr_masterdata.Customer_Master;
-- Step 2: Perform the merge operation using the preprocessed source table
M...

Latest Reply
LokeshManne
New Contributor III
  • 0 kudos

This error occurs when we try to update all the rows of target_data without a single updated record in source_data (updates_data). To resolve this issue, add an update_time column with a Unix timestamp, or make changes in at least one cell of streamin...

5 More Replies
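
A minimal sketch of the usual fix for this error, with a hypothetical target table name: deduplicate the source view so each Key_ID matches at most one row before the MERGE runs.

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Keep exactly one source row per merge key so no target row can be
# matched by more than one source row.
spark.sql("""
CREATE OR REPLACE TEMPORARY VIEW deduped_source AS
SELECT Key_ID, Distributor_ID, Customer_ID, Customer_Name, Channel
FROM (
  SELECT *, ROW_NUMBER() OVER (PARTITION BY Key_ID ORDER BY Customer_ID) AS rn
  FROM integr_masterdata.Customer_Master
)
WHERE rn = 1
""")

spark.sql("""
MERGE INTO integr_masterdata.Customer_Target AS t  -- hypothetical target
USING deduped_source AS s
ON t.Key_ID = s.Key_ID
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *
""")
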
Ru
by New Contributor III
  • 1290 Views
  • 2 replies
  • 2 kudos

Resolved! CDF metadata columns are lost after importing dlt

Hi Databricks Community, I attempted to read the Change Feed from a CDF-enabled table. Initially, the correct table schema, including the metadata columns (_change_type, _commit_version, and _commit_timestamp), was returned as expected. However, afte...

Latest Reply
Louis_Frolio
Databricks Employee
  • 2 kudos

The issue stems from the interaction between the Change Data Feed (CDF) metadata columns (_change_type, _commit_version, _commit_timestamp) and the Delta Live Tables (DLT) library. After you import the dlt module, the behavior of reading the CDF-enab...

1 More Reply
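
For readers hitting the same thing, a minimal sketch of the documented change-feed reader, with a hypothetical table name; the CDF metadata columns arrive as ordinary columns on the returned DataFrame.

cdf = (
    spark.read
    .option("readChangeFeed", "true")
    .option("startingVersion", 0)   # or startingTimestamp
    .table("main.bronze.orders")    # hypothetical CDF-enabled table
)

# The metadata columns travel with the DataFrame.
cdf.select("_change_type", "_commit_version", "_commit_timestamp").show()
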
Dave1967
by New Contributor III
  • 2910 Views
  • 4 replies
  • 3 kudos

Resolved! Serverless compute does not support caching DataFrames

Can anyone please tell me why df.cache() and df.persist() are not supported in serverless compute? Many thanks!

Latest Reply
kunalmishra9
Contributor
  • 3 kudos

What I do wish were possible is for serverless to warn that caching is not supported rather than erroring on the call. The hard error makes switching between compute types (serverless and all-purpose) brittle and prevents code from being easily interoperable, no matter the comp...

3 More Replies
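
A minimal sketch of the interoperability shim the reply wishes for, with hypothetical naming: attempt to cache, and fall back to the uncached DataFrame on compute where caching raises.

from pyspark.sql import DataFrame

def try_cache(df: DataFrame) -> DataFrame:
    """Cache when the compute supports it; otherwise warn and continue."""
    try:
        return df.cache()
    except Exception as e:  # serverless raises instead of warning
        print(f"cache() unsupported on this compute, continuing uncached: {e}")
        return df
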
Kumaril_02
by New Contributor
  • 1756 Views
  • 1 reply
  • 0 kudos

Cannot Create Table under catalog.schema

AnalysisException: [RequestId=75cd00bc-7274-48c5-bdb2-c86a05de227f ErrorClass=TABLE_DOES_NOT_EXIST.RESOURCE_DOES_NOT_EXIST] Table '643a51ba-70c9-41ac-b75d-9c0f9039e7c1' does not exist. I am getting this issue while creating the table under the catalo...

Latest Reply
Brahmareddy
Esteemed Contributor
  • 0 kudos

Hi Kumaril_02, how are you doing today? As per my understanding, it looks like the error you're getting, saying the table with a long ID doesn't exist, is probably happening because Databricks is trying to reference a table using its internal ID rather...

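A minimal sketch along the lines of the reply, with hypothetical names: drop any stale registration and recreate the table with a fully qualified three-level identifier so Unity Catalog resolves it by name rather than by a dangling internal ID.

spark.sql("DROP TABLE IF EXISTS my_catalog.my_schema.my_table")  # clear stale metadata
spark.sql("""
CREATE TABLE my_catalog.my_schema.my_table (
    id BIGINT,
    name STRING
)
""")
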
vanagnostopoulo
by New Contributor III
  • 844 Views
  • 2 replies
  • 0 kudos

If/else task branches

Hi, I have an If/else task, say A, and two other tasks, B and C. For the false outcome I would like to execute task B. For the true branch I would like to execute task C followed by task B. What is the correct way to express the dependencies of B on th...

Latest Reply
vanagnostopoulo
New Contributor III
  • 0 kudos

For sure one solution is to package everything in a separate job. Other options?

1 More Reply
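
A minimal sketch of that packaging approach as a Jobs API task list, with hypothetical task keys and job ID: task B lives in its own job, and both branches invoke it via run_job_task, so the true branch runs C first and then B.

tasks = [
    {"task_key": "A_check",
     "condition_task": {"op": "EQUAL_TO",
                        "left": "{{job.parameters.flag}}", "right": "true"}},
    # True branch: C, then B (as a child job).
    {"task_key": "C",
     "depends_on": [{"task_key": "A_check", "outcome": "true"}],
     "notebook_task": {"notebook_path": "/Tasks/C"}},
    {"task_key": "B_after_C",
     "depends_on": [{"task_key": "C"}],
     "run_job_task": {"job_id": 1234}},  # hypothetical job wrapping B
    # False branch: B directly.
    {"task_key": "B_direct",
     "depends_on": [{"task_key": "A_check", "outcome": "false"}],
     "run_job_task": {"job_id": 1234}},
]
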
Avinash_Narala
by Valued Contributor II
  • 1632 Views
  • 2 replies
  • 0 kudos

Serverless Cluster Issue

Hi, while using a serverless cluster I'm not able to access DBFS files; it says I don't have permission on the file. But when accessing them using an all-purpose cluster, I'm able to access them. Why am I facing this issue?

Latest Reply
RobertWhite
New Contributor II
  • 0 kudos

You might be encountering this issue due to permission differences between the serverless and all-purpose clusters. Serverless environments often have restricted access for enhanced security. Make sure the appropriate IAM roles or access controls are...

1 More Reply
VanessaSousa_Ol
by New Contributor
  • 1332 Views
  • 1 reply
  • 0 kudos

RLS and CLS with Delta Sharing

Is it possible to apply RLS and CLS to tables that are shared using Unity Catalog?

Latest Reply
Vidhi_Khaitan
Databricks Employee
  • 0 kudos

Hi Vanessa, RLS and CLS are not directly supported in Delta Sharing, but as an alternative you can add dynamic views to a share to filter rows and columns. Please find the documentation below: https://docs.databricks.com/aws/en/delta-sharing/create-...

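A minimal sketch of that alternative, with hypothetical names: a view that omits PII columns and filters rows per recipient property, added to the share in place of the base table.

spark.sql("""
CREATE VIEW main.shared.customers_filtered AS
SELECT id, name, region                      -- CLS: omit PII columns
FROM main.core.customers
WHERE region = CURRENT_RECIPIENT('region')   -- RLS: per-recipient filter
""")
spark.sql("ALTER SHARE my_share ADD VIEW main.shared.customers_filtered")
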
hpant
by New Contributor III
  • 1007 Views
  • 1 reply
  • 0 kudos

Difference between creating a schema manually vs. through SQL code externally?

I have created a bronze schema manually using Catalog -> Create schema, and I provided an external location. The "details" table looks like this: However, when I created the silver schema, this time using a SQL script, i.e. %sql CREATE SCHEMA xyz.silver MANAGED ...

Latest Reply
Vidhi_Khaitan
Databricks Employee
  • 0 kudos

When a schema is created manually via the Databricks catalog UI, ownership defaults to the user who created the schema, and ownership properties may not be explicitly recorded by default. On the other hand, when creating a schema using SQL commands, D...

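For comparison, a minimal sketch of the SQL route with hypothetical names and storage location: the managed location is spelled out in the statement, and ownership can then be set explicitly to match what the UI records.

spark.sql("""
CREATE SCHEMA IF NOT EXISTS xyz.silver
MANAGED LOCATION 'abfss://container@account.dfs.core.windows.net/silver'
""")
# Optionally align ownership with what the UI would have recorded.
spark.sql("ALTER SCHEMA xyz.silver OWNER TO `data-engineers`")
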
Lloydy
by New Contributor
  • 3582 Views
  • 1 reply
  • 0 kudos

What is the possible cause of this error when calling the Databricks job permissions API?

PATCH /api/2.0/permissions/jobs/{job_id}
{
  "error_code": "INVALID_PARAMETER_VALUE",
  "message": "Owner permissions cannot be modified via an update / PATCH request if the endpoint does not have a valid owner. Please use a set / PUT request ins...

Latest Reply
Vidhi_Khaitan
Databricks Employee
  • 0 kudos

Hi team, this seems to be an expected behaviour. The PATCH endpoint for updating job permissions is designed for incremental modifications of existing permissions. However, modifying owner permissions is restricted unless a valid owner already exists i...

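A minimal sketch of the suggested set/PUT call, with hypothetical host, token, job ID, and principals: PUT replaces the full permission list, which lets a valid owner be established in the same request.

import requests

resp = requests.put(
    "https://<workspace-host>/api/2.0/permissions/jobs/123",
    headers={"Authorization": "Bearer <token>"},
    json={"access_control_list": [
        {"user_name": "owner@example.com", "permission_level": "IS_OWNER"},
        {"group_name": "data-team", "permission_level": "CAN_MANAGE_RUN"},
    ]},
)
resp.raise_for_status()
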
Arby
by New Contributor II
  • 17908 Views
  • 5 replies
  • 0 kudos

Help With OSError: [Errno 95] Operation not supported: '/Workspace/Repos/Connectors....

Hello, I am experiencing issues with importing the schema file I created from the utils repo. This is the logic we use for all ingestion, and all other schemas live in this repo under utills/schemas. I am unable to access the file I created for a new ingestion pipe...

Latest Reply
HarikaM
New Contributor II
  • 0 kudos

@Arb Make sure the below is a file with extension .py and not a notebook. That should resolve the issue: /Workspace/Repos/Connectors/Dev/utils/schemas/Comptroller.py

4 More Replies
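
Once the file is a real .py module, a minimal sketch of one way to import it from the Repos path in the reply; the sys.path line and the `schema` attribute name are assumptions.

import sys

# Make the repo's utils folder importable (path taken from the reply).
sys.path.append("/Workspace/Repos/Connectors/Dev/utils")

from schemas.Comptroller import schema  # assumes Comptroller.py defines `schema`
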
Brad
by Contributor II
  • 3726 Views
  • 1 reply
  • 0 kudos

How to specify init file path

Hi team, I want to create a job and install some libs on the job cluster. 1. For a job cluster, I understand we can specify libraries under the task, but if I want to install the lib on the whole cluster, it seems the only way is to use an init script, right? 2. In my env,...

Latest Reply
kamal_ch
Databricks Employee
  • 0 kudos

There is no direct method to dynamically specify the user ID in this path across environments.  If dynamic determination of the user ID per environment is essential, you may need additional scripting or automation to set the correct path before execu...

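On point 1, a minimal sketch with hypothetical paths and node types: an init script attached in the job's cluster spec runs on every node at startup, installing libraries cluster-wide rather than per task.

new_cluster = {
    "spark_version": "15.4.x-scala2.12",  # hypothetical runtime
    "node_type_id": "i3.xlarge",
    "num_workers": 2,
    "init_scripts": [
        # Workspace-file init script; runs on each node at cluster start.
        {"workspace": {"destination": "/Users/someone@example.com/init/install_libs.sh"}}
    ],
}
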
Nick_Pacey
by New Contributor III
  • 3666 Views
  • 1 reply
  • 1 kudos

Catalog Volume Preview for PDF files

Hi, does anyone know if and when Databricks will enable the Preview option to work for PDF files stored in a Catalog Volume? Thanks! Nick

Latest Reply
kamal_ch
Databricks Employee
  • 1 kudos

This capability has not been explicitly mentioned or confirmed in the available documentation. Further clarification or inquiries with Databricks product teams may be useful to confirm whether such functionality is planned for future releases.

NehaR
by New Contributor III
  • 3827 Views
  • 1 reply
  • 0 kudos

Access control on view

 We've created a view containing PII mapping values in Unity Catalog. We need to grant other users the ability to query this view and access the mapping values, but we must prevent them from seeing the view's definition. Is it possible to grant "exec...

Latest Reply
kamal_ch
Databricks Employee
  • 0 kudos

In Unity Catalog, enabling users to query a view while hiding its definition is currently not supported directly. Unity Catalog requires view definitions to be visible during query execution for metadata access purposes. While querying, users typical...

KristiLogos
by Contributor
  • 4149 Views
  • 1 reply
  • 0 kudos

Making dynamic tasks like in Airflow, but in Databricks?

I've used Airflow, which allows us to create a DAG with dynamic tasks. For example, we can have a list of items (such as table names), loop through an operator that accepts a table name, and create a task for each table without having to create a new n...

Latest Reply
kamal_ch
Databricks Employee
  • 0 kudos

Yes, it is possible to create dynamic tasks in Databricks workflows, similar to the approach using Apache Airflow, by leveraging Databricks' job orchestration capabilities. However, the implementation may differ from Airflow's dynamic DAG creation. D...

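A minimal sketch of the closest built-in analogue, with hypothetical values: the Jobs "For each" task fans a single task definition out over a list of inputs, much like Airflow's dynamic task mapping.

tasks = [{
    "task_key": "ingest_all_tables",
    "for_each_task": {
        "inputs": '["customers", "orders", "payments"]',  # one iteration per item
        "task": {
            "task_key": "ingest_one_table",
            "notebook_task": {
                "notebook_path": "/Jobs/ingest_table",
                "base_parameters": {"table_name": "{{input}}"},
            },
        },
    },
}]
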
