Data Engineering

Forum Posts

by brendanc19 • New Contributor III

03-07-2023 6:51:38 AM

4540 Views
6 replies
2 kudos

Resolved! Does cancelling a job run roll back any actions performed by the query plan?

If I were to stop a rather large job run, say halfway through execution, will any actions performed on our Delta tables persist or will they be rolled back? Are there any other risks that I need to be aware of in terms of cancelling a job run half way t...

Latest Reply

fabian_r
New Contributor II

12-03-2024 5:26:59 AM

2 kudos

Hi, is there any way in 2024 to ensure transaction control across tables in the Delta protocol for failing jobs?

5 More Replies

by techg • New Contributor II

12-02-2024 2:22:28 AM

578 Views
4 replies
1 kudos

Missing selection for Parameter error

Hi All, I have created three parameters in a SQL query in Databricks. If no value is entered for a parameter, I would like the query to retrieve all values for that particular column. Currently, I'm getting an error message: "Missing selection for Pa...

Latest Reply

techg
New Contributor II

12-03-2024 4:39:47 AM

1 kudos

I'm creating this query with parameters in the SQL Editor in Databricks and have added it to the SQL Dashboard. Do we need to create a widget while creating parameters in the SQL Editor? When I tried creating a widget in the SQL Editor, I got a syntax error near Widget...

3 More Replies

by Gusman • New Contributor II

12-03-2024 3:10:58 AM

411 Views
2 replies
1 kudos

Resolved! Natural language queries through REST API?

Natural language queries provided by Genie are really powerful and a compelling tool. Is there any way to execute these natural language queries through the REST API to integrate them into in-house applications?

Latest Reply

stacey45
New Contributor II

12-03-2024 3:29:11 AM

1 kudos

@Gusman wrote: Natural language queries provided by Genie are really powerful and a compelling tool. Is there any way to execute these natural language queries through the REST API to integrate them into in-house applications? While there's no direct RES...

1 More Replies

by Clara • New Contributor

12-03-2024 1:30:22 AM

221 Views
1 reply
1 kudos

Retrieve data older than the one year window : system.access.table_lineage

Hello, I am currently using table_lineage from system.access.table_lineage. It is a great feature, but I am experiencing missing data. After some searching, I saw that "Because lineage is computed on a one-year rolling window, lineage collected more ...

Latest Reply

szymon_dybczak
Esteemed Contributor III

12-03-2024 2:19:53 AM

1 kudos

Hi @Clara, I don't think so. But you can build such history tables yourself. Design an ETL process that extracts data from the system tables and stores it in your own tables.
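As a rough illustration of the ETL approach suggested in this reply, the sketch below builds an incremental append statement in Python. The history table name `main.audit.table_lineage_history` is hypothetical, and `event_time` is assumed to be the timestamp column of `system.access.table_lineage`; check both against your workspace. Running the statement (for example from a scheduled notebook) is deliberately left out so the snippet works anywhere.

```python
def build_lineage_snapshot_sql(
    history_table: str,
    source_table: str = "system.access.table_lineage",
    time_column: str = "event_time",  # assumed timestamp column, verify in your workspace
) -> str:
    """Return SQL that appends source rows newer than the latest row already kept."""
    return (
        f"INSERT INTO {history_table}\n"
        f"SELECT * FROM {source_table}\n"
        f"WHERE {time_column} > (\n"
        # Fall back to a very old timestamp when the history table is still empty.
        f"  SELECT COALESCE(MAX({time_column}), TIMESTAMP '1970-01-01 00:00:00')\n"
        f"  FROM {history_table}\n"
        f")"
    )


# Hypothetical target table; it must already exist with a matching schema.
sql = build_lineage_snapshot_sql("main.audit.table_lineage_history")
print(sql)
```

Scheduling this more often than the retention window means rows are copied before they age out. Using `SELECT *` keeps the sketch short, but an explicit column list would be more robust if the system table's schema changes.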

by sboxi • New Contributor II

11-29-2024 11:44:09 AM

313 Views
2 replies
1 kudos

Can we create a Materialized view on an existing view and table?

Dear All, Is it possible to create a Materialized view on a view and a table (joining the view and the table)? I suspect it is not possible. Please suggest. Also, please suggest the best way to schedule the refresh of a Materialized view. Regards, Surya

Latest Reply

sboxi
New Contributor II

12-03-2024 1:41:02 AM

1 kudos

Thanks @Alberto_Umana . I will try that.

1 More Replies

by TimW • New Contributor

11-12-2023 11:11:40 AM

4537 Views
4 replies
1 kudos

Resolved! Help - Can't create table from tutorial. Is my setup wrong?

Trying out Databricks for the first time and followed the Get Started steps. I managed to successfully create a cluster and ran the simple SQL tutorial to query data from a notebook. However, I got the following error: Query: DROP TABLE IF EXISTS diamond...

Latest Reply

patwilliams
New Contributor III

12-03-2024 12:46:38 AM

1 kudos

It seems as though you're doing great with your Databricks arrangement, however this sort of mistake could be connected with a couple of expected issues. In light of the subtleties you've shared, here are a few things you should check: Group Setup: Gu...

3 More Replies

by ashraf1395 • Honored Contributor

11-29-2024 6:57:02 AM

654 Views
5 replies
1 kudos

Empty Streaming tables in dlt

I want to create empty streaming tables in DLT with only the schema specified. Is it possible? I want to do it in DLT Python.

Latest Reply

Alberto_Umana
Databricks Employee

12-02-2024 5:16:20 AM

1 kudos

Hi @ashraf1395, The term "rate" refers to a special source in Apache Spark's Structured Streaming that generates data at a specified rate. This source is primarily used for testing and benchmarking purposes. When you use spark.readStream.format("rate...

4 More Replies

by ashraf1395 • Honored Contributor

12-02-2024 4:02:28 AM

770 Views
2 replies
2 kudos

Resolved! applying column tags

Can anyone tell me the correct syntax for applying a column tag to a specific table? This is what I tried: ALTER TABLE accounts_and_customer.bronze.BB1123_loans ALTER/CHANGE COLUMN loan_number SET TAGS ('classification' = 'confidential') I got thi...

Latest Reply

ashraf1395
Honored Contributor

12-03-2024 12:02:12 AM

2 kudos

Hi there @Takuya-Omi, I agree. The syntax was correct. I was facing some completely different problems with schemas and I solved them. Thanks though, otherwise I would have spent hours banging my head to find the reason for the error.

1 More Replies

by JothyGanesan • New Contributor III

12-02-2024 10:57:04 PM

376 Views
1 reply
1 kudos

CDF table partition - Real time Data

Hi team, We are currently working on loading a CDF table using data events from Kafka. The table is going to hold data across geographies. When we tried partitioning, it slowed down ingestion. But without partition the downstream application...

Latest Reply

ozaaditya
Contributor

12-02-2024 11:44:59 PM

1 kudos

1. Instead of using many small partitions (e.g., country or region), opt for larger partitions, such as continent or time-based partitions (e.g., weekly or monthly). This will reduce the number of partitions and improve performance.
2. Write data to ...
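To make the first suggestion concrete, here is a small Python sketch that compares how many partitions a fine-grained key (country plus day) and a coarser key (month) would produce for the same events. The sample events and the two key functions are made up for illustration; they are not from the original post.

```python
from datetime import datetime

# Hypothetical sample events: (country, event timestamp).
events = [
    ("DE", datetime(2024, 11, 30, 23, 15)),
    ("FR", datetime(2024, 12, 1, 8, 0)),
    ("DE", datetime(2024, 12, 1, 9, 30)),
    ("IN", datetime(2024, 12, 2, 22, 57)),
]


def fine_key(country: str, ts: datetime) -> tuple:
    """Fine-grained key: one partition per country per day."""
    return (country, ts.strftime("%Y-%m-%d"))


def monthly_key(ts: datetime) -> str:
    """Coarse key: one partition per calendar month."""
    return ts.strftime("%Y-%m")


fine_partitions = {fine_key(country, ts) for country, ts in events}
monthly_partitions = {monthly_key(ts) for _, ts in events}

# Number of distinct partitions each scheme would create for these events.
print(len(fine_partitions), len(monthly_partitions))
```

The gap between the two counts grows quickly as more days and countries arrive, and that partition count is what the reply suggests keeping small.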

by binsel • New Contributor III

12-01-2024 10:06:26 PM

634 Views
3 replies
2 kudos

Resolved! UNPIVOT VARIANT data in SQL

Hi All, I have a VARIANT column with the following data: CREATE TABLE unpivot_valtype AS SELECT parse_json( '{ "Id": 1234567, "Result": { "BodyType": "NG", "ProdType": "Auto", "ResultSets": [ { "R1": { "AIn...

Latest Reply

filipniziol
Esteemed Contributor

12-02-2024 12:45:27 AM

2 kudos

Hi @binsel, You need to use the variant_explode function. Here is the working code: WITH first_explode AS ( SELECT uv.rowData:Id AS Id, uv.rowData:Result:BodyType AS BodyType, uv.rowData:Result:ProdType AS ProdType, v.value AS result_se...

2 More Replies

by schluca • New Contributor II

12-02-2024 1:43:24 AM

275 Views
1 reply
0 kudos

Error Querying Shallow Clones: Couldn't Initialize File System for Path

Hi, We are offering data products through a central catalog for our users. To minimize data duplication and to display relationships between tables, we use shallow clones to provide access to the data. However, since implementing this approach, we occa...

Latest Reply

Takuya-Omi
Valued Contributor III

12-02-2024 5:46:11 PM

0 kudos

Hi @schluca, I’ve encountered an issue where an error occurred when trying to reference a table after deleting and recreating the source table for a Shallow Clone, and then performing the Shallow Clone again. As a solution, try deleting the destinati...

by Rahman823 • New Contributor II

11-13-2024 7:39:44 AM

519 Views
2 replies
1 kudos

Databricks table lineage

Hi, I wanted to know if it is possible to edit the lineage that we see in Databricks, like the one shown below. Can I edit this lineage graph, like add other ETL tools (at the start of the tables) that I have used to get data in AWS and then in databri...

Latest Reply

chm_user_1
New Contributor II

12-02-2024 1:48:47 PM

1 kudos

This will be extremely beneficial. We have certain use cases where we do not leverage Spark in our pipelines and lose the lineage. I would prefer to set an extra parameter when writing a table to specify the lineage.

1 More Replies

by vinitkhandelwal • New Contributor III

12-01-2024 6:33:10 PM

295 Views
2 replies
0 kudos

Error while running a notebook job using git repo (Gitlab)

I am trying to run a notebook job with a git repo hosted on GitLab. I have linked my GitLab account using a GitLab token, yet I am getting the following error when running the job. How do I resolve this?

Latest Reply

Alberto_Umana
Databricks Employee

12-02-2024 5:35:57 AM

0 kudos

Hi @vinitkhandelwal, Looks like the token could be missing required permissions for the operation. Please refer to: You can clone public remote repositories without Git credentials (a personal access token and a username). To modify a public remote r...

1 More Replies

by sharukh_lodhi • New Contributor III

08-19-2024 1:58:17 AM

1769 Views
4 replies
3 kudos

Azure IMDS is not accessible when selecting shared compute policy

Hi, Databricks community, I recently encountered an issue while using the 'azure.identity' Python library on a cluster set to the personal compute policy in Databricks. In this case, Databricks successfully returns the Azure Databricks managed user id...

azure IMDS

DefaultAzureCredential

Latest Reply

daisy08
New Contributor II

12-02-2024 7:02:25 AM

3 kudos

I'm having a similar problem. My aim is to invoke an Azure Data Factory pipeline from an Azure Databricks notebook. I created an Access Connector for Azure Databricks, to which I gave Data Factory Contributor permissions. Using these lines (Python): from azu...

3 More Replies

by jeremy98 • Contributor III

12-02-2024 3:59:16 AM

605 Views
1 reply
0 kudos

Resolved! How to pass the parameters to use with DABs

Hello Community, I want to pass parameters to my Databricks job through the DABs CLI. Specifically, I'd like to be able to run a job with parameters directly using the command: databricks bundle run -t prod --params [for example: table_name="client"...

Latest Reply

szymon_dybczak
Esteemed Contributor III

12-02-2024 7:00:45 AM

0 kudos

Hi @jeremy98, You can pass parameters using the CLI in the following way: databricks bundle run -t ENV --params Param1=Value1,Param2=Value2 Job_Name And in your yml file you should define the parameters in a way similar to the following: You can find more info in follo...
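For anyone scripting this, a minimal Python sketch that assembles the command shape shown in the reply. It only builds and prints the argument list; the job name `my_job`, the target `prod`, and the parameter value are placeholders, and the helper `bundle_run_command` is a hypothetical name, not part of the Databricks CLI.

```python
import shlex


def bundle_run_command(job_name: str, target: str, params: dict) -> list:
    """Build the argv list for `databricks bundle run` with job parameters."""
    # Parameters are joined as comma-separated key=value pairs, as in the reply.
    joined = ",".join(f"{key}={value}" for key, value in params.items())
    return ["databricks", "bundle", "run", "-t", target, "--params", joined, job_name]


# Placeholder job name and parameter, for illustration only.
cmd = bundle_run_command("my_job", "prod", {"table_name": "client"})
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would execute it without shell quoting issues, assuming the CLI is installed and authenticated.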
