Data Engineering

Forum Posts

by Graham • New Contributor III

09-16-2022 10:41:51 AM

8627 Views
5 replies
3 kudos

"MERGE" always slower than "CREATE OR REPLACE"

Overview: To update our Data Warehouse tables, we have tried two methods: "CREATE OR REPLACE" and "MERGE". With every query we've tried, "MERGE" is slower. My question is this: Has anyone successfully gotten a "MERGE" to perform faster than a "CREATE OR...

Latest Reply

Manisha_Jena
Databricks Employee

11-02-2023 2:18:28 AM

3 kudos

Hi @Graham, can you please try Low Shuffle Merge [LSM] and see if it helps? LSM is a new MERGE algorithm that aims to maintain the existing data organization (including z-order clustering) for unmodified data, while simultaneously improving performan...

by peterwishart • New Contributor III

09-19-2022 12:23:25 PM

5316 Views
4 replies
0 kudos

Resolved! Programmatically updating the “run_as_user_name” parameter for jobs

I am trying to write a process that will programmatically update the “run_as_user_name” parameter for all jobs in an Azure Databricks workspace, using PowerShell to interact with the Jobs API. I have been trying to do this with a test job without suc...

Latest Reply

baubleglue
New Contributor II

10-04-2023 7:34:24 AM

0 kudos

The solution you've submitted addresses a different topic (permission to run the job; the job still runs as the user in the run_as_user_name field). Here is an example of changing "run_as_user_name". Docs: https://docs.databricks.com/api/azure/workspace/job...

by ckwan48 • New Contributor III

02-20-2023 7:00:46 PM

2820 Views
3 replies
1 kudos

Create a Dockerfile from Cluster

Is there a way to create a Dockerfile from Workspace A's cluster configurations and deploy that on a different cluster in Workspace B?

Latest Reply

Anonymous
Not applicable

04-21-2023 11:17:34 PM

1 kudos

Hi @Kevin Kim, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so we ...

by thushar • Contributor

01-23-2023 12:41:21 AM

3341 Views
6 replies
0 kudos

'GeneratedAlwaysAs' along with dataframe.write

Is it possible to use a calculated column definition (like generatedAlwaysAs in a Delta table) while writing the DataFrame as a Delta file, e.g. with df.write.format("delta")? Are there any options in the dataframe.write method to achieve this...

Latest Reply

pvignesh92
Honored Contributor

03-09-2023 6:27:58 AM

0 kudos

Hi @Thushar R, this option is not part of the DataFrame write API, as the GeneratedAlwaysAs feature is only applicable to the Delta format and df.write is a common API that handles writes for all formats. If you want to achieve this programmatically, you can still use...

by Michal_L • New Contributor

12-08-2022 6:01:51 AM

2912 Views
1 reply
0 kudos

How can I create a visualization with grouped bars that are also stacked?

I wish to create a visualization that combines grouped bars and also has those bars stacked. Attached is a sketch of the final result I am interested in. I am also attaching my SQL because I'm not sure if I should "group by" in the query or in the visu...

Latest Reply

Aviral-Bhardwaj
Esteemed Contributor III

12-20-2022 6:19:00 AM

0 kudos

Try using the asc or desc keyword; maybe it will work.

by mickniz • Contributor

11-26-2022 5:49:25 AM

5032 Views
3 replies
6 kudos

Unable to create materialized view in Unity enabled Catalogues.

Hi Team, I was going through one of the videos on Databricks SQL Serverless, and it says there is materialized view support and that we can create materialized views. I tried the same on my SQL Warehouse cluster, and it gives the error below:

Latest Reply

Felix
Databricks Employee

11-28-2022 2:35:52 AM

6 kudos

Materialized views are in private preview right now, afaik. Please talk to your account or customer success team at Databricks in order to sign up and enable it for your workspace. Thanks!

by yalun • New Contributor III

11-24-2022 7:10:22 AM

866 Views
0 replies
4 kudos

I cannot create a workspace, help me please.

They are greyed out and I cannot click them. If I hover my cursor over them, no info appears. What should I do?

by JoeWMP • New Contributor III

11-11-2022 12:39:38 PM

1643 Views
1 reply
7 kudos

All-purpose compute clusters that are attached to a pool are no longer able to switch to a different pool/change to a non-pool worker/driver.

Would like to know if anyone else is experiencing this. We're seeing it across 5+ different Databricks workspaces in both AWS and Azure. Reproduction: Create an all-purpose compute cluster, attach it to an existing pool, save and start the cluster. Edit clus...

Latest Reply

JoeWMP
New Contributor III

11-11-2022 12:52:16 PM

7 kudos

We're also seeing the same behavior when trying to change the pool on an all-purpose cluster using Terraform and Databricks Labs Terraform provider as well. The Terraform apply will go through and say the cluster was updated to the new pool id, but t...

by JakeP • New Contributor III

04-20-2022 11:19:05 AM

2399 Views
3 replies
1 kudos

Resolved! Is there a way to create a path under /Repos via API?

Trying to use Repos API to automate creation and updates to repos under paths not specific to a user, i.e. /Repos/Admin/<repo-name>. It seems that creating a repo via POST to /api/2.0/repos will fail if you don't include a path, and will also fail i...

Latest Reply

Hubert-Dudek
Esteemed Contributor III

04-20-2022 11:23:32 AM

1 kudos

https://docs.databricks.com/dev-tools/api/latest/workspace.html#mkdirs Try the Workspace API: curl --netrc --request POST \ https://dbc-a1b2345c-d6e7.cloud.databricks.com/api/2.0/workspace/mkdirs \ --header 'Accept: application/json' \ --dat...
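For readers who prefer a script to curl, here is a minimal Python sketch of the same Workspace API call. It only builds the request and does not send it. The bearer-token header, the placeholder token, and the "/Repos/Admin" path are assumptions for illustration (the curl example above authenticates with --netrc instead), and whether mkdirs accepts paths under /Repos is taken from this reply, not verified here.

```python
import json
import urllib.request


def build_mkdirs_request(host: str, token: str, path: str) -> urllib.request.Request:
    """Build (but do not send) a POST request to /api/2.0/workspace/mkdirs."""
    body = json.dumps({"path": path}).encode("utf-8")
    return urllib.request.Request(
        url=f"{host.rstrip('/')}/api/2.0/workspace/mkdirs",
        data=body,
        method="POST",
        headers={
            # Assumption: personal access token auth instead of --netrc.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical values; replace with your own workspace URL, token, and path.
req = build_mkdirs_request(
    "https://dbc-a1b2345c-d6e7.cloud.databricks.com",
    "<personal-access-token>",
    "/Repos/Admin",
)
# To actually send the request: urllib.request.urlopen(req)
```

Once the parent folder exists, the repo itself would still be created with a separate POST to /api/2.0/repos, as described in the question.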

by SailajaB • Valued Contributor III

02-09-2022 10:39:24 PM

8984 Views
12 replies
4 kudos

Resolved! JSON validation fails after writing a PySpark DataFrame to JSON format

Hi, we have to convert a transformed DataFrame to JSON format, so we used write with the JSON format on the final DataFrame. But when we validate the output JSON, it is not in proper JSON format. Could you please provide your suggestio...
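One common cause of this symptom, assuming the files were produced by Spark's default JSON writer: each output file is line-delimited JSON (one object per line), which a validator expecting a single JSON document will reject. The excerpt above is truncated, so this may not be the poster's actual issue. The plain-Python sketch below uses made-up sample records to show the difference and one way to turn the lines into a single JSON array.

```python
import json

# Made-up sample mimicking line-delimited output such as df.write.json(...) produces:
# one JSON object per line, no enclosing array, no commas between records.
spark_output = '{"id": 1, "name": "a"}\n{"id": 2, "name": "b"}\n'


def parse_json_lines(text: str) -> list:
    """Parse line-delimited JSON into a list of Python objects."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def is_single_json_document(text: str) -> bool:
    """Return True if the whole text parses as one JSON document."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


records = parse_json_lines(spark_output)
as_array = json.dumps(records)  # a single JSON array that a strict validator accepts
```

Here the raw text fails the single-document check, while the re-serialized array passes it.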

Latest Reply

Anonymous
Not applicable

03-02-2022 9:01:40 AM

4 kudos

@Sailaja B - Does @Aman Sehgal's most recent answer help solve the problem? If it does, would you be happy to mark their answer as best?

by SailajaB • Valued Contributor III

01-19-2022 5:29:16 AM

4095 Views
4 replies
6 kudos

Resolved! How to create nested (unflattened) JSON from flattened JSON

Hi, is there any function in PySpark that can convert flattened JSON to nested JSON? Example: if the flattened attribute is a_b_c : 23, then the unflattened form should be {"a":{"b":{"c":23}}}. Thank you.

Latest Reply

Hubert-Dudek
Esteemed Contributor III

01-20-2022 2:44:30 AM

6 kudos

As @Chuck Connell said, can you share more of your source JSON, as that example is not JSON? Additionally, flattening usually means changing something like {"status": {"A": 1,"B": 2}} to {"status.A": 1, "status.B": 2}, which can be done easily with spark da...
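To make the requested transformation concrete, here is a minimal plain-Python sketch of unflattening a dict by splitting its keys on a separator. It is not a built-in PySpark function; the helper name `unflatten` and the assumption that the separator never appears inside a real field name are illustrative only.

```python
def unflatten(flat: dict, sep: str = "_") -> dict:
    """Turn {"a_b_c": 23} into {"a": {"b": {"c": 23}}} by splitting keys on sep.

    Assumes no key is both a leaf and a prefix of another key
    (e.g. "a" together with "a_b"); such input is not handled.
    """
    nested: dict = {}
    for key, value in flat.items():
        parts = key.split(sep)
        node = nested
        # Walk or create the intermediate dicts for all but the last key part.
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return nested


example = unflatten({"a_b_c": 23})
dotted = unflatten({"status.A": 1, "status.B": 2}, sep=".")
```

The second call reverses the dotted form mentioned in the reply above, rebuilding {"status": {"A": 1, "B": 2}}.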

by William_Scardua • Valued Contributor

11-29-2021 7:15:15 AM

10611 Views
6 replies
3 kudos

Resolved! How do you create a Sandbox in your data environment?

Hi guys, how do you create a sandbox in your data environment? Any ideas? Azure/AWS + Data Lake + Databricks

Latest Reply

missyT
New Contributor III

11-29-2021 8:25:40 PM

3 kudos

In a sandbox environment, you will find the Designer enabled. You can activate Designer by selecting the design icon on a page, or by choosing the Design menu item in the Settings menu.

by ashu208 • New Contributor

10-14-2021 7:12:45 AM

1986 Views
4 replies
0 kudos

I am not able to create a cluster

Hi, I am new to the Databricks platform. A few weeks ago I created a Community Edition account, and it was working perfectly until 2 days ago. Now I cannot create a cluster anymore; it times out after a few minutes whenever I try to create a new cluster...

Latest Reply

Dileep_Vidyadar
New Contributor III

11-23-2021 9:47:50 AM

0 kudos

Hi @Ashwinkumar Jayakumar and @Prabakar Ammeappin, I have been facing the same issue for 3-4 days. Is there something wrong with Community Edition right now, or is my account facing some issues?

by Nick_Hughes • New Contributor III

11-02-2021 1:34:12 AM

2259 Views
3 replies
3 kudos

Is there an alerting API please?

Is there an alerting API so that alerts can be source-controlled and automated, please? https://docs.databricks.com/sql/user/alerts/index.html

Latest Reply

Dan_Z
Databricks Employee

11-02-2021 10:30:42 AM

3 kudos

Hello @Nick Hughes , as of today we do not expose or document the API for these features. I think it will be a useful feature so I created an internal feature request for it (DB-I-4289). If you (or any future readers) want more information on this f...

by aimas • New Contributor III

10-12-2021 4:45:24 PM

8501 Views
8 replies
5 kudos

Resolved! error creating tables using UI

Hi, I am trying to create a table using the UI, but I keep getting the error "error creating table <table name> create a cluster first" even when I have a cluster already running. What is the problem?

Latest Reply

Hubert-Dudek
Esteemed Contributor III

10-13-2021 2:13:29 AM

5 kudos

Be sure that a cluster is selected (arrow in database) and that there is at least a Default database.
