Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Graham
by New Contributor III
  • 7188 Views
  • 5 replies
  • 2 kudos

"MERGE" always slower than "CREATE OR REPLACE"

Overview: To update our Data Warehouse tables, we have tried two methods: "CREATE OR REPLACE" and "MERGE". With every query we've tried, "MERGE" is slower. My question is this: Has anyone successfully gotten a "MERGE" to perform faster than a "CREATE OR...

Latest Reply
Manisha_Jena
Databricks Employee
  • 2 kudos

Hi @Graham, can you please try Low Shuffle Merge (LSM) and see if it helps? LSM is a new MERGE algorithm that aims to maintain the existing data organization (including z-order clustering) for unmodified data, while simultaneously improving performan...
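A minimal sketch of what trying LSM might look like, assuming a Databricks notebook (where spark is predefined) and an older runtime where LSM is not on by default; the config name should be verified against the LSM docs, and the target/updates table names are placeholders:

    # Hedged sketch: enable Low Shuffle Merge (config name assumed; LSM is the
    # default on newer Databricks Runtime versions), then run an ordinary MERGE.
    # "target", "updates" and "id" are placeholder names, not from the thread.
    spark.conf.set("spark.databricks.delta.merge.enableLowShuffle", "true")

    spark.sql("""
        MERGE INTO target AS t
        USING updates AS u
        ON t.id = u.id
        WHEN MATCHED THEN UPDATE SET *
        WHEN NOT MATCHED THEN INSERT *
    """)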

4 More Replies
peterwishart
by New Contributor III
  • 4400 Views
  • 4 replies
  • 0 kudos

Resolved! Programmatically updating the “run_as_user_name” parameter for jobs

I am trying to write a process that will programmatically update the “run_as_user_name” parameter for all jobs in an Azure Databricks workspace, using PowerShell to interact with the Jobs API. I have been trying to do this with a test job without suc...

Latest Reply
baubleglue
New Contributor II
  • 0 kudos

The solution you've submitted is a solution for a different topic (permission to run the job; the job still runs as the user in the run_as_user_name field). Here is an example of changing "run_as_user_name". Docs: https://docs.databricks.com/api/azure/workspace/job...
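A hedged sketch of what such an update could look like against the Jobs API 2.1 (the endpoint and payload shape should be checked against the current Jobs API docs; the host, token, job ID and user name below are placeholders):

    # Hedged sketch (not from the thread): change a job's run_as identity via the
    # Jobs API 2.1 update endpoint. Host, token, job_id and user_name are
    # placeholders; verify the payload shape against the Jobs API documentation.
    import requests

    host = "https://adb-1234567890123456.7.azuredatabricks.net"  # placeholder workspace URL
    token = "dapiXXXXXXXXXXXXXXXX"                                # placeholder access token

    resp = requests.post(
        f"{host}/api/2.1/jobs/update",
        headers={"Authorization": f"Bearer {token}"},
        json={
            "job_id": 123,  # placeholder job ID
            "new_settings": {"run_as": {"user_name": "someone@example.com"}},
        },
    )
    resp.raise_for_status()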

3 More Replies
ckwan48
by New Contributor III
  • 2600 Views
  • 3 replies
  • 1 kudos

Create a Dockerfile from Cluster

Is there a way to create a Dockerfile from Workspace A's cluster configurations and deploy that on a different cluster in Workspace B?

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Kevin Kim​, hope everything is going great. Just wanted to check in to see if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so we ...

2 More Replies
thushar
by Contributor
  • 2635 Views
  • 6 replies
  • 0 kudos

'GeneratedAlwaysAs' along with dataframe.write

Is it possible to use a calculated column definition (like generatedAlwaysAs in a Delta table) while writing the data frame as a Delta file, e.g. df.write.format("delta")? Are there any options with the dataframe.write method to achieve this...

Latest Reply
pvignesh92
Honored Contributor
  • 0 kudos

Hi @Thushar R​, this option is not part of the DataFrame write API, as the GeneratedAlwaysAs feature only applies to the Delta format and df.write is a common API that handles writes for all formats. If you want to achieve this programmatically, you can still use...
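For illustration, a hedged sketch of the kind of approach the reply hints at: define the Delta table with a generated column via the Delta Lake Python builder API, then write the DataFrame into it. It assumes a Databricks/Delta environment where spark and df already exist; the table and column names are made up:

    # Hedged sketch: create the table with a generated column first, then append
    # the DataFrame with the ordinary df.write API. Names are illustrative only.
    from delta.tables import DeltaTable
    from pyspark.sql import types as T

    (DeltaTable.createIfNotExists(spark)
        .tableName("events")
        .addColumn("event_time", T.TimestampType())
        .addColumn("event_date", T.DateType(),
                   generatedAlwaysAs="CAST(event_time AS DATE)")
        .execute())

    df.write.format("delta").mode("append").saveAsTable("events")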

5 More Replies
Michal_L
by New Contributor
  • 2339 Views
  • 1 replies
  • 0 kudos

How can I create a visualization with grouped bars that are also stacked?

I wish to create a visualization that combines grouped bars and also has those bars stacked. Attached is a sketch of the final result I am interested in. I am also attaching my SQL because I'm not sure if I should "group by" in the query or in the visu...

[attached image: Screenshot_20221208_040021]
Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III
  • 0 kudos

Try using the ASC or DESC keyword; maybe that will work.

mickniz
by Contributor
  • 4484 Views
  • 3 replies
  • 6 kudos

Unable to create materialized views in Unity-enabled catalogues.

Hi Team, I was going through one of the videos on Databricks SQL Serverless and it says there is materialized view support, i.e. we can create materialized views. I tried the same on my SQL Warehouse cluster and it gives the error below:

[attached image of the error]
Latest Reply
Felix
Databricks Employee
  • 6 kudos

Materialized views are in private preview right now, AFAIK. Please talk to your account or customer success team at Databricks in order to sign up and enable it for your workspace. Thanks!

2 More Replies
JoeWMP
by New Contributor III
  • 1452 Views
  • 1 replies
  • 7 kudos

All-purpose compute clusters that are attached to a pool are no longer able to switch to a different pool/change to a non-pool worker/driver.

Would like to know if anyone else is experiencing this - we're seeing this across 5+ different Databricks workspaces in both AWS and Azure. Reproduction: Create an all-purpose compute cluster, attach it to an existing pool, save and start the cluster. Edit clus...

Latest Reply
JoeWMP
New Contributor III
  • 7 kudos

We're seeing the same behavior when trying to change the pool on an all-purpose cluster using Terraform and the Databricks Labs Terraform provider as well. The Terraform apply will go through and say the cluster was updated to the new pool id, but t...

JakeP
by New Contributor III
  • 2138 Views
  • 3 replies
  • 1 kudos

Resolved! Is there a way to create a path under /Repos via API?

Trying to use Repos API to automate creation and updates to repos under paths not specific to a user, i.e. /Repos/Admin/<repo-name>. It seems that creating a repo via POST to /api/2.0/repos will fail if you don't include a path, and will also fail i...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 1 kudos

Try it through the Workspace API (https://docs.databricks.com/dev-tools/api/latest/workspace.html#mkdirs): curl --netrc --request POST \ https://dbc-a1b2345c-d6e7.cloud.databricks.com/api/2.0/workspace/mkdirs \ --header 'Accept: application/json' \ --dat...
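A hedged Python equivalent of that (truncated) curl call; the workspace URL is the placeholder host from the reply, and the token and target folder are placeholders too:

    # Hedged sketch: create a folder under /Repos via the Workspace API mkdirs
    # endpoint, before creating the repo there with the Repos API.
    import requests

    host = "https://dbc-a1b2345c-d6e7.cloud.databricks.com"  # placeholder host from the reply
    token = "dapiXXXXXXXXXXXXXXXX"                           # placeholder access token

    resp = requests.post(
        f"{host}/api/2.0/workspace/mkdirs",
        headers={"Authorization": f"Bearer {token}", "Accept": "application/json"},
        json={"path": "/Repos/Admin"},  # placeholder folder to create under /Repos
    )
    resp.raise_for_status()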

2 More Replies
SailajaB
by Valued Contributor III
  • 7983 Views
  • 12 replies
  • 4 kudos

Resolved! JSON validation fails after writing a PySpark dataframe to JSON format

Hi, we have to convert a transformed dataframe to JSON format, so we used write with the json format on top of the final dataframe to convert it to JSON. But when we validate the output JSON, it's not in proper JSON format. Could you please provide your suggestio...

Latest Reply
Anonymous
Not applicable
  • 4 kudos

@Sailaja B​ - Does @Aman Sehgal​'s most recent answer help solve the problem? If it does, would you be happy to mark their answer as best?

11 More Replies
SailajaB
by Valued Contributor III
  • 3730 Views
  • 4 replies
  • 6 kudos

Resolved! How to create a nested (unflattened) JSON from a flattened JSON

Hi, is there any function in PySpark which can convert a flattened JSON to a nested JSON? For example, if the flattened form has an attribute like a_b_c: 23, then the unflattened form should be {"a":{"b":{"c":23}}}. Thank you

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 6 kudos

As @Chuck Connell​ said, can you share more of your source JSON, as that example is not JSON? Additionally, flattening usually means changing something like {"status": {"A": 1, "B": 2}} to {"status.A": 1, "status.B": 2}, which can be done easily with Spark da...
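As a rough illustration of the unflattening direction the question asks about, a plain-Python sketch that rebuilds nested dictionaries from underscore-separated keys (it assumes "_" is always a level separator, which may not hold for real column names):

    # Hedged sketch: turn {"a_b_c": 23} into {"a": {"b": {"c": 23}}}.
    def unflatten(flat: dict, sep: str = "_") -> dict:
        nested: dict = {}
        for key, value in flat.items():
            parts = key.split(sep)
            node = nested
            for part in parts[:-1]:
                node = node.setdefault(part, {})
            node[parts[-1]] = value
        return nested

    print(unflatten({"a_b_c": 23}))  # {'a': {'b': {'c': 23}}}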

3 More Replies
William_Scardua
by Valued Contributor
  • 9297 Views
  • 6 replies
  • 3 kudos

Resolved! How do you create a Sandbox in your data environment ?

Hi guys, how do you create a sandbox in your data environment? Any ideas? Azure/AWS + Data Lake + Databricks

Latest Reply
missyT
New Contributor III
  • 3 kudos

In a sandbox environment, you will find the Designer enabled. You can activate Designer by selecting the design icon on a page, or by choosing the Design menu item in the Settings menu.

5 More Replies
ashu208
by New Contributor
  • 1779 Views
  • 4 replies
  • 0 kudos

I am not able to create a cluster

Hi, I am new to the Databricks platform. A few weeks ago I created a Community Edition account and it was working perfectly until 2 days ago; now I cannot create a cluster anymore. After a few minutes it times out whenever I try to create a new cluster...

Latest Reply
Dileep_Vidyadar
New Contributor III
  • 0 kudos

Hi @Ashwinkumar Jayakumar​ and @Prabakar Ammeappin​, I have been facing the same issue for 3-4 days. Is there something wrong with Community Edition right now, or is my account facing some issue?

3 More Replies
Nick_Hughes
by New Contributor III
  • 2037 Views
  • 3 replies
  • 3 kudos

Is there an alerting API please?

Is there an alerting API so that alerts can be source-controlled and automated, please? https://docs.databricks.com/sql/user/alerts/index.html

Latest Reply
Dan_Z
Databricks Employee
  • 3 kudos

Hello @Nick Hughes​, as of today we do not expose or document the API for these features. I think it would be a useful feature, so I created an internal feature request for it (DB-I-4289). If you (or any future readers) want more information on this f...

2 More Replies
aimas
by New Contributor III
  • 7671 Views
  • 8 replies
  • 5 kudos

Resolved! error creating tables using UI

Hi, I try to create a table using the UI, but I keep getting the error "error creating table <table name> create a cluster first" even when I already have a cluster running. What is the problem?

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 5 kudos

Be sure that a cluster is selected (the arrow in the database panel) and that there is at least the Default database.

7 More Replies