Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

shivam-singh
by New Contributor
  • 1330 Views
  • 1 reply
  • 0 kudos

Databricks-Autoloader-S3-KMS

Hi, I am working on a requirement where I am using Autoloader in a DLT pipeline to ingest new files as they arrive. This flow is working fine. However, I am facing an issue when the source bucket is an S3 location, since the bucket has SSE-...

Latest Reply
kulkpd
Contributor
  • 0 kudos

Can you please paste the exact errors and check the following if it's related to KMS:
1. The IAM role policy and the KMS key policy should have allow permissions.
2. Did you use extraConfig while mounting the source S3 bucket? If you have used an IAM role...
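
For reference, a minimal sketch of the extraConfigs approach the reply alludes to, assuming the bucket uses SSE-KMS and the cluster's instance-profile IAM role already has KMS permissions; the bucket name, mount point, and key ARN are placeholders:

```python
# A hedged sketch, not the poster's actual config. Runs in a Databricks notebook
# (dbutils is provided there); property names come from hadoop-aws (s3a).
dbutils.fs.mount(
    source="s3a://my-source-bucket",
    mount_point="/mnt/source",
    extra_configs={
        "fs.s3a.server-side-encryption-algorithm": "SSE-KMS",
        "fs.s3a.server-side-encryption.key": "arn:aws:kms:<region>:<account>:key/<key-id>",
    },
)
```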

esalohs
by New Contributor III
  • 10137 Views
  • 6 replies
  • 4 kudos

Databricks Autoloader - list only new files in an s3 bucket/directory

I have an s3 bucket with a couple of subdirectories/partitions like s3a://Bucket/dir1/ and s3a://Bucket/dir2/. There are currently millions of files sitting in the bucket in the various subdirectories/partitions. I'm getting new data in near real t...

Latest Reply
kulkpd
Contributor
  • 4 kudos

The below options were used while performing spark.readStream:
.option('cloudFiles.format', 'json')
.option('cloudFiles.inferColumnTypes', 'true')
.option('cloudFiles.schemaEvolutionMode', 'rescue')
.option('cloudFiles.useNotifications', True)
.option('skipChange...
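
A hedged reconstruction of that snippet as a runnable stream (the excerpt truncates after 'skipChange', so that option is left out); file-notification mode avoids re-listing the millions of objects already in the bucket:

```python
# Reconstructed from the options quoted above; the load path reuses the example
# path from the question and is a placeholder.
df = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.inferColumnTypes", "true")
    .option("cloudFiles.schemaEvolutionMode", "rescue")
    .option("cloudFiles.useNotifications", "true")  # SQS/SNS-driven file discovery
    .load("s3a://Bucket/dir1/")
)
```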

5 More Replies
Muhammed
by New Contributor III
  • 28196 Views
  • 13 replies
  • 0 kudos

Filtering files for query

Hi Team, while writing my data to a datalake table I am getting 'Filtering files for query', and the write gets stuck there. How can I resolve this issue?

Latest Reply
kulkpd
Contributor
  • 0 kudos

My bad, I saw that somewhere in the screenshot but am not able to find it now. Which source are you using to load the data: a Delta table, AWS S3, or Azure Storage?
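
One swapped-in lever worth noting here (not suggested in the thread itself): the "Filtering files for query" phase of a MERGE or DELETE scans the target table's files, so compacting and clustering on the predicate column can shorten it. Table and column names below are placeholders:

```python
# A hedged, swapped-in suggestion: compact small files and co-locate data on the
# merge/delete key so fewer files are scanned during the filtering phase.
spark.sql("OPTIMIZE main.default.target_table ZORDER BY (id)")
```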

12 More Replies
geetha_venkates
by New Contributor II
  • 12591 Views
  • 7 replies
  • 2 kudos

Resolved! How do we add a certificate file in Databricks for a spark-submit type of job?

How do we add a certificate file in Databricks for a spark-submit type of job?

Latest Reply
nicozambelli
New Contributor II
  • 2 kudos

I have the same problem... when I worked with the hive_metastore in the past, I was able to use the file system and also use API certs. Now I'm using Unity Catalog and I can't upload a certificate. Can somebody help me?
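
For the Python side of this, one workaround sketch is to store the certificate in a Unity Catalog volume and append it to the cluster's certifi bundle at job start; the volume path is a placeholder, and the JVM/spark-submit side would still need its own keytool-based truststore set up separately (e.g. in an init script):

```python
# A hedged workaround sketch affecting Python HTTP libraries (requests/urllib3) only.
import shutil
import certifi

ca_path = "/Volumes/main/default/certs/my_ca.pem"  # hypothetical UC volume location
with open(certifi.where(), "ab") as bundle, open(ca_path, "rb") as ca:
    shutil.copyfileobj(ca, bundle)  # append the CA to the trusted bundle
```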

6 More Replies
RobinK
by Contributor
  • 17998 Views
  • 5 replies
  • 6 kudos

Resolved! How to set Python rootpath when deploying with DABs

We have structured our code according to the documentation (notebooks-best-practices). We use Jupyter notebooks and have outsourced logic to Python modules. Unfortunately, the example described in the documentation only works if you have checked out ...

Latest Reply
Corbin
Databricks Employee
  • 6 kudos

Hello Robin, you'll have to either use wheel files to package your libs and use those (see docs here) to make imports work out of the box. Otherwise, your entry point file needs to add the bundle root directory (or whatever the lib directory is) to ...
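
A minimal sketch of that second option, assuming a hypothetical bundle layout with modules under src/ and a .py file (not a notebook) as the job's entry point:

```python
# Assumed layout (placeholders):
#   <bundle_root>/
#     entry_point.py   (this file, referenced by the job task)
#     src/my_module.py
import os
import sys

bundle_root = os.path.dirname(os.path.abspath(__file__))  # __file__ works in .py tasks
sys.path.insert(0, os.path.join(bundle_root, "src"))

from my_module import my_function  # placeholder module under src/
```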

4 More Replies
Kumarashokjmu
by New Contributor II
  • 5532 Views
  • 4 replies
  • 0 kudos

Need to ingest millions of CSV files from AWS S3

I need to ingest millions of CSV files from an AWS S3 bucket. I am facing an S3 throttling issue, and besides that, the notebook process runs for 8+ hours and sometimes fails. Looking at cluster performance, it is utilized at 60%. I...

Latest Reply
kulkpd
Contributor
  • 0 kudos

If you want to load all the data at once, use Autoloader or a DLT pipeline with directory listing if the files are lexically ordered. Or, if you want to perform an incremental load, divide the load into two jobs, like historic data load vs. live data load: Live data...
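
A hedged sketch of the two-job split described above; bucket paths, format, and option values are assumptions, not the poster's actual configuration:

```python
# Live job: file-notification mode, so new files are discovered via SQS/SNS events
# instead of repeated S3 listings (the usual trigger for throttling).
live = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "csv")
    .option("cloudFiles.useNotifications", "true")
    .option("cloudFiles.maxFilesPerTrigger", "10000")  # bound each micro-batch
    .load("s3://my-bucket/data/")
)

# Historic job: plain directory listing over the existing backlog, run separately.
historic = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "csv")
    .option("cloudFiles.includeExistingFiles", "true")
    .load("s3://my-bucket/data/")
)
```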

3 More Replies
leelee3000
by Databricks Employee
  • 1220 Views
  • 0 replies
  • 0 kudos

Dynamic Filtering Criteria for Data Streaming

One of the potential uses for DLT is a scenario where I have a large input stream of data and need to create multiple smaller streams based on dynamic and adjustable filtering criteria. The challenge is to allow non-engineering individuals to adjust ...
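
One common pattern for this (an assumption on my part, not something stated in the post) is DLT metaprogramming: read the filter rules from a small config table that non-engineers can edit, then generate one table per rule. All table and column names below are hypothetical:

```python
import dlt
from pyspark.sql import functions as F

# Config table with columns: name, predicate (a SQL boolean expression per stream).
rules = spark.table("config.stream_filters").collect()

def make_filtered_table(name: str, predicate: str):
    # Defining the table inside a function avoids Python's late-binding loop pitfall.
    @dlt.table(name=f"filtered_{name}")
    def _t():
        return dlt.read_stream("raw_events").where(F.expr(predicate))

for r in rules:
    make_filtered_table(r["name"], r["predicate"])
```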

leelee3000
by Databricks Employee
  • 1734 Views
  • 0 replies
  • 0 kudos

Parameterizing DLT Jobs

I have observed the use of advanced configuration and creating a map as a way to parameterize notebooks, but these appear to be cluster-wide settings. Is there a recommended best practice for directly passing parameters to notebooks running on a DLT ...
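
For what it's worth, DLT pipelines also support a per-pipeline configuration map in the pipeline settings, which the pipeline's notebooks can read back with spark.conf.get(); the key name and default below are illustrative:

```python
# A hedged sketch: "mypipeline.input_path" would be set in the DLT pipeline's
# "configuration" settings, so it is scoped to this pipeline rather than the cluster.
input_path = spark.conf.get("mypipeline.input_path", "s3://my-bucket/landing/")
```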

Geoff
by New Contributor II
  • 1804 Views
  • 0 replies
  • 1 kudos

Bizarre Delta Tables pipeline error: ModuleNotFound

I received the following error when trying to import a function defined in a .py file into a .ipynb file. I would add code blocks, but the message keeps getting rejected for invalid HTML.
# test_lib.py (same directory, in a subfolder)
def square(x):
    ret...
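
A hedged workaround sketch for the ModuleNotFoundError, assuming test_lib.py sits in a subfolder next to the notebook (the folder name is a placeholder): put that folder on sys.path before importing.

```python
import os
import sys

sys.path.append(os.path.join(os.getcwd(), "subfolder"))  # "subfolder" is hypothetical
from test_lib import square  # the function from the post

print(square(4))  # 16, if the import now resolves
```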

pankz-104
by New Contributor
  • 2011 Views
  • 1 reply
  • 0 kudos

how to read deleted files in adls

We have soft delete enabled in ADLS for 3 days, and we have manually deleted some checkpoint files, approx. 3 TB in total. Each file is just a couple of bytes, like 30 B or 40 B. The deleted file size is increasing day by day, even after a couple of days. Suppose ...
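
To inspect what soft delete is actually still retaining, a hedged sketch with the azure-storage-blob SDK (the connection string and container name are placeholders) might look like:

```python
from azure.storage.blob import ContainerClient

container = ContainerClient.from_connection_string("<connection-string>", "my-container")
deleted_bytes = 0
for blob in container.list_blobs(include=["deleted"]):
    if blob.deleted:  # only blobs currently in the soft-deleted state
        deleted_bytes += blob.size or 0
print(f"soft-deleted bytes still retained: {deleted_bytes}")
```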

Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

Hi @pankz-104, just a friendly follow-up. Did you have time to test Kaniz's recommendations? Do you still have issues? Please let us know.

Chris_sh
by New Contributor II
  • 1238 Views
  • 1 reply
  • 0 kudos

DLT Missing Select tables button or Enhancement Request?

Currently, when a Delta Live Table fails due to an error, the option to select specific tables to run a full refresh is removed. This seems like an error. A full refresh can fix an error that might be caused, and you should always be able to select to d...
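
As a possible workaround while the button is hidden (an assumption, not a confirmed fix), the pipeline updates REST API accepts a full_refresh_selection list; the workspace host, pipeline ID, token, and table name below are placeholders:

```python
import requests

resp = requests.post(
    "https://<workspace-host>/api/2.0/pipelines/<pipeline-id>/updates",
    headers={"Authorization": "Bearer <personal-access-token>"},
    json={"full_refresh_selection": ["my_failed_table"]},  # hypothetical table name
)
resp.raise_for_status()
print(resp.json())  # contains the update_id of the triggered refresh
```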

Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

Hi @Chris_sh, which DLT channel are you using? 

Rajaniesh
by New Contributor III
  • 3619 Views
  • 2 replies
  • 1 kudos

URGENT HELP NEEDED: Python functions deployed in the cluster throwing the error

Hi, I have created a Python wheel with the following code, and the package name is rule_engine:
"""The entry point of the Python Wheel"""
import sys
from pyspark.sql.functions import expr, col
def get_rules(tag):
    """ loads data quality rules from a table ...

Latest Reply
jose_gonzalez
Databricks Employee
  • 1 kudos

You can find more details and examples here https://docs.databricks.com/en/workflows/jobs/how-to/use-python-wheels-in-workflows.html#use-a-python-wheel-in-a-databricks-job
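
To complement the linked docs, a minimal setup.py sketch for packaging a wheel like rule_engine, following the pattern in that documentation; the version, module path, and entry-point name are placeholders:

```python
from setuptools import setup, find_packages

setup(
    name="rule_engine",
    version="0.0.1",
    packages=find_packages(),
    install_requires=["setuptools"],
    # The group/name pair here is what the wheel task's package_name/entry_point
    # settings refer to (values are illustrative).
    entry_points={"group_1": ["run=rule_engine.__main__:main"]},
)
```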

1 More Reply
dcc
by New Contributor
  • 8551 Views
  • 1 reply
  • 0 kudos

DBT Jobs || API call returns "Internal Error"

Hey there, I am currently using the Databricks API to trigger a specific dbt job. For this, I am calling the API from a Web Activity in Azure Data Factory, sending the token in the headers, and in the body I am sending the Job ID and the necessary vars I ...

Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

Could you please share the driver logs? It will help us narrow down the issue.
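
For comparison while debugging, the same trigger can be reproduced outside ADF with a direct call to the Jobs 2.1 run-now endpoint; the host, job ID, token, and parameters below are placeholders:

```python
import requests

resp = requests.post(
    "https://<workspace-host>/api/2.1/jobs/run-now",
    headers={"Authorization": "Bearer <personal-access-token>"},
    json={"job_id": 123, "job_parameters": {"env": "dev"}},  # hypothetical vars
)
print(resp.status_code, resp.text)  # surfaces the full "Internal Error" payload
```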

chari
by Contributor
  • 16869 Views
  • 4 replies
  • 2 kudos

Resolved! Connect to data in one drive to Azure Databricks

Hello, a colleague of mine previously built a data pipeline connecting to data available on SharePoint (OneDrive), coded in Python in a Jupyter notebook. Now it's my job to transfer the code to Azure Databricks, and I am unable to connect/download thi...

Latest Reply
gabsylvain
Databricks Employee
  • 2 kudos

@chari You can also ingest both SharePoint and OneDrive data directly into Databricks using Partner Connect. You can refer to the documentation below for more information: Connect to Fivetran using Partner Connect, Fivetran Sharepoint Connector Documenta...
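
If a direct, Fivetran-free route is preferred, one hedged sketch is to call the Microsoft Graph API from the Databricks notebook with an Azure AD app token; the token acquisition, drive ID, and file path are placeholders/assumptions:

```python
import requests

token = "<azure-ad-access-token>"  # e.g. via an msal client-credentials flow (assumption)
url = ("https://graph.microsoft.com/v1.0/drives/<drive-id>"
       "/root:/Shared Documents/data.xlsx:/content")
resp = requests.get(url, headers={"Authorization": f"Bearer {token}"})
resp.raise_for_status()
with open("/tmp/data.xlsx", "wb") as f:
    f.write(resp.content)  # file is now local to the Databricks driver
```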

3 More Replies
rvo1994
by New Contributor
  • 7794 Views
  • 0 replies
  • 0 kudos

Performance issue with spatial reference system conversions

Hi, I am facing a performance issue with spatial reference system conversions. My Delta table has approximately 10 GB / 46 files / 160M records and gets +/- 5M records every week. After ingestion, I need to convert points (columns GE_XY_XCOR and GE_XY_YCO...
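
One approach worth sketching (an assumption; the post does not say how the conversion is currently done) is a vectorized pandas UDF around pyproj, so the Transformer is built once per batch instead of per row; the source/target EPSG codes and the truncated Y column name are placeholders:

```python
import pandas as pd
from pyspark.sql.functions import col, pandas_udf
from pyspark.sql.types import DoubleType, StructField, StructType

out_schema = StructType([StructField("lon", DoubleType()), StructField("lat", DoubleType())])

@pandas_udf(out_schema)
def to_wgs84(x: pd.Series, y: pd.Series) -> pd.DataFrame:
    from pyproj import Transformer  # imported on the workers
    t = Transformer.from_crs("EPSG:31370", "EPSG:4326", always_xy=True)  # assumed CRS pair
    lon, lat = t.transform(x.to_numpy(), y.to_numpy())
    return pd.DataFrame({"lon": lon, "lat": lat})

# Usage (the Y column name is assumed, since the excerpt truncates it):
# df = df.withColumn("wgs84", to_wgs84(col("GE_XY_XCOR"), col("GE_XY_YCOR")))
```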

