Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

aschiff
by Contributor II
  • 47993 Views
  • 24 replies
  • 4 kudos

Resolved! Extracting data from a multi-layered JSON object

I have a table in Databricks called owner_final_delta with a column called contacts that holds data with this structure: array<struct<address:struct<apartment:string,city:string,house:string,poBox:string,sources:array<string>,state:string,street:strin...

Latest Reply
Dooley
Databricks Employee
  • 4 kudos

Have you tried using the explode function on the column that holds the array?
df.select(explode(df.emailId).alias("email")).show()
Also, if you are a SQL lover, you can instead use the Databricks syntax for querying JSON seen here.
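To illustrate what explode does to an array column, here is a minimal plain-Python sketch that runs without Spark. The sample rows and the `owner_id` field are made up for illustration; only `emailId` and the `email` alias come from the reply. In a notebook, the `df.select(explode(...))` call from the reply would do the same thing on the real column.

```python
# Plain-Python illustration of explode semantics: one output row per array element.
# The sample rows are hypothetical; they only mimic an array-typed column.
rows = [
    {"owner_id": 1, "emailId": ["a@example.com", "b@example.com"]},
    {"owner_id": 2, "emailId": ["c@example.com"]},
    {"owner_id": 3, "emailId": []},  # an empty array yields no rows, as with explode
]


def explode_rows(rows, array_field, alias):
    """Yield one dict per element of row[array_field], keeping the other fields."""
    for row in rows:
        rest = {k: v for k, v in row.items() if k != array_field}
        for element in row[array_field]:
            yield {**rest, alias: element}


exploded = list(explode_rows(rows, "emailId", "email"))
for r in exploded:
    print(r)
```

For a multi-layered structure, the same step would probably need to be repeated once per nested array.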

23 More Replies
StackP
by New Contributor
  • 3655 Views
  • 1 reply
  • 0 kudos

How to add a unique consecutive ID to a Delta Lake table

In Databricks I have an existing Delta table to which I want to add one more column, Id, so that each row has a unique, consecutive ID number (like a primary key in SQL). So far I have tried converting the Delta table to a PySpark DataFrame and...

Latest Reply
Sandeep
Databricks Employee
  • 0 kudos

How about defining an identity column as below?
GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY [ ( [ START WITH start ] [ INCREMENT BY step ] ) ]
https://docs.databricks.com/sql/language-manual/sql-ref-syntax-ddl-create-table-using.html#parameters
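As a rough sketch of how that clause fits into a full statement, the snippet below builds a CREATE TABLE string in Python. The table name `owner_with_id` and the `name` column are hypothetical placeholders, and the `START WITH 1 INCREMENT BY 1` values are just one possible choice. As far as I understand the docs, identity values are unique but not guaranteed to be gap-free, so this may not fully satisfy the "consecutive" requirement.

```python
# Builds a CREATE TABLE statement that uses the identity-column clause quoted above.
# Table and column names are hypothetical placeholders.
def identity_table_ddl(table, start=1, step=1):
    return (
        f"CREATE TABLE {table} (\n"
        f"  id BIGINT GENERATED ALWAYS AS IDENTITY (START WITH {start} INCREMENT BY {step}),\n"
        f"  name STRING\n"
        f") USING DELTA"
    )


ddl = identity_table_ddl("owner_with_id")
print(ddl)
# In a Databricks notebook this string could then be run with spark.sql(ddl).
```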

BradSheridan
by Valued Contributor
  • 2292 Views
  • 2 replies
  • 1 kudos

Resolved! Add an Instance Profile to a DLT job cluster

@Tomasz Bacewicz I've got another, related question for you about the job cluster that is spun up for DLT jobs. Adding the JSON strings for our required E2 tags worked like a charm, but now I need to attach an existing Instance Profile since I'm tr...

Latest Reply
tomasz
Databricks Employee
  • 1 kudos

@Brad Sheridan To do that, you have to add the aws_attributes tag within a cluster configuration, and there you have the ability to add an instance_profile_arn like so: "clusters": [ { "label": "default", "aws_attributes": { ...
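Since the excerpt cuts off, here is a hedged sketch of what the complete fragment might look like, written as a Python dict and dumped to JSON. The keys `clusters`, `label`, `aws_attributes`, and `instance_profile_arn` come from the reply; the ARN value is a placeholder, not a real instance profile, and any other settings the original reply contained are not reproduced here.

```python
import json

# Fragment of DLT pipeline settings, following the shape quoted in the reply.
# The ARN below is a placeholder to be replaced with your own instance profile.
pipeline_settings = {
    "clusters": [
        {
            "label": "default",
            "aws_attributes": {
                "instance_profile_arn": "arn:aws:iam::<account-id>:instance-profile/<profile-name>"
            },
        }
    ]
}

settings_json = json.dumps(pipeline_settings, indent=2)
print(settings_json)
```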

1 More Replies
junaid
by New Contributor II
  • 8632 Views
  • 0 replies
  • 1 kudos

We are seeing a "BOOTSTRAP_TIMEOUT" issue in a new workspace.

When attempting to deploy/start a Databricks cluster on AWS through the UI, the following error consistently occurs: Bootstrap Timeout:[id: InstanceId(i-093caac78cdbfa7e1), status: INSTANCE_INITIALIZING, workerEnvId:WorkerEnvId(workerenv-335698072713...

mehdi1
by New Contributor III
  • 12484 Views
  • 8 replies
  • 12 kudos

Resolved! How to programmatically create a widget?

I know about dbutils.widgets.text for creating a widget in a notebook. So for me the workflow is:
1. Have a notebook
2. Use dbutils.widgets.text (or another type of widget) once in a notebook cell to create a widget
3. Remove the cell containing dbutils.widget...

Latest Reply
RachelGomez123
New Contributor II
  • 12 kudos

@Mehdi BEN ABDESSELEM, Steps for Creating a Basic Widget. Step 1: Create a New Project. To create a new project in Android Studio, please refer to How to Create/Start a New Project in Android Studio. We are implementing it for both Java and Kotlin lang...

7 More Replies
Sam
by New Contributor III
  • 5589 Views
  • 3 replies
  • 6 kudos

Resolved! QuantileDiscretizer not respecting NumBuckets

I have set numBuckets and numBucketsArray for a group of columns to bin them into 5 buckets. Unfortunately, the number of buckets does not seem to be respected across all columns even though there is variation within them. I have tried setting the relat...

Latest Reply
Sam
New Contributor III
  • 6 kudos

Thank you. What I did was: apply QuantileDiscretizer to the non-zero values and specify a very small value (bottom 1%) to capture the lower bucket, including zeroes. That fixed the issue! You can define your own splits, which would work as well, but the splits thems...
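One likely reason for the missing buckets (not confirmed in the excerpt) is that a column dominated by zeros produces duplicate quantile boundaries, which then collapse. The sketch below shows the effect with Python's standard `statistics.quantiles` as a stand-in for Spark's QuantileDiscretizer, on made-up data. The `0.5` boundary and the `bucket` helper are assumptions chosen for illustration, in the spirit of the zero-bucket workaround described above.

```python
import bisect
import statistics

# Hypothetical column: 80 zeros plus the values 1..20.
values = [0] * 80 + list(range(1, 21))

# Asking for 5 quantile buckets yields 4 cut points, but the zeros make several coincide.
cuts = statistics.quantiles(values, n=5)
distinct_cuts = sorted(set(cuts))
print("cut points:", cuts)
print("usable buckets:", len(distinct_cuts) + 1)

# Workaround sketch: keep zeros in their own bucket and compute the remaining
# cut points from the non-zero values only.
non_zero = [v for v in values if v != 0]
splits = [0.5] + statistics.quantiles(non_zero, n=4)  # 0.5 separates zeros from the rest


def bucket(value):
    """Return the index of the bucket that value falls into."""
    return bisect.bisect_right(splits, value)


print("buckets used:", sorted({bucket(v) for v in values}))
```

With the zeros separated out, all five buckets end up populated for this sample.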

2 More Replies
RengarLee
by Contributor
  • 8411 Views
  • 9 replies
  • 6 kudos

Resolved! The Databricks-academy question

I'm taking the Data Engineering with Databricks course and have a question. If I run cmd4, it gives me an error. Course URL: https://customer-academy.databricks.com/learn/course/62/play/4290/providing-options-for-external-sources;lp=10 Chapter: DE 4....

Latest Reply
Panna
New Contributor II
  • 6 kudos

Same issue occurred to me

8 More Replies
Kash
by Contributor III
  • 3755 Views
  • 6 replies
  • 7 kudos

Where is Alerts in the sidebar?

Hi everyone, I can't seem to find Alerts in the sidebar; also, my Data Explorer looks different from what I see in the videos. Do I need to upgrade my environment? Thanks, K

Latest Reply
Kash
Contributor III
  • 7 kudos

Hi group, After speaking with my rep, it appears that Databricks Alerts is only for premium members, even though that is not what is advertised on the site or in the documentation. This is unfortunate, as data quality is a concern for us and we don't fe...

5 More Replies
chandan_a_v
by Valued Contributor
  • 13742 Views
  • 8 replies
  • 6 kudos

Resolved! logging.basicConfig not creating a file in Databricks

Hi, I am using the logger to log some parameters in my code and I want to save the file under DBFS. But for some reason the file is not getting created under DBFS. If I clear the state of the notebook and check the DBFS dir, then the file is present. Pleas...

Latest Reply
Anonymous
Not applicable
  • 6 kudos

Perhaps PyCharm sets a different working directory, meaning the file ends up in another place. Try providing a full path.
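A minimal sketch of the full-path suggestion follows. It writes to a temporary directory so it runs anywhere; on Databricks the target might instead be a path under /dbfs/, which is an assumption and not something the thread confirms. The `force=True` argument (Python 3.8+) is added because `basicConfig` does nothing when the root logger already has handlers, which may be a separate reason the file never appears in a notebook.

```python
import logging
import os
import tempfile

# A temporary directory stands in for the real target; on Databricks this might be
# a path under /dbfs/ (an assumption, not taken from the thread).
log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, "run.log")  # full, absolute path

# force=True (Python 3.8+) replaces handlers already attached to the root logger;
# without it basicConfig is a no-op when handlers already exist.
logging.basicConfig(filename=log_path, level=logging.INFO, force=True)

logging.info("parameter alpha=%s", 0.1)
logging.shutdown()  # flush and close the file handler

with open(log_path) as fh:
    print(fh.read())
```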

7 More Replies
merca
by Valued Contributor II
  • 11328 Views
  • 6 replies
  • 3 kudos

⬆ Bump IPython to 7.31.1

Any plans to bump IPython version to 7.31.1 on the DBR 9.1 LTS runtime? If no other motivation

Latest Reply
merca
Valued Contributor II
  • 3 kudos

Hi @Kaniz Fatma! I just checked the DBR 10.5 release notes and the IPython version listed there as installed is 7.22.0, but the version that is patched for this security issue is 7.31.1. Did I misread your last comment that it will be upgraded to safe versi...

5 More Replies
GeorgeP
by New Contributor II
  • 2513 Views
  • 2 replies
  • 2 kudos

Errors when querying Azure Databricks through DBeaver on macOS

Configured DBeaver to work with either the latest Databricks driver or Simba. I can connect and see databases, schemas, tables and columns. However, when a select statement is executed, 30-40 seconds go by before I get the following error message: SQL...

Latest Reply
sage5616
Valued Contributor
  • 2 kudos

Has this issue been resolved? @aravhish's solution did not help me. Any other options? I am experiencing the exact same issue with the same configuration on a Mac. Any help would be much appreciated.

1 More Replies
Ignacio33
by New Contributor II
  • 2049 Views
  • 2 replies
  • 1 kudos

"Backend services unavailable" when creating a default cluster

Hello all, I have not used Databricks CE for 4-5 months, but today I tried to create the default cluster, as I always did, and I got the error "Backend services unavailable". Is this a temporary problem or am I doing something wrong? Thanks ...

Latest Reply
tomasz
Databricks Employee
  • 1 kudos

There's currently an outage of the Community Edition. Please follow this link for status: https://status.databricks.com/pages/incident/5cf02dde58a00904bda41926/62cd7677db464e053416a89c

1 More Replies
Judha2022
by New Contributor III
  • 3300 Views
  • 4 replies
  • 2 kudos
Latest Reply
Judha2022
New Contributor III
  • 2 kudos

Could you please let me know when it is available? It is critically important for me to get Databricks CE. Thanks again for your reply.

3 More Replies
Mohit_m
by Databricks Employee
  • 15607 Views
  • 1 reply
  • 2 kudos

Resolved! Job is failing with exception ClientAuthenticationError: DefaultAzureCredential failed to retrieve a token from the included credentials.

ClientAuthenticationError: DefaultAzureCredential failed to retrieve a token from the included credentials.
Attempted credentials:
EnvironmentCredential: EnvironmentCredential authentication unavailable. Environment variables are not fully configured.
V...

Latest Reply
Mohit_m
Databricks Employee
  • 2 kudos

The docs below are for reference:
https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/identity/azure-identity/migration_guide.md
There was a suggestion to use
from azure.common.credentials import ServicePrincipalCredentials
instead of
from azure...

brickster_2018
by Databricks Employee
  • 6741 Views
  • 2 replies
  • 0 kudos

Resolved! The job fails with HTTP 403

My jobs that run for more than 48 hours are failing with an HTTP 403 error.

Latest Reply
willjoe
New Contributor III
  • 0 kudos

Check for URL errors and make sure you're specifying an actual web page file name and extension, not just a directory. Most websites are configured to disallow directory browsing, so a 403 Forbidden message when trying to display a folder instead of ...

1 More Replies
