Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Justine_Bieber
by New Contributor III
  • 5925 Views
  • 5 replies
  • 8 kudos

Resolved! Change schema when writing to the Delta format

Is it possible to reapply a schema to Delta files? For example, our history stores a field as a string, but from some point on we need to replace the string with a struct. In my case, the merge option and overwrite schema don't work.

Latest Reply
Justine_Bieber
New Contributor III
  • 8 kudos

Hi guys! Definitely, thank you for your support.

4 More Replies
natadhorcross
by New Contributor III
  • 1875 Views
  • 0 replies
  • 4 kudos

Hi, we encountered a timeout every (60 / 80 minutes) during a long-running copy of JSON into Parquet files in Data Lake Gen 2.

First, our process was triggered by Data Factory. The connection was initially set up with token access, then with managed service identity. We proved the untimely timeout was not caused by Data Factory by running the notebook directly. Secondly, we tried ...

elgeo
by Valued Contributor II
  • 2146 Views
  • 1 replies
  • 2 kudos

Moving data to a delta table keeping the old surrogate ids intact

Hello experts! We have a table in our current system that we need to move (one-off) to a Delta table in Databricks, keeping its IDs (surrogate keys) intact. We are thinking of the following steps: 1. create a new Delta table with a "BIGINT GENERATED BY DEFAU...

Latest Reply
lizou
Contributor III
  • 2 kudos

Same here, I submitted an idea in the Azure Databricks portal: https://feedback.azure.com/d365community/idea/d403303c-6761-ed11-a81b-000d3ae5ae95 SET IDENTITY_INSERT ON: when a column is defined as GENERATED ALWAYS, we often need to reload data with exact...

Chris_Konsur
by New Contributor III
  • 1793 Views
  • 0 replies
  • 2 kudos

Schema supported by Autoloader

We do not want to use schema inference with schema evolution in Autoloader. Instead, we want to apply our own schema and use the merge option. Our schema is very complex, with multiple nested levels. When I apply this schema to Autoloader, it r...

asma
by New Contributor II
  • 2758 Views
  • 2 replies
  • 6 kudos

Resolved! new learner

I am absolutely new to Databricks. Can someone suggest the best medium to learn at a quick pace?

Latest Reply
reno
New Contributor II
  • 6 kudos

In addition to paid content, there are a bunch of really good free courses on Databricks Academy now: https://customer-academy.databricks.com/learn/catalog?ctldoc-catalog-0=t-_%22learning_plan%22~p-0 Just pick the Data engineer or the Data analyst lear...

1 More Replies
Pat
by Esteemed Contributor
  • 4006 Views
  • 3 replies
  • 19 kudos

Resolved! UC - Service Principal/Terraform

Hi, do you know if there is a way to create a Unity Catalog metastore using a Service Principal? Here I can see that for creating account-level resources we need to provide a user and password (https://registry.terraform.io/providers/databricks/databricks/...

Latest Reply
Pat
Esteemed Contributor
  • 19 kudos

This is supported right now in Azure but not yet in AWS; there is a plan for AWS support as well.

2 More Replies
lizou
by Contributor III
  • 4867 Views
  • 3 replies
  • 5 kudos

Resolved! Identity column definition lost using save as table

I found an issue: for a table with an identity column defined, when the table column is renamed using this method, the identity definition will be removed. That means using an identity column in a table requires extra attention to check whether the ide...

Latest Reply
lizou
Contributor III
  • 5 kudos

To avoid reloading the table, I found we can upgrade the table version and use the rename column command:
ALTER TABLE test_id2 SET TBLPROPERTIES (
  'delta.columnMapping.mode' = 'name',
  'delta.minReaderVersion' = '2',
  'delta.minWriterVersion' = '6')
ALTER TABLE ...

2 More Replies
164079
by Contributor II
  • 6405 Views
  • 3 replies
  • 1 kudos

Resolved! unable to add new instance profile

Hi team, I want to start adding more instance profiles, one per team. When adding one via TF, I'm getting the below error: I'm able, BTW, to add and change other Databricks resources via TF. This is my new code block: The new role was created by TF but wasn't ad...

Latest Reply
164079
Contributor II
  • 1 kudos

Thank you @Vivian Wilfred, all OK now, with the Databricks console and the TF. Have a great day!

2 More Replies
User16752242161
by Databricks Employee
  • 2083 Views
  • 0 replies
  • 1 kudos

Hi, I am a Solutions Architect at Databricks Northern Europe. I had a deep dive demo session at Data + AI World Tour in London on Nov 2, 2022. The title...

Hi, I am a Solutions Architect at Databricks Northern Europe. I had a deep dive demo session at Data + AI World Tour in London on Nov 2, 2022. The title was How to Build Your Modern Data Stack on Databricks to Solve Modern Problems. In the above-mention...

anandstarz
by New Contributor II
  • 2652 Views
  • 4 replies
  • 1 kudos

Certificate not received

I passed the Databricks Certified Data Engineer Associate exam on 29 October 2022 but still haven't received the certificate. Kindly help me with that.

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Anandhakumar R, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please visit the ...

3 More Replies
Amol
by New Contributor II
  • 2624 Views
  • 4 replies
  • 0 kudos

I passed my data analyst associate exam on 30th October but still have not received my certificate. It's been more than 4 days now, however Dat...

I passed my data analyst associate exam on 30th October but still have not received my certificate. It's been more than 4 days now, however Databricks mentioned in an email that it would be delivered within 24 hours. Can anyone help with this?

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Amol Metkar, glad to hear that. Thanks for letting us know!

3 More Replies
RiyazAliM
by Honored Contributor
  • 9001 Views
  • 3 replies
  • 7 kudos

Resolved! Converting a transformation written in Spark Scala to PySpark

Hello all, I've been tasked with converting Scala Spark code to PySpark with minimal changes (kind of a literal translation). I've come across some code that claims to be a list comprehension. See the code snippet below: %scala val desiredColumn = Seq("f...

Latest Reply
RiyazAliM
Honored Contributor
  • 7 kudos

Another follow-up question, if you don't mind, @Pat Sienkiewicz. As I was trying to parse the name column into multiple columns, I came across the data below: ("James,\"A,B\", Smith", "2018", "M", 3000) In order to parse these comma-included middle na...

2 More Replies
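
For the quoted-comma name in the follow-up above, one possible approach outside Spark is Python's standard csv module, which treats a comma inside double quotes as part of the field. This is only a minimal plain-Python sketch, not the accepted answer from the thread. The helper name split_quoted_fields is hypothetical, and the sample string is the name value from the post with its escapes resolved.

```python
import csv


def split_quoted_fields(raw):
    """Split a comma-separated string, keeping commas inside double quotes."""
    # skipinitialspace=True drops the blank that follows each delimiter,
    # so ' Smith' comes back as 'Smith'.
    reader = csv.reader([raw], delimiter=",", quotechar='"', skipinitialspace=True)
    return next(reader)


# Name value from the post: James,"A,B", Smith
name = 'James,"A,B", Smith'
first, middle, last = split_quoted_fields(name)
print(first, middle, last, sep=" | ")
```

In PySpark the same split would have to be applied per row, for example inside a UDF. That is an assumption here and not something the thread confirms.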
Anonymous
by Not applicable
  • 4364 Views
  • 4 replies
  • 19 kudos

Resolved! How to prepare for Databricks Data Engineer Professional

Hi all, could you please suggest some resources to prepare for the "Databricks Data Engineer Professional" exam? I have also taken the course in Databricks Academy, but it seems not to be enough for this exam. Thank you so much!!! Best Regards, Nhan Nguyen

Latest Reply
Unforgiven
Valued Contributor III
  • 19 kudos

Waiting for someone to post more details and an experience-based roadmap for taking the exam.

3 More Replies
Bujji
by New Contributor II
  • 6730 Views
  • 1 replies
  • 3 kudos

How to resolve out of memory error?

Hi, I am working as an Azure support engineer. I found this error while checking a pipeline failure: "org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 72403.0 failed 4 times, most recent fail...

Latest Reply
Pat
Esteemed Contributor
  • 3 kudos

Hi @mahesh bmk, it would be nice to see the sql_query. Is there some window function used? You might try to run this on a bigger cluster.
