Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

Biber
by New Contributor III
  • 3339 Views
  • 5 replies
  • 8 kudos

Resolved! Change schema when writing to the Delta format

Is it possible to reapply a schema to Delta files? For example, we have history in which a field is a string, but from some point on we need to replace the string with a struct. In my case, the merge option and overwrite schema don't work.

Latest Reply
Biber
New Contributor III
  • 8 kudos

Hi guys! Definitely, thank you for your support.

4 More Replies
natadhorcross
by New Contributor III
  • 1302 Views
  • 0 replies
  • 4 kudos

Hi, we encountered a timeout every 60 to 80 minutes during a long-running copy of JSON into Parquet files in Data Lake Gen2.

First, our process was triggered by Data Factory. The connection was first set up with token access, then with a managed service identity. We showed that the untimely timeout was not due to Data Factory by running the notebook directly. Secondly, we tried ...

elgeo
by Valued Contributor II
  • 1330 Views
  • 1 replies
  • 2 kudos

Moving data to a delta table keeping the old surrogate ids intact

Hello experts! We have a table in our current system that we need to move (one-off) to a Delta table in Databricks, keeping its IDs (surrogate keys) intact. We are thinking of the following steps: 1. create a new delta table with a "BIGINT GENERATED BY DEFAU...
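
A minimal sketch of the first step named in the post, assuming the truncated clause continues as `GENERATED BY DEFAULT AS IDENTITY` and that the remaining (cut-off) steps amount to copying the rows with their existing ids. The helper `migration_statements`, the table and column names (`target_tbl`, `source_tbl`, `id`, `name`) and the `STRING` column type are all hypothetical. The point it illustrates is that an identity column declared `BY DEFAULT` still accepts explicitly supplied values, which is what lets the old surrogate keys be written unchanged.

```python
from typing import List


def migration_statements(target: str, source: str, id_col: str, other_cols: List[str]) -> List[str]:
    """Build SQL for a one-off copy that keeps the source ids (hypothetical helper)."""
    # Identity column that still accepts explicitly inserted values.
    col_defs = [f"{id_col} BIGINT GENERATED BY DEFAULT AS IDENTITY"]
    # STRING is a placeholder type for the illustration only.
    col_defs += [f"{col} STRING" for col in other_cols]
    col_list = ", ".join([id_col] + other_cols)

    create = f"CREATE TABLE {target} ({', '.join(col_defs)}) USING DELTA"
    # Copy every row, listing the id column so the old values are written as-is.
    insert = f"INSERT INTO {target} ({col_list}) SELECT {col_list} FROM {source}"
    return [create, insert]


statements = migration_statements("target_tbl", "source_tbl", "id", ["name"])
for statement in statements:
    # In a Databricks notebook each statement could be passed to spark.sql(statement).
    print(statement)
```

After a load like this, the identity counter may not account for the explicitly inserted values, so it is worth checking the identity-column documentation before relying on newly generated ids.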

Latest Reply
lizou
Contributor II
  • 2 kudos

Same here, I submitted an idea in the Azure Databricks portal:
https://feedback.azure.com/d365community/idea/d403303c-6761-ed11-a81b-000d3ae5ae95
SET IDENTITY_INSERT ON
When a column is defined as GENERATED ALWAYS, we often need to reload data with exact...

Chris_Konsur
by New Contributor III
  • 1148 Views
  • 0 replies
  • 2 kudos

Schema supported by Autoloader

We do not want to use schema inference with schema evolution in Autoloader. Instead, we want to apply our own schema and use the merge option. Our schema is very complex, with multiple nested levels. When I apply this schema to Autoloader, it r...

asma
by New Contributor II
  • 1935 Views
  • 2 replies
  • 6 kudos

Resolved! new learner

I am absolutely new to Databricks. Can someone suggest the best way to learn at a quick pace?

Latest Reply
reno
New Contributor II
  • 6 kudos

In addition to paid content, there are a bunch of really good free courses on Databricks Academy now: https://customer-academy.databricks.com/learn/catalog?ctldoc-catalog-0=t-_%22learning_plan%22~p-0 Just pick the Data engineer or the Data analyst lear...

1 More Replies
Pat
by Honored Contributor III
  • 2477 Views
  • 3 replies
  • 19 kudos

Resolved! UC - Service Principal/Terraform

Hi, do you know if there is a way to create a Unity Catalog metastore using a Service Principal? Here I can see that for creating account-level resources we need to provide a user and password (https://registry.terraform.io/providers/databricks/databricks/...

Latest Reply
Pat
Honored Contributor III
  • 19 kudos

This is supported right now on Azure but not yet on AWS; there is a plan for AWS support as well.

2 More Replies
lizou
by Contributor II
  • 3381 Views
  • 3 replies
  • 5 kudos

Resolved! Identity column definition lost using save as table

I found an issue: for a table with an identity column defined, when the table column is renamed using this method, the identity definition is removed. That means using an identity column in a table requires extra attention to check whether the ide...

Latest Reply
lizou
Contributor II
  • 5 kudos

To avoid reloading the table, I found we can upgrade the table version and use the rename column command:
ALTER TABLE test_id2 SET TBLPROPERTIES ('delta.columnMapping.mode' = 'name', 'delta.minReaderVersion' = '2', 'delta.minWriterVersion' = '6')
ALTER TABLE ...
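
The two-step approach in this reply can be sketched as a small helper that emits both statements. This is only an illustration: `rename_column_statements` is a hypothetical helper, `old_id` and `new_id` are made-up column names (the reply's second statement is cut off), and the property values are copied from the reply rather than checked against a particular runtime.

```python
from typing import List


def rename_column_statements(table: str, old_name: str, new_name: str) -> List[str]:
    """Build the two ALTER TABLE statements described in the reply (hypothetical helper)."""
    # Step 1: switch the table to name-based column mapping (values taken from the reply).
    enable_column_mapping = (
        f"ALTER TABLE {table} SET TBLPROPERTIES ("
        "'delta.columnMapping.mode' = 'name', "
        "'delta.minReaderVersion' = '2', "
        "'delta.minWriterVersion' = '6')"
    )
    # Step 2: rename in place instead of rewriting the table.
    rename = f"ALTER TABLE {table} RENAME COLUMN {old_name} TO {new_name}"
    return [enable_column_mapping, rename]


# old_id and new_id are made-up column names.
rename_statements = rename_column_statements("test_id2", "old_id", "new_id")
for statement in rename_statements:
    print(statement)
```

Raising the reader and writer versions affects which clients can access the table, so it is probably worth trying this on a copy first.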

2 More Replies
164079
by Contributor II
  • 3395 Views
  • 3 replies
  • 1 kudos

Resolved! unable to add new instance profile

Hi team, I want to start adding more instance profiles, one per team. When adding one via TF, I'm getting the below error: I'm able, BTW, to add and change other Databricks resources via TF. This is my new code block: The new role was created by TF but wasn't ad...

Latest Reply
164079
Contributor II
  • 1 kudos

Thank you @Vivian Wilfred, all OK now with the Databricks console and TF. Have a great day!

2 More Replies
User16752242161
by New Contributor II
  • 571 Views
  • 0 replies
  • 1 kudos

Hi, I am a Solutions Architect at Databricks Northern Europe. I had a deep dive demo session at Data + AI World Tour in London on Nov 2, 2022. The title was "How to Build Your Modern Data Stack on Databricks to Solve Modern Problems". In the above-mention...

anandstarz
by New Contributor II
  • 1644 Views
  • 4 replies
  • 1 kudos

Certificate not received

I passed the Databricks Certified Data Engineer Associate exam on 29 October 2022 but still haven't received the certificate. Kindly help me with that.

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Anandhakumar R, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please visit the ...

3 More Replies
Amol
by New Contributor II
  • 1637 Views
  • 4 replies
  • 0 kudos

I passed my Data Analyst Associate exam on 30th October but still have not received my certificate. It's been more than 4 days now; however, Databricks mentioned in an email that it would be delivered within 24 hours. Can anyone help with this?

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Amol Metkar, glad to hear that. Thanks for letting us know!

3 More Replies
Anonymous
by Not applicable
  • 8765 Views
  • 8 replies
  • 7 kudos

Resolved! data frame takes unusually long time to write for small data sets

We have configured a workspace with our own VPC. We need to extract data from DB2 and write it in Delta format. We tried this for 550k records with 230 columns, and it took 50 minutes to complete the task. 15 million records take more than 18 hours. Not sure why this takes suc...

Latest Reply
elgeo
Valued Contributor II
  • 7 kudos

Hello. We face exactly the same issue. Reading is quick but writing takes a long time. Just to clarify, it is a table with only 700k rows. Any suggestions please? Thank you
remote_table = spark.read.format("jdbc") \
    .option("driver", "com...
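
One possible explanation, not confirmed anywhere in this thread, is that Spark evaluates lazily: a JDBC read without partitioning options runs as a single task, and the cost of pulling the rows only shows up when the write triggers execution. The sketch below builds the standard Spark JDBC partitioning options (`partitionColumn`, `lowerBound`, `upperBound`, `numPartitions`, `fetchsize`). The helper `partitioned_jdbc_options`, the URL, the table name, the column `ID` and the bounds are hypothetical placeholders.

```python
from typing import Dict


def partitioned_jdbc_options(
    url: str,
    table: str,
    partition_column: str,
    lower_bound: int,
    upper_bound: int,
    num_partitions: int,
    fetch_size: int = 10000,
) -> Dict[str, str]:
    """Return Spark JDBC options for a parallel read (hypothetical helper)."""
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    if lower_bound >= upper_bound:
        raise ValueError("lower_bound must be smaller than upper_bound")
    return {
        "url": url,
        "dbtable": table,
        # Numeric, date or timestamp column used to split the read into ranges.
        "partitionColumn": partition_column,
        "lowerBound": str(lower_bound),
        "upperBound": str(upper_bound),
        "numPartitions": str(num_partitions),
        # Rows fetched per round trip to the database.
        "fetchsize": str(fetch_size),
    }


# Placeholder connection details; nothing here comes from the thread.
jdbc_options = partitioned_jdbc_options(
    "jdbc:db2://example-host:50000/EXAMPLEDB", "MYSCHEMA.MYTABLE", "ID", 1, 700000, 8
)
print(jdbc_options)

# With a Spark session available, the options could be used like this:
# remote_table = spark.read.format("jdbc").options(**jdbc_options).load()
# remote_table.write.format("delta").mode("overwrite").saveAsTable("my_delta_table")
```

The bounds only decide how the ranges are split; they do not filter rows, so values outside them still land in the first or last partition.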

7 More Replies
RiyazAli
by Valued Contributor
  • 6381 Views
  • 3 replies
  • 7 kudos

Resolved! Converting a transformation written in Spark Scala to PySpark

Hello all, I've been tasked with converting Scala Spark code to PySpark with minimal changes (kind of a literal translation). I've come across some code that claims to be a list comprehension. Look below for the code snippet:
%scala
val desiredColumn = Seq("f...

Latest Reply
RiyazAli
Valued Contributor
  • 7 kudos

Another follow-up question, if you don't mind, @Pat Sienkiewicz. As I was trying to parse the name column into multiple columns, I came across the data below: ("James,\"A,B\", Smith", "2018", "M", 3000) In order to parse these comma-included middle na...
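
A plain-Python sketch of the parsing problem described here, not the PySpark solution from the thread (which is cut off). It uses the standard library `csv` module, which treats a comma inside double quotes as part of the field, so the quoted middle name stays in one piece. The helper name `split_name` is hypothetical.

```python
import csv
from typing import List


def split_name(raw: str) -> List[str]:
    """Split a comma-separated name, keeping quoted commas inside one field."""
    # skipinitialspace drops the blank after each comma, so " Smith" becomes "Smith".
    return next(csv.reader([raw], skipinitialspace=True))


name_parts = split_name('James,"A,B", Smith')
print(name_parts)  # ['James', 'A,B', 'Smith']
```

A naive `raw.split(",")` would return four pieces for the same input, because it cuts the quoted middle name in two.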

2 More Replies
Anonymous
by Not applicable
  • 1986 Views
  • 4 replies
  • 19 kudos

Resolved! How to prepare for Databricks Data Engineer Professional

Hi all, could you please suggest some resources to prepare for the "Databricks Data Engineer Professional" exam? I have also taken the course in Databricks Academy, but it seems not enough for this exam. Thank you so much!!! Best regards, Nhan Nguyen

Latest Reply
Unforgiven
Valued Contributor III
  • 19 kudos

Waiting for someone to post more details and an experience-based road map for taking the exam.

3 More Replies
