Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

greyamber
by New Contributor II
  • 28846 Views
  • 4 replies
  • 0 kudos

Select job cluster vs all purpose cluster

I have a workflow that needs to run at a 1-minute interval; it makes a REST API call. Should I go for an all-purpose cluster or a job cluster to meet the SLA? We need to get the results as soon as they are available.

Latest Reply
kulkpd
Contributor
  • 0 kudos

@greyamber An interactive cluster costs about twice as much as a job cluster. Can you explain the use case: why the API needs to be invoked and what the API is doing?

3 More Replies
BricksGuy
by New Contributor III
  • 536 Views
  • 2 replies
  • 0 kudos

Extract DLT Pipeline Logs to a delta table

I want to export the DLT pipeline run details into a Delta table. I want a table with data like this.

[Attachment: BricksGuy_0-1727861395256.png]
Latest Reply
BricksGuy
New Contributor III
  • 0 kudos

Is there any way I can use an event hook to log into my own Delta table? If anyone has a working example, that would be great.
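Rather than an event hook, one common approach (sketched below as an assumption, not a confirmed answer from this thread) is to read the pipeline's event log and append it to your own Delta table. This assumes the pipeline was created with a storage location, where the event log lives under <storage>/system/events; the storage path and target table name are hypothetical placeholders.

from pyspark.sql import functions as F

pipeline_storage = "/pipelines/my_dlt_pipeline"   # hypothetical pipeline storage location
events = spark.read.format("delta").load(f"{pipeline_storage}/system/events")

(events
    .select("timestamp", "event_type", "level", "message", "details")
    .withColumn("ingested_at", F.current_timestamp())
    .write.mode("append")
    .saveAsTable("monitoring.dlt_pipeline_events"))   # hypothetical target table

Run from a regular (non-DLT) notebook or job, this can be scheduled after each pipeline run to keep the monitoring table current.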

1 More Replies
Tito
by New Contributor II
  • 781 Views
  • 2 replies
  • 0 kudos

VS Code Databricks Connect Cluster Configuration

I am currently setting up the VSCode extension for Databricks Connect, and it’s working fine so far. However, I have a question about cluster configurations. I want to access Unity Catalog from VSCode through the extension, and I’ve noticed that I ca...
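For context, a minimal Databricks Connect sketch, assuming Databricks Connect v2 (the databricks-connect package for DBR 13+) and a cluster whose access mode supports Unity Catalog; the host, token, cluster ID, and table name are hypothetical placeholders.

from databricks.connect import DatabricksSession

spark = (DatabricksSession.builder
         .remote(host="https://adb-1234567890123456.7.azuredatabricks.net",
                 token="dapiXXXXXXXXXXXXXXXX",
                 cluster_id="0123-456789-abcdefgh")
         .getOrCreate())

# Query a Unity Catalog table using the three-level namespace.
spark.sql("SELECT * FROM main.default.sample_table LIMIT 10").show()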

Latest Reply
ElvaCummings
New Contributor II
  • 0 kudos

Thank you

1 More Replies
aleknandrius
by New Contributor
  • 594 Views
  • 1 replies
  • 0 kudos

# Databricks notebook source throws FileNotFoundError: [Errno 2] No such file with PyCharm plugin

I started using the Databricks plugin in PyCharm. If I have this first line in my code: # Databricks notebook source ... Running such a notebook with the plugin on a cluster fails with the message: FileNotFoundError: [Errno 2] No such file or directory: '/Workspace/U...

Latest Reply
Brahmareddy
Honored Contributor III
  • 0 kudos

Hi @aleknandrius, how are you doing today? As per my understanding, it seems that the # Databricks notebook source line is causing confusion when running your PyCharm code on Databricks. This line is usually added to identify notebook cells, but in a P...
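For reference, a minimal example of the markers being discussed (the print statements are only placeholders): the first line is what Databricks writes at the top of a notebook exported in source format, and cells are separated by the COMMAND comment.

# Databricks notebook source

print("hello from the first cell")

# COMMAND ----------

print("hello from the second cell")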

AsfandQ
by New Contributor III
  • 18147 Views
  • 7 replies
  • 6 kudos

Resolved! Delta tables: Cannot set default column mapping mode to "name" in Python for delta tables

Hello, I am trying to write Delta files for some CSV data. When I do csv_dataframe.write.format("delta").save("/path/to/table.delta") I get: AnalysisException: Found invalid character(s) among " ,;{}()\n\t=" in the column names of your schema. Having look...

Latest Reply
Personal1
New Contributor II
  • 6 kudos

I still get the error when I try any method. The column names with spaces are throwing the error [DELTA_INVALID_CHARACTERS_IN_COLUMN_NAMES] Found invalid character(s) among ' ,;{}()\n\t=' in the column names of your schema. df1.write.format("delta") \ .mo...
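A minimal sketch of one commonly suggested workaround, offered as an assumption rather than a confirmed fix from this thread: set session-level defaults so that newly created Delta tables use column mapping mode "name", which allows spaces and other special characters in column names (column mapping requires reader version 2 and writer version 5 at minimum). The output path is a hypothetical placeholder.

spark.conf.set("spark.databricks.delta.properties.defaults.columnMapping.mode", "name")
spark.conf.set("spark.databricks.delta.properties.defaults.minReaderVersion", "2")
spark.conf.set("spark.databricks.delta.properties.defaults.minWriterVersion", "5")

(df1.write
    .format("delta")
    .mode("overwrite")
    .save("/path/to/table_with_spaces"))   # hypothetical path; creates a new table with the defaults above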

6 More Replies
PB-Data
by New Contributor III
  • 2209 Views
  • 5 replies
  • 0 kudos

Web Terminal

How can I use the web terminal within my Azure Databricks workspace if the workspace is provisioned with private endpoints, i.e. Allow Public Network Access is disabled? I have tried accessing the web terminal from the Apps tab and the bottom panel of a notebook. Th...

Latest Reply
gchandra
Databricks Employee
  • 0 kudos

Gotcha. See whether this link is helpful. https://learn.microsoft.com/en-us/azure/databricks/connect/storage/tutorial-azure-storage#grant-your-azure-databricks-workspace-access-to-azure-data-lake-storage-gen2

4 More Replies
NanthakumarYoga
by New Contributor II
  • 11971 Views
  • 2 replies
  • 2 kudos

Partition in Spark

Hi Community, I need your help understanding the topics below. I have a huge transaction file (20 GB), a parquet file partitioned by the transaction_date column. The data is evenly distributed (no skew). There are 10 days of data and we have 10 partition f...
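A quick way to inspect what actually drives the read-side partition count here, as a sketch with a hypothetical path: the number of input partitions depends mainly on file sizes and spark.sql.files.maxPartitionBytes, not on the number of transaction_date folders.

df = spark.read.parquet("/mnt/data/transactions")            # hypothetical path, partitioned by transaction_date
print(df.rdd.getNumPartitions())                             # input partitions Spark created for the scan
print(spark.conf.get("spark.sql.files.maxPartitionBytes"))   # max bytes per input split (128 MB by default)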

Latest Reply
Personal1
New Contributor II
  • 2 kudos

I read a .zip file in Spark and get unreadable data when I run show() on the data frame. When I check the number of partitions using df.rdd.getNumPartitions(), I get 8 (the number of cores I am using). Shouldn't the partition count be just 1 as I read...

1 More Replies
FedericoRaimond
by New Contributor III
  • 4120 Views
  • 10 replies
  • 3 kudos

Azure Databricks Workflows with Git Integration

Hello, I receive a very weird error when attempting to connect my workflow tasks to a remote Git repo on Azure DevOps. As per the documentation: "For a Git repository, the path relative to the repository root." So I directly use the name of the notebook file...

Latest Reply
nicole_lu_PM
Databricks Employee
  • 3 kudos

Hi Federico, The error in Error 1.png didn't look right. Since you already selected the git source for the job, you should be able to use a relative path. If you continue to run into this issue, can you please submit a support ticket if you have a Su...

9 More Replies
databrickser
by New Contributor
  • 1175 Views
  • 2 replies
  • 0 kudos

Updating records with auto loader

I want to ingest JSON files from an S3 bucket into a Databricks table using Auto Loader. A job runs every few hours to write the combined JSON data to the table. Some records might be updates to existing records, identifiable by a specific key. I want...

Latest Reply
filipniziol
Esteemed Contributor
  • 0 kudos

Hi @databrickser, in theory it is possible to use Auto Loader with foreachBatch to update existing records. The code snippet below shows a working solution: from datetime import datetime from pyspark.sql import DataFrame from pyspark.sql.functions impo...
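For readers of this thread, a minimal sketch along the lines the reply describes; the bucket paths, target table name, and merge key are hypothetical placeholders.

from delta.tables import DeltaTable
from pyspark.sql import DataFrame

def upsert_batch(batch_df: DataFrame, batch_id: int) -> None:
    # Merge each micro-batch into the target table on the business key.
    target = DeltaTable.forName(spark, "main.bronze.events")
    (target.alias("t")
        .merge(batch_df.alias("s"), "t.record_key = s.record_key")
        .whenMatchedUpdateAll()
        .whenNotMatchedInsertAll()
        .execute())

(spark.readStream.format("cloudFiles")                                   # Auto Loader source
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", "s3://my-bucket/_schemas/events")
    .load("s3://my-bucket/landing/events/")
    .writeStream
    .foreachBatch(upsert_batch)                                          # apply the MERGE per micro-batch
    .option("checkpointLocation", "s3://my-bucket/_checkpoints/events")
    .trigger(availableNow=True)                                          # fits the "runs every few hours" pattern
    .start())

If a single batch can contain several updates for the same key, deduplicate batch_df on that key (keeping the latest record) before the merge.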

1 More Replies
varonis_evgeniy
by New Contributor
  • 550 Views
  • 2 replies
  • 0 kudos

Single task job that runs SQL notebook, can't retrieve results

Hello, we are integrating Databricks and I need to run a job with a single task that runs a notebook with a SQL query in it. I can only use a SQL warehouse, not a cluster. I need to retrieve the result of the notebook task but I can't see the results. Is...

Data Engineering
dbutils
Notebook
sql
Latest Reply
adriennn
Valued Contributor
  • 0 kudos

> I need to retrieve a result of the notebook task
If you want to know whether the task run succeeded or not, you can enable the "lakeflow" system schema and you'll find the logs of job and task runs. You could then use the above info to execute a...
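A hedged sketch of that suggestion: once the lakeflow system schema is enabled, job and task run history can be queried like any other table. The table and column names below reflect the system.lakeflow schema as I understand it and should be verified in your workspace; the job ID is a hypothetical placeholder, and note this surfaces run status, not the notebook's query output.

runs = spark.sql("""
    SELECT run_id, result_state, period_start_time, period_end_time
    FROM system.lakeflow.job_run_timeline
    WHERE job_id = 123456789
    ORDER BY period_end_time DESC
    LIMIT 10
""")
runs.show(truncate=False)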

1 More Replies
sungsoo
by New Contributor
  • 370 Views
  • 1 replies
  • 0 kudos

AWS Role of NACL outbound 3306 port

When using Databricks on AWS, I need to open port 3306 in the NACL outbound rules of the subnet where the endpoint is located. I understand this is to communicate with the Databricks metastore on the instance. Am I right to understand this? If not, please let me ...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

You are correct, the intention of this port is to connect to the Hive metastore 

Brad
by Contributor II
  • 2347 Views
  • 1 replies
  • 0 kudos

How databricks assign memory and cores

Hi team, we are using a job cluster with a 128 GB memory + 16 core node type for a workflow. From the documentation we know one worker is one node and is one executor. From the Spark UI Environment tab we can see that spark.executor.memory is 24G, and from metrics we can see the m...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

Databricks allocates resources to executors on a node based on several factors, and it appears that your cluster configuration is using default settings since no specific Spark configurations were provided. Executor Memory Allocation: The spark.exec...
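As a quick sanity check (not a tuning recipe), the values the cluster actually granted can be inspected from a notebook attached to it; the "24g" noted in the comment is simply what this thread reports for that node type.

print(spark.conf.get("spark.executor.memory"))                    # e.g. 24g, as reported in this thread's Spark UI
print(spark.conf.get("spark.executor.cores", "(cluster default)"))
print(spark.sparkContext.defaultParallelism)                      # total task slots across the workers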

AxelM
by New Contributor
  • 478 Views
  • 1 replies
  • 0 kudos

Asset Bundles from Workspace for CI/CD

Hello there, I am exploring the possibilities for CI/CD from a DEV workspace to PROD. Besides the notebooks (which can easily be handled by the Git provider), I am mainly interested in the deployment of Jobs/Clusters/DDL... I can't find a tutorial anywhere ...

Latest Reply
datastones
Contributor
  • 0 kudos

I think the DAB MLOps Stacks template is pretty helpful re: how to bundle, schedule and trigger custom jobs: https://docs.databricks.com/en/dev-tools/bundles/mlops-stacks.html. You can bundle init it locally and it should give you the skeleton of how to bu...

balwantsingh24
by New Contributor II
  • 1733 Views
  • 3 replies
  • 0 kudos

Resolved! java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMeta

Please help me solve this issue; I need it on a very urgent basis.

[Attachment: Screenshot 2024-09-27 133729.png]
Latest Reply
saikumar246
Databricks Employee
  • 0 kudos

Hi @balwantsingh24, Internal Metastore: internal metastores are managed by Databricks and are typically used to store metadata about databases, tables, views, and user-defined functions (UDFs). This metadata is essential for operations like the SHOW...

2 More Replies
Frustrated_DE
by New Contributor III
  • 839 Views
  • 4 replies
  • 0 kudos

Delta live tables multiple .csv diff schemas

Hi all, I have a fairly straightforward task whereby I am looking to ingest six .csv files, all with different names, schemas and blob locations, into individual tables in one bronze schema. I have the files in my landing zone under different fol...

Latest Reply
Frustrated_DE
New Contributor III
  • 0 kudos

The code follows a similar pattern to the one below to load the different tables:
import dlt
import re
import pyspark.sql.functions as F
landing_zone = '/Volumes/bronze_dev/landing_zone/'
source = 'addresses'
@dlt.table(comment="addresses snapshot", name="addresses")
de...
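That per-table pattern generalizes to all six sources by creating the tables in a loop; a minimal sketch, where the folder names other than 'addresses' are hypothetical placeholders.

import dlt

landing_zone = "/Volumes/bronze_dev/landing_zone/"

def make_bronze_table(source: str):
    @dlt.table(name=source, comment=f"{source} snapshot")
    def _bronze():
        return (spark.readStream.format("cloudFiles")
                .option("cloudFiles.format", "csv")
                .option("cloudFiles.inferColumnTypes", "true")
                .load(f"{landing_zone}{source}"))

for src in ["addresses", "customers", "orders"]:   # hypothetical list of the six source folders
    make_bronze_table(src)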

3 More Replies
