Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

DebanjanB
by New Contributor
  • 408 Views
  • 0 replies
  • 0 kudos

Autoloader processing semantics

Queue-based Autoloader processes files in the order they are received only while the job is up and running. However, when the job is down, the files that queue up are processed in lexical order once the job is back up. Since autoloader jobs need to be stopp...
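
To make the difference between the two orderings concrete, here is a minimal sketch in plain Python. It does not touch Auto Loader itself; the file names, timestamps, and the `sortable_name` helper are all hypothetical and only illustrate why lexical order can differ from arrival order, and how a sortable timestamp prefix (if the producer controls file names) would make the two agree.

```python
from datetime import datetime, timedelta

# Hypothetical file names and arrival times, made up for illustration.
start = datetime(2024, 6, 1, 12, 0, 0)
arrivals = [
    ("events_9.json", start),
    ("events_10.json", start + timedelta(minutes=1)),
    ("events_11.json", start + timedelta(minutes=2)),
]

# Order in which the files actually arrived.
arrival_order = [name for name, ts in sorted(arrivals, key=lambda a: a[1])]

# Lexical (string) order, as the post describes for a backlog of files.
lexical_order = sorted(name for name, _ in arrivals)


def sortable_name(name: str, ts: datetime) -> str:
    """Hypothetical naming scheme: prefix the name with a sortable timestamp."""
    return f"{ts:%Y%m%dT%H%M%S}_{name}"


# With the prefix, sorting the names as strings reproduces arrival order.
prefixed_lexical = sorted(sortable_name(name, ts) for name, ts in arrivals)

print(arrival_order)
print(lexical_order)
print(prefixed_lexical)
```

In this sketch `events_10.json` sorts before `events_9.json` as a string even though it arrived later, which is the kind of reordering the post is concerned about.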

whleeman
by New Contributor III
  • 481 Views
  • 0 replies
  • 0 kudos

Does Databricks get cached result for a subquery?

If I run the query "SELECT fare_amount FROM nyctaxi.trips where fare_amount > 1.5", the query results will be cached for 24 hours. I then compose a second query using the previous query as a subquery: "SELECT * FROM nyctaxi.trips WHERE fare_amount IN...

Tidaldata
by New Contributor
  • 631 Views
  • 0 replies
  • 0 kudos

Loving Databricks Summit

Loving the summit so far, awesome keynote speakers, great trainers and paid courses. Finished certification #databrickslearning

JohanBringsdal
by New Contributor
  • 613 Views
  • 0 replies
  • 0 kudos

Migrating old solution to new optimal delta lake setup

Hi Databricks community! I have previously worked on a project that could easily be optimized with Databricks. It is currently running on Azure Synapse, but the premise is the same. I'll describe the scenario here: 1. Data owners send a constant flow of ...

christo_M
by New Contributor
  • 836 Views
  • 1 replies
  • 0 kudos

Data Governance

How can I propagate a deletion to all tables when a customer requests to be removed from the database as part of GDPR compliance?

Latest Reply
Datajoe
Contributor
  • 0 kudos

We use a Python script that enables and removes access to tables based on role group, but it can work per user as well. We also have a script that removes all access; it can be executed in seconds.
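
For the deletion question itself (as opposed to access control), one possible approach is to keep a registry of every table that holds customer data and generate one parameterized `DELETE` per table. The sketch below is only an illustration under stated assumptions: the table names, column names, and the `build_delete_statements` helper are hypothetical, and the `:customer_id` marker is left unbound so the caller supplies the value at run time.

```python
from typing import Dict, List

# Hypothetical registry: table name -> column holding the customer identifier.
CUSTOMER_TABLES: Dict[str, str] = {
    "sales.orders": "customer_id",
    "crm.contacts": "customer_id",
    "support.tickets": "requester_id",
}


def build_delete_statements(tables: Dict[str, str]) -> List[str]:
    """Return one parameterized DELETE per registered table.

    The :customer_id marker is bound by the caller, not interpolated here.
    """
    return [
        f"DELETE FROM {table} WHERE {column} = :customer_id"
        for table, column in sorted(tables.items())
    ]


statements = build_delete_statements(CUSTOMER_TABLES)
for stmt in statements:
    print(stmt)
```

On a recent Spark runtime, each statement could presumably be run with something like `spark.sql(stmt, args={"customer_id": value})`. Note that with Delta tables a `DELETE` only removes rows from the current table version; older data files remain until they are cleaned up with `VACUUM`, so a full erasure workflow would need to account for that as well.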

matmat13
by New Contributor II
  • 12661 Views
  • 17 replies
  • 10 kudos

Resolved! Lakehouse Fundamentals Certificate/Badge not appearing

Hello! I just passed the Lakehouse Fundamentals Accreditation and I haven't received any badge or certificate for it. I understand that I need to go to credentials.databricks.com but it is not there. How long before it appears? Need help

Latest Reply
andy25
New Contributor II
  • 10 kudos

Hello, can you help me? I have the same problem. I don't have the certificate available on credentials.databricks.com.

16 More Replies
teja_7
by New Contributor
  • 538 Views
  • 0 replies
  • 0 kudos

DataSummit 2023

Happy to be part of the data summit 2023. Wondering if DBSQL is enabled for DLT tables

Bollam
by New Contributor II
  • 521 Views
  • 0 replies
  • 0 kudos

Unity Catalog and data governance

Attended the Data and AI Summit 2023 and gained insights into Unity Catalog and the services it has to offer. Definitely going to try the data governance features, as they're a game changer.

Ankith
by New Contributor
  • 2806 Views
  • 1 replies
  • 1 kudos

Converting a column of XML strings to a column of JSON

Hi, I want to convert a column of XML strings to a column of JSON in PySpark. Using `withColumn` with an xmltodict method as a UDF gives JSON with '=' instead of ':' in the dictionary. Please let me know if there is an alternative.

Latest Reply
DessertKid
New Contributor II
  • 1 kudos

To convert a column of XML strings to a column of JSON in PySpark, you can use the `from_json` function along with the `xmltodict` library. However, instead of using a UDF with `withColumn`, you can use the `select` function to transform the column.
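
One likely cause of the '=' output is that the UDF returns a Python dict while being declared as a string column, so Spark renders the map in its own text form instead of JSON. A minimal sketch of a workaround is to serialize to a JSON string inside the function. To keep the example self-contained it uses the standard library's `xml.etree.ElementTree` in place of `xmltodict`; the helper names `element_to_dict` and `xml_to_json` are hypothetical, and XML attributes are ignored for brevity.

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(elem):
    """Recursively convert an XML element into plain dicts, lists, and strings."""
    children = list(elem)
    if not children:
        return elem.text
    result = {}
    for child in children:
        value = element_to_dict(child)
        if child.tag in result:
            # Repeated tags are collected into a list.
            existing = result[child.tag]
            if not isinstance(existing, list):
                result[child.tag] = [existing]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


def xml_to_json(xml_string):
    """Return a JSON string for one XML document, or None for a null input."""
    if xml_string is None:
        return None
    root = ET.fromstring(xml_string)
    # json.dumps produces real JSON (with ':'), not a dict's string form.
    return json.dumps({root.tag: element_to_dict(root)})


print(xml_to_json("<person><name>Ann</name><age>30</age></person>"))
```

In PySpark, a function like this could then be wrapped with `udf(xml_to_json, StringType())` and applied with `withColumn`, so the resulting column holds JSON strings rather than a stringified map.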

AlanF
by New Contributor II
  • 357 Views
  • 0 replies
  • 0 kudos

Great Summit

Having a great time at the community hub at the Summit. Highly recommend!

Lakehouse
by New Contributor
  • 1758 Views
  • 2 replies
  • 0 kudos

Is Rust the future of analytics?

Today I walked into a session that talked about a fairly new language: Rust. The name can mislead you. I believe taking a look at the roots of how to best use CPU cycles is a game changer, and Rust is traversing new areas that others might have ignor...

Latest Reply
Tom-Coffin
New Contributor II
  • 0 kudos

Yes, Rust is definitely part of the future.  It brings performance and simplicity to us.  I think it will add to the community, rather than replacing.  Scala and R will never go away, Python will always be strong, but Rust gives us one other tool in ...

1 More Replies
