Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

RyanHager
by Contributor
  • 2991 Views
  • 5 replies
  • 2 kudos

Are there any plans to add functions on the PARTITIONED BY fields of a Delta table definition, such as day()? A similar capability exists in Iceberg.

Benefit: this would simplify the WHERE clauses for consumers of the tables: they could query on the main date field to get all the data for a day, instead of on an extra day field we had to create.

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 2 kudos

@Ryan Hager, yes, it is possible using auto-generated columns, available since Delta Lake 1.2. For example, you can automatically generate a date column (for partitioning the table by date) from the timestamp column; any writes into the table need only specify t...
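
A minimal sketch of the generated-column approach the reply describes; the table and column names are illustrative, not from the thread:

    spark.sql("""
        CREATE TABLE events (
            event_id   BIGINT,
            event_ts   TIMESTAMP,
            -- derived automatically from event_ts on every write
            event_date DATE GENERATED ALWAYS AS (CAST(event_ts AS DATE))
        )
        USING DELTA
        PARTITIONED BY (event_date)
    """)

Writes then only need to supply event_ts, and filters on event_ts can still benefit from partition pruning on event_date.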

4 More Replies
Ajay-Pandey
by Esteemed Contributor III
  • 1607 Views
  • 0 replies
  • 5 kudos

Notebook cell output results limit increased - 10,000 rows or 2 MB

Hi all, Databricks now shows the first 10,000 rows instead of 1,000. That will reduce the time of re-execution while working on smaller sizes of data that have rows between 100...

hare
by New Contributor III
  • 2832 Views
  • 4 replies
  • 3 kudos

Implementation of Late arriving dimension in databricks

Hi Team, can you please suggest how to implement a late-arriving dimension or early-arriving fact, with examples or any sample script for reference? I have to implement this using PySpark. Thanks.
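
One common PySpark pattern for early-arriving facts, sketched here under assumed table and column names (facts_stg, dim_customer, customer_id), is to insert inferred placeholder rows into the dimension so no fact records are dropped:

    from pyspark.sql import functions as F

    facts = spark.table("facts_stg")
    dim = spark.table("dim_customer")

    # Dimension keys referenced by facts but not yet present in the dimension
    missing = (facts.select("customer_id").distinct()
                    .join(dim.select("customer_id"), "customer_id", "left_anti"))

    # Insert placeholder ("inferred") members; the real attributes overwrite
    # them later through the normal SCD update when the dimension row arrives
    (missing
        .withColumn("customer_name", F.lit("UNKNOWN"))
        .withColumn("is_inferred", F.lit(True))
        .write.format("delta").mode("append").saveAsTable("dim_customer"))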

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Hare Krishnan, hope all is well! Just wanted to check in if you were able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Than...

3 More Replies
none_ranjeet
by New Contributor III
  • 2992 Views
  • 3 replies
  • 2 kudos

Resolved! Passed the Fundamentals of the Databricks Lakehouse Platform Accreditation, but no badge received. Tried "https://v2.accounts.accredible.com/retrieve-credentials?", which shows no badge.

Passed the Fundamentals of the Databricks Lakehouse Platform Accreditation, but no badge received. Tried "https://v2.accounts.accredible.com/retrieve-credentials?", which shows no badge.

Latest Reply
Chaitanya_Raju
Honored Contributor
  • 2 kudos

Hi @Ranjeet Ahlawat, congratulations on the certification. For any certification you take with Databricks, you will receive the certificate and the badge within 24-48 hours, and sometimes sooner. All the best for your future certifi...

2 More Replies
asami34
by New Contributor II
  • 3583 Views
  • 7 replies
  • 0 kudos

Cannot reset password, no support

I cannot log in to my Databricks Community account. I have already tried to get support, and no real support has been given. When I attempt to reset my password, the link gets sent, but once I enter the new password the page gets stuck permanently loading. I...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Ahmet Korkmaz, hope all is well! Just wanted to check in if you were able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Than...

6 More Replies
sujai_sparks
by New Contributor III
  • 15410 Views
  • 14 replies
  • 15 kudos

Resolved! How to convert records in an Azure Databricks Delta table to a nested JSON structure?

Let's say I have a Delta table in Azure Databricks that stores staff details (denormalized). I want to export the data in JSON format and save it as a single file in a storage location. I need help with the Databricks SQL query to group/co...
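
A hedged sketch of one way to do this in PySpark rather than pure SQL; the table and column names (staff_details, department, employee_id, name, role) are invented for illustration:

    from pyspark.sql import functions as F

    staff = spark.table("staff_details")

    # Collapse the denormalized rows into one nested record per department
    nested = (staff.groupBy("department")
                   .agg(F.collect_list(F.struct("employee_id", "name", "role"))
                         .alias("employees")))

    # coalesce(1) forces a single output file; only sensible for small exports
    nested.coalesce(1).write.mode("overwrite").json("/mnt/exports/staff_json")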

Latest Reply
NateAnth
Databricks Employee
  • 15 kudos

Glad it worked for you!!

13 More Replies
Shanthala
by New Contributor III
  • 1566 Views
  • 3 replies
  • 3 kudos

Where is the learning material for the Fundamentals of the Databricks Lakehouse Platform Accreditation?

Please provide some information about how to get the material to pass the Fundamentals of the Databricks Lakehouse Platform Accreditation.

Latest Reply
jose_gonzalez
Databricks Employee
  • 3 kudos

Hi @Shanthala Baleer, just a friendly follow-up. Are you still looking for help? Adding @Vidula Khanna for visibility.

2 More Replies
DavidMayer-Foul
by New Contributor II
  • 1036 Views
  • 1 reply
  • 0 kudos

How to restart the Snowflake connector?

After using spark.read.format("snowflake").options(**options).option("dbtable", "table_name").load() to read a table from Snowflake, when I then change the table in Snowflake and read it again, it gives me the first version of the table. I have wor...
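
If the stale result comes from Spark caching rather than the connector itself (an assumption, not a confirmed diagnosis), one workaround is to clear the cache before re-reading:

    # Drop any cached data so the next load reflects Snowflake's current state
    spark.catalog.clearCache()

    df = (spark.read.format("snowflake")
              .options(**options)              # same connection options as above
              .option("dbtable", "table_name")
              .load())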

Latest Reply
DavidMayer-Foul
New Contributor II
  • 0 kudos

Yes, that would work. However, it is a longish Snowflake query producing a number of tables that are all called by the Databricks notebook, so it requires quite a few changes. I'll use this alternative if I automate the process. However, I think this...

EmilioGC
by New Contributor III
  • 4982 Views
  • 5 replies
  • 7 kudos

Resolved! Why was SQL formatting removed inside spark.sql functions? Now it looks like a plain string.

Previously we were able to see SQL queries inside spark.sql() formatted like this: But now it just looks like a plain string. I know it's not a big issue, but it's still annoying to have to code in SQL while having it all be blue; it makes debugging more cumber...

[Screenshots: old format vs. new format]
Latest Reply
jose_gonzalez
Databricks Employee
  • 7 kudos

Hi @Emilio Garza, just a friendly follow-up. Did any of the responses help you resolve your question? If so, please mark that answer as best. Otherwise, please let us know if you still need help.

4 More Replies
Kash
by Contributor III
  • 3295 Views
  • 4 replies
  • 0 kudos

Creating a spot-only, single-node job compute cluster policy

Hi there, I need some help creating a new cluster policy that uses a single spot-instance server to complete a job. I want to set this up as job compute to reduce costs and also use one spot instance. The jobs I need to ETL are very short and c...
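
A rough sketch of such a policy created through the Cluster Policies REST API; the attribute paths follow the AWS policy documentation, and the workspace URL, token, and policy name are placeholders to verify against your environment:

    import json
    import requests

    definition = {
        # Restrict the policy to job clusters
        "cluster_type": {"type": "fixed", "value": "job"},
        # Single node: zero workers plus the singleNode profile and tag
        "num_workers": {"type": "fixed", "value": 0},
        "spark_conf.spark.databricks.cluster.profile":
            {"type": "fixed", "value": "singleNode"},
        "custom_tags.ResourceClass": {"type": "fixed", "value": "SingleNode"},
        # Spot only, with no on-demand instances
        "aws_attributes.availability": {"type": "fixed", "value": "SPOT"},
        "aws_attributes.first_on_demand": {"type": "fixed", "value": 0},
    }

    resp = requests.post(
        "https://<workspace-url>/api/2.0/policies/clusters/create",
        headers={"Authorization": "Bearer <token>"},
        json={"name": "spot-single-node-job",
              "definition": json.dumps(definition)},
    )
    resp.raise_for_status()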

Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

Hi @Avkash Kana, just a friendly follow-up. Did any of the responses help you resolve your question? If so, please mark that answer as best. Otherwise, please let us know if you still need help.

3 More Replies
databicky
by Contributor II
  • 1866 Views
  • 4 replies
  • 0 kudos

How to optimize the runtime on a 10.4 cluster

I am loading 1 billion rows from a Spark DataFrame into a target table. On a 7.3 cluster it took 3 hours to complete, but after migrating to a 10.4 cluster it takes 8 hours. How can I reduce the duration?

Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

Hi @Mohammed sadamusean, could you provide more details on what you are doing? What types of transformations/actions are you running? What are your source and sink? Batch or streaming? All that information will help.

3 More Replies
RafikiT97
by New Contributor
  • 3292 Views
  • 3 replies
  • 0 kudos

Query Databricks from Power BI with Row Level Security

I am trying to apply RLS to the solution, but Power BI only connects to Databricks using a token, which can't be used with Databricks groups. Is there any other way to apply row-level security when using Power BI?
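
One approach sometimes used for this (a sketch, not a confirmed fix for the token limitation) is a dynamic view that filters rows with current_user() or is_member(), so whichever identity the connection presents sees only its own rows. The view, table, and column names here are illustrative:

    spark.sql("""
        CREATE OR REPLACE VIEW sales_rls AS
        SELECT *
        FROM sales
        WHERE owner_email = current_user()   -- per-user row filter
           OR is_member('admins')            -- group-based override
    """)

Note this only helps if Power BI connects with per-user credentials (e.g. SSO) rather than one shared token, which is exactly the constraint the post describes.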

Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

Hi @Daniel Gomes, just a friendly follow-up. Did any of the responses help you resolve your question? If so, please mark that answer as best. Otherwise, please let us know if you still need help.

2 More Replies
farooqurrehman
by New Contributor
  • 1999 Views
  • 3 replies
  • 2 kudos

Unable to connect/read files from ADLS Gen2 using account key

It gives the error [RequestId=5e57b66f-b69f-4e8b-8706-3fe5baeb77a0 ErrorClass=METASTORE_DOES_NOT_EXIST] No metastore assigned for the current workspace when using the following code: spark.conf.set("fs.azure.account.key.mystorageaccount.dfs.core.windows.net", ...
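
For reference, a minimal sketch of the account-key pattern itself (container, path, and secret scope names are placeholders). The METASTORE_DOES_NOT_EXIST error, though, points at the workspace's Unity Catalog metastore assignment rather than this snippet:

    # Set the account key for the storage account (ideally from a secret scope)
    spark.conf.set(
        "fs.azure.account.key.mystorageaccount.dfs.core.windows.net",
        dbutils.secrets.get(scope="my-scope", key="storage-account-key"))

    df = spark.read.csv(
        "abfss://mycontainer@mystorageaccount.dfs.core.windows.net/path/data.csv",
        header=True)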

Latest Reply
jose_gonzalez
Databricks Employee
  • 2 kudos

Hi @Farooq ur rehman, just a friendly follow-up. Did any of the responses help you resolve your question? If so, please mark that answer as best. Otherwise, please let us know if you still need help.

2 More Replies
SRK
by Contributor III
  • 2770 Views
  • 5 replies
  • 0 kudos

Delta Live Tables data quality rules application.

I have a requirement where I need to apply an inverse DQ rule on a table to track the invalid data, for which I can use the following approach:

    import dlt

    rules = {}
    quarantine_rules = {}
    rules["valid_website"] = "(Website IS NOT NULL)"
    rules["valid_locatio...
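
A minimal sketch completing that quarantine pattern (the rule names, source table, and table names are illustrative): valid rows are kept by the expectations, while a second table negates the combined rules to capture everything that fails:

    import dlt

    rules = {
        "valid_website": "(Website IS NOT NULL)",
        "valid_location": "(Location IS NOT NULL)",
    }
    # Inverse rule: a row failing any expectation lands in quarantine
    quarantine_rules = {
        "invalid_record": "NOT({0})".format(" AND ".join(rules.values()))
    }

    @dlt.table
    @dlt.expect_all_or_drop(rules)
    def clean_data():
        return spark.read.table("raw_data")

    @dlt.table
    @dlt.expect_all_or_drop(quarantine_rules)
    def quarantine_data():
        return spark.read.table("raw_data")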

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 0 kudos

You can get additional info from the DLT event log, which is stored in Delta, so you can load it as a table: https://docs.databricks.com/workflows/delta-live-tables/delta-live-tables-event-log.html#data-quality

4 More Replies
Hubert-Dudek
by Esteemed Contributor III
  • 924 Views
  • 1 reply
  • 10 kudos

Since Databricks Runtime 12.1, "WHEN NOT MATCHED BY SOURCE" has been part of the MERGE syntax. For example, using that option, we can quickly delete ...

Since Databricks Runtime 12.1, "WHEN NOT MATCHED BY SOURCE" has been part of the MERGE syntax. For example, using that option, we can quickly delete all target rows that don't match any source row.
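
A quick sketch of that syntax with illustrative table names, runnable from a notebook on DBR 12.1+:

    spark.sql("""
        MERGE INTO target t
        USING source s
          ON t.id = s.id
        WHEN MATCHED THEN UPDATE SET *
        WHEN NOT MATCHED THEN INSERT *
        -- target rows with no matching source row are removed
        WHEN NOT MATCHED BY SOURCE THEN DELETE
    """)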

Latest Reply
jose_gonzalez
Databricks Employee
  • 10 kudos

Thank you for sharing @Hubert Dudek​ 

