Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Jotav93 (New Contributor II)
  • 2560 Views
  • 2 replies
  • 1 kudos

Move a Delta table from a non-UC metastore to a UC metastore, preserving history

Hi, I am using Azure Databricks and we recently enabled UC in our workspace. We have some tables in our non-UC metastore that we want to move to a UC-enabled metastore. Is there any way we can move these tables without losing the Delta table history...

Data Engineering
delta
unity
Latest Reply
ThomazRossito (Contributor)
  • 1 kudos

Hello, it is possible to get the expected result with dbutils.fs.cp("origin location", "destination location", True) and then create the table with LOCATION pointing at the destination. Hope this helps.

1 More Reply
Dp15 (Contributor)
  • 1762 Views
  • 1 reply
  • 1 kudos

Using a UDF in an INSERT command

Hi, I am trying to use a UDF to get the last day of the month and use the boolean result of the function in an INSERT command. Please find herewith the function and my query. Function: import calendar; from datetime import datetime, date, timedelta; def...

Latest Reply
Dp15 (Contributor)
  • 1 kudos

Thank you @Retired_mod for your detailed explanation

Kroy (Contributor)
  • 15215 Views
  • 7 replies
  • 1 kudos

Resolved! What is the difference between streaming and a streaming live table?

Can anyone explain in layman's terms what the difference is between streaming and a streaming live table?

Latest Reply
CharlesReily (New Contributor III)
  • 1 kudos

Streaming, in a broad sense, refers to the continuous flow of data over a network. It allows you to watch or listen to content in real time without having to download the entire file first. A "Streaming Live Table" might refer to a specific type of ...

6 More Replies
kiko_roy (Contributor)
  • 10503 Views
  • 3 replies
  • 1 kudos

Permission error loading DataFrame from Azure Unity Catalog to GCS bucket

I am creating a DataFrame by reading a table's data residing in an Azure-backed Unity Catalog. I need to write the df or file to a GCS bucket. I have configured the Spark cluster config using the GCP service account JSON values. On running: df1.write.for...

Data Engineering
GCS bucket
permission error
Latest Reply
ruloweb (New Contributor II)
  • 1 kudos

Hi, is there any Terraform resource to apply this GRANT, or does this always have to be done manually?

2 More Replies
leireroman (New Contributor III)
  • 1185 Views
  • 1 reply
  • 0 kudos

Bootstrap Timeout during job cluster start

My job was not able to start because I got this problem in the job cluster. This job is running on an Azure Databricks workspace that has been deployed for almost a year, and I have not had this error before. It is deployed in North Europe. After getting...

(screenshot attached)
Latest Reply
lukasjh (New Contributor II)
  • 0 kudos

We have the same problem randomly occurring since yesterday in two workspaces. The cluster started fine this morning at 08:00 but failed again from around 09:00 onward.

Anske (New Contributor III)
  • 4820 Views
  • 1 reply
  • 1 kudos

One-time backfill for DLT streaming table before apply_changes

Hi, absolute Databricks noob here, but I'm trying to set up a DLT pipeline that processes CDC records from an external SQL Server instance to create a mirrored table in my Databricks delta lakehouse. For this, I need to do some initial one-time backfi...

Data Engineering
Delta Live Tables
Latest Reply
Anske (New Contributor III)
  • 1 kudos

So since nobody responded, I decided to try my own suggestion and hack the snapshot data into the table that gathers the change data capture. After some trial and error, I ended up with the notebook as attached. The notebook first creates 2 DLT tables (lookup...

cubanDataDude (New Contributor II)
  • 1225 Views
  • 1 reply
  • 1 kudos

Job Claiming NotebookNotFound Incorrectly (seemingly)

I have the code captured below in the screenshot. When I run this individually it works just fine; when a job runs it, it fails out with 'ResourceNotFound'. Not sure what the issue is... Checked 'main' branch, which is where this job is pulling f...

Latest Reply
cubanDataDude (New Contributor II)
  • 1 kudos

Figured it out: ecw_staging_nb_List = ['nb_UPSERT_stg_ecw_insurance', 'nb_UPSERT_stg_ecw_facilitygroups'] works just fine.

jp_allard (New Contributor)
  • 2068 Views
  • 0 replies
  • 0 kudos

Selective Overwrite to a Unity Catalog Table

I have been able to perform a selective overwrite using replaceWhere to a hive_metastore table, but when I use the same code for the same table in a Unity Catalog, no data is written. Has anyone else had this issue, or are there common mistakes that ar...

dannythermadom (New Contributor III)
  • 6352 Views
  • 6 replies
  • 7 kudos

dbutils.notebook.run command not working with /Repos/

I have two GitHub repos configured in the Databricks Repos folder. repo_1 is run using a job, and repo_2 is run/called from repo_1 using the dbutils.notebook.run command: dbutils.notebook.run("/Repos/repo_2/notebooks/notebook", 0, args). I am getting the follo...

Latest Reply
cubanDataDude (New Contributor II)
  • 7 kudos

I am having a similar issue... ecw_staging_nb_List = ['/Workspace/Repos/PRIMARY/UVVC_DATABRICKS_EDW/silver/nb_UPSERT_stg_ecw_insurance', '/Repos/PRIMARY/UVVC_DATABRICKS_EDW/silver/nb_UPSERT_stg_ecw_facilitygroups']. Adding workspace d...

5 More Replies
Jennifer (New Contributor III)
  • 903 Views
  • 0 replies
  • 0 kudos

Optimization failed for TimestampNTZ

We have a table using the TimestampNTZ type for a timestamp column, which is also a clustering key for this table using liquid clustering. I ran OPTIMIZE <table-name>; it failed with error: Unsupported datatype 'TimestampNTZType'. But the failed optimization also broke ...

pragarwal (New Contributor II)
  • 3552 Views
  • 2 replies
  • 0 kudos

Export Users and Groups from Unity Catalog

Hi, I am trying to export the list of users and groups from Unity Catalog through the Databricks workspace, but I am seeing only the users/groups created inside the workspace instead of the groups and users coming through SCIM in Unity Catalog. How can I ge...

Latest Reply
Walter_C (Databricks Employee)
  • 0 kudos

Hello, when you refer to the users and groups in Unity Catalog, do you refer to the ones created at the account level? If this is the case, you need to run the API call at the account level and not the workspace level; you can see the API doc for account le...

1 More Reply
Jorge3 (New Contributor III)
  • 2689 Views
  • 1 reply
  • 0 kudos

Trigger a job on file update

I'm using Auto Loader to process any new file or update that arrives in my landing area, and I schedule the job using Databricks Workflows to trigger on file arrival. The issue is that the trigger only executes when new files arrive, not when an existing ...

Latest Reply
Ivan_Donev (New Contributor III)
  • 0 kudos

I don't think you can effectively achieve your goal. While it's theoretically somewhat possible, Databricks documentation says there is no guarantee of correctness: see the Auto Loader FAQ (Databricks on AWS).

  • 0 kudos
Anonymous (Not applicable)
  • 8564 Views
  • 2 replies
  • 1 kudos

When reading a CSV file with spark.read, the data is not loading in the appropriate column while pas...

I am trying to read a CSV file from a storage location using the spark.read function. Also, I am explicitly passing the schema to the function. However, the data is not loading in the proper columns of the DataFrame. Following are the code details: from pyspark....

Latest Reply
sai_sathya (New Contributor III)
  • 1 kudos

Hi, I would suggest approaching it as suggested by Thomaz Rossito, but maybe you can give it a try by swapping the struct field order, like the following: schema = StructType([StructField('DA_RATE', DateType(), True), StructField('CURNCY_F', StringTy...

1 More Reply
