Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Gilg
by Contributor II
  • 2955 Views
  • 1 reply
  • 0 kudos

Move files

Hi, I am using DLT with Auto Loader. The DLT pipeline is running in continuous mode, and Auto Loader is in directory listing mode (the default). Question: I want to move files that have been processed by the DLT pipeline to another folder (archived) and am planning to have another no...

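A minimal sketch of one way to approach this, not a confirmed answer from the thread: query Auto Loader's checkpoint for files it has already ingested via the cloud_files_state function, then move them in a separate job. The checkpoint and folder paths below are hypothetical.

```python
# Sketch: archive files Auto Loader has already ingested. Paths are hypothetical;
# for DLT, the checkpoint lives under the pipeline's storage location.
processed = spark.sql(
    "SELECT path FROM cloud_files_state('/pipelines/<pipeline-id>/checkpoints/<table>')"
)

for row in processed.collect():
    src = row.path
    dst = src.replace("/landing/", "/archived/")  # hypothetical folder layout
    dbutils.fs.mv(src, dst)  # move only files old enough that the stream is done with them
```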
Brad
by Contributor II
  • 4391 Views
  • 1 reply
  • 0 kudos

What is the behavior when the merge key is not unique?

Hi, when using the MERGE statement, if the merge key is not unique in both source and target, it throws an error. If the merge key is unique in the source but not unique in the target, should WHEN MATCHED THEN DELETE/UPDATE work or not? For example, the merge key is id....

Latest Reply
Brad
Contributor II
  • 0 kudos

Cool, this is what I tested out. Great to get it confirmed. Thanks. BTW, https://medium.com/@ritik20023/delta-lake-upserting-without-primary-key-f4a931576b0 has a workaround that can fix the merge with duplicate merge keys in both source and target.

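For reference, a minimal sketch of the usual guard against this error: deduplicate the source on the merge key before merging, so each target row matches at most one source row. Table and column names are hypothetical.

```python
from delta.tables import DeltaTable

# Hypothetical names: target table "main.default.target", source view "source_updates".
target = DeltaTable.forName(spark, "main.default.target")
src = spark.table("source_updates").dropDuplicates(["id"])  # one source row per merge key

(target.alias("t")
 .merge(src.alias("s"), "t.id = s.id")
 .whenMatchedUpdateAll()      # duplicate keys in the target are each matched and updated
 .whenNotMatchedInsertAll()
 .execute())
```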
Erik_L
by Contributor II
  • 2121 Views
  • 1 reply
  • 0 kudos

Visualizations failing to show

I have a SQL query that generates a table, and I created a visualization from that table with the UI. I then have a widget that updates a value used in the query and re-runs the SQL, but then the visualization shows nothing; it reports that there is "1 row," but if...

[screenshot attached]
397973
by New Contributor III
  • 3675 Views
  • 3 replies
  • 0 kudos

Having trouble installing my own Python wheel?

I want to install my own Python wheel package on a cluster but can't get it working. I tried two ways. I followed these steps: https://docs.databricks.com/en/workflows/jobs/how-to/use-python-wheels-in-workflows.html#:~:text=March%2025%2C%202024,code%...

Labels: Data Engineering, cluster, Notebook
Latest Reply
shan_chandra
Databricks Employee
  • 0 kudos

@397973 - Once you uploaded the .whl file, did you have a chance to list the file manually in the notebook? Also, did you have a chance to move the .whl file to /Volumes?

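A hedged sketch of the flow the reply suggests: verify the wheel is readable from the notebook, then install it. The volume path and wheel name are hypothetical.

```python
# Cell 1: confirm the wheel exists where you expect it (hypothetical path)
display(dbutils.fs.ls("/Volumes/main/default/libs/"))

# Cell 2: install it into the notebook's Python environment
# (%pip must be the first line of its own cell)
%pip install /Volumes/main/default/libs/my_package-0.1.0-py3-none-any.whl
```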
SyedSaqib
by New Contributor II
  • 3415 Views
  • 1 reply
  • 0 kudos

Delta Live Table : [TABLE_OR_VIEW_ALREADY_EXISTS] Cannot create table or view

Hi, I have a Delta Live Tables workflow with storage enabled for cloud storage to a blob store. Syntax of the bronze table in the notebook: @dlt.table(spark_conf = {"spark.databricks.delta.schema.autoMerge.enabled": "true"}, table_properties = {"quality": "bron...

Latest Reply
SyedSaqib
New Contributor II
  • 0 kudos

Hi Kaniz, thanks for replying back. I am using Python for Delta Live Tables creation, so how can I set these configurations? "When creating the table, add the IF NOT EXISTS clause to tolerate pre-existing objects. Consider using the OR REFRESH clause." Answe...

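Since IF NOT EXISTS and OR REFRESH are SQL clauses, a hedged Python equivalent is sketched below: in Python, the @dlt.table decorator already carries CREATE OR REFRESH semantics, and one common cause of TABLE_OR_VIEW_ALREADY_EXISTS (an assumption here, not confirmed in the thread) is two definitions resolving to the same table name. Names and paths are hypothetical.

```python
import dlt  # available inside a Delta Live Tables pipeline

@dlt.table(
    name="bronze_orders",  # give each table an explicit, unique name
    spark_conf={"spark.databricks.delta.schema.autoMerge.enabled": "true"},
    table_properties={"quality": "bronze"},
)
def bronze_orders():
    # hypothetical cloud-storage source
    return (spark.readStream.format("cloudFiles")
            .option("cloudFiles.format", "json")
            .load("/Volumes/main/default/raw/orders"))
```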
Henrique_Lino
by New Contributor II
  • 5839 Views
  • 6 replies
  • 0 kudos

Value is null after loading a saved df when using a specific type in the schema

I am facing an issue when using Databricks: when I set a specific type in my schema and read a JSON file, its values are fine, but after saving my df and loading it again, the value is gone. I have this sample code that shows the issue: from pyspark.sql.typ...

Latest Reply
Lakshay
Databricks Employee
  • 0 kudos

@Henrique_Lino , Where are you saving your df?

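A minimal repro sketch of the round trip described above, with hypothetical paths; one common cause of silent nulls (an assumption, not confirmed in the thread) is a schema type that does not match the JSON value, for example an integer type for a value that only fits in a long.

```python
from pyspark.sql.types import StructType, StructField, LongType, StringType

# A type that doesn't match the JSON data (e.g. IntegerType for a long value)
# yields nulls at parse time rather than an error.
schema = StructType([
    StructField("id", LongType(), True),
    StructField("name", StringType(), True),
])

df = spark.read.schema(schema).json("/tmp/sample.json")   # hypothetical path
df.write.mode("overwrite").format("delta").save("/tmp/sample_delta")

reloaded = spark.read.format("delta").load("/tmp/sample_delta")
reloaded.show()  # compare against df.show() to see where values disappear
```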
Anandsingh
by New Contributor
  • 2155 Views
  • 1 reply
  • 0 kudos

Writing to multiple files/tables from data held within a single file through autoloader

I have a requirement to read and parse JSON files using Auto Loader where the incoming JSON file has multiple sub-entities. Each sub-entity needs to go into its own Delta table. Alternatively, we can write each entity's data to individual files. We can use D...

Latest Reply
Lakshay
Databricks Employee
  • 0 kudos

I think using DLT's medallion architecture should be helpful in this scenario. You can write all the incoming data to one bronze table and one silver table. And you can have multiple gold tables based on the value of the sub-entities.

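A hedged sketch of the bronze-to-gold fan-out the reply describes, with hypothetical entity names, paths, and discriminator column:

```python
import dlt
from pyspark.sql import functions as F

@dlt.table(name="bronze_raw")
def bronze_raw():
    # hypothetical landing folder of multi-entity JSON files
    return (spark.readStream.format("cloudFiles")
            .option("cloudFiles.format", "json")
            .load("/Volumes/main/default/landing"))

# One gold table per sub-entity, generated in a loop.
for entity in ["orders", "customers", "payments"]:  # hypothetical entities
    @dlt.table(name=f"gold_{entity}")
    def build(entity=entity):  # default arg binds the loop variable
        return dlt.read_stream("bronze_raw").where(F.col("entity_type") == entity)
```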
Kavi_007
by New Contributor III
  • 8937 Views
  • 6 replies
  • 1 kudos

Resolved! Seeing history even after vacuuming the Delta table

Hi, I'm trying to run VACUUM on a Delta table within a Unity Catalog. The default retention is 7 days. Though I vacuum the table, I'm still able to see history beyond 7 days. I tried restarting the cluster but it's still not working. What would be the fix?...

Latest Reply
Kavi_007
New Contributor III
  • 1 kudos

No, that's wrong. VACUUM removes all files from the table directory that are not managed by Delta, as well as data files that are no longer in the latest state of the transaction log for the table and are older than a retention threshold. VACUUM - Azu...

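For reference, a sketch (with a hypothetical table name) of why history outlives VACUUM: DESCRIBE HISTORY reads the transaction log, whose retention is governed by delta.logRetentionDuration, while VACUUM only removes data files older than delta.deletedFileRetentionDuration.

```python
# Hypothetical table name throughout.
spark.sql("""
    ALTER TABLE main.default.events SET TBLPROPERTIES (
        'delta.deletedFileRetentionDuration' = 'interval 7 days',
        'delta.logRetentionDuration'         = 'interval 7 days'
    )
""")

spark.sql("VACUUM main.default.events")  # removes stale data files only
# History entries persist until log cleanup, independent of VACUUM:
spark.sql("DESCRIBE HISTORY main.default.events").show()
```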
Jon
by New Contributor II
  • 8737 Views
  • 4 replies
  • 5 kudos

IP address fix

How can I fix the IP address of my Azure cluster so that I can whitelist it to run my job daily from my Python notebook? Or how can I find out the IP address to perform whitelisting? Thanks

Latest Reply
-werners-
Esteemed Contributor III
  • 5 kudos

Depends on the scenario. You could expose a single IP address to the external internet, but Databricks itself will always use many addresses.

DE_K
by New Contributor II
  • 4417 Views
  • 4 replies
  • 0 kudos

@dlt.table throws error AttributeError: module 'dlt' has no attribute 'table'

Hi everyone, I am new to DLT and am trying to run the code below to create tables dynamically, but I get the error "AttributeError: module 'dlt' has no attribute 'table'". Code snippet: def generate_tables(model_name   try:    spark.sql("select * from dlt.{0}"....

[screenshot attached]
Labels: Data Engineering, dataengineering, datapipeline, deltalivetables, dlt
Latest Reply
YuliyanBogdanov
New Contributor III
  • 0 kudos

Thank you, @DE_K. I see your point. I believe you are using @dlt.table instead of @dlt.create_table to begin with, since you want the table to be created and not to define an existing one. (https://community.databricks.com/t5/data-engineering/differenc...

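A quick diagnostic sketch for this error (an assumption, not confirmed in the thread): import dlt may resolve to something other than the Databricks module, such as the unrelated PyPI dlt package or a local file named dlt.py, and the attribute is only available when running inside a DLT pipeline.

```python
import dlt

print(dlt.__file__)            # check which module actually got imported
print(hasattr(dlt, "table"))   # True only for the Databricks DLT module,
                               # and only when run as part of a DLT pipeline
```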
duliu
by New Contributor II
  • 2200 Views
  • 1 reply
  • 0 kudos

Spark Driver failed due to DRIVER_UNAVAILABLE but not due to memory pressure

Hello, I have a job cluster running a streaming job, and it unexpectedly failed on 19th March due to DRIVER_UNAVAILABLE (Request timed out, Driver is temporarily unavailable) in the event log. This is the run: https://atlassian-discover.cloud.databricks.com/...

[screenshots attached]
Latest Reply
Ayushi_Suthar
Databricks Employee
  • 0 kudos

Hi @duliu, hope you are doing well! Would you kindly see if the KB article below addresses your problem? https://kb.databricks.com/en_US/jobs/driver-unavailable Please let me know if this helps, and leave a like if this information is useful. Followu...

DumbBeaver
by New Contributor II
  • 1728 Views
  • 0 replies
  • 0 kudos

Issue while writing data to Unity Catalog using JDBC

While writing data to a pre-existing table in Unity Catalog using JDBC, it just writes the delta of the data. Driver used: com.databricks:databricks-jdbc:2.6.36. Let's say the table has rows (columns a and b): (1, 2) and (3, 4), and I am appendi...

Labels: Data Engineering, JDBC, spark, Unity Catalog
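Since the post is truncated, here is only a hedged sketch of the append path it seems to describe, written with Spark's generic JDBC writer; the URL, httpPath, and table name are placeholders, and the driver class is the one shipped with the Databricks JDBC driver.

```python
# Placeholders throughout; adjust to your workspace.
(df.write.format("jdbc")
   .option("url", "jdbc:databricks://<workspace-host>:443/default;"
                  "transportMode=http;ssl=1;AuthMech=3;"
                  "httpPath=<http-path>;UID=token;PWD=<personal-access-token>")
   .option("driver", "com.databricks.client.jdbc.Driver")
   .option("dbtable", "main.default.target")
   .mode("append")
   .save())
```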
Leszek
by Contributor
  • 1330 Views
  • 0 replies
  • 0 kudos

[Delta Sharing - open sharing protocol] Token rotation

Hi, do you have any experience with rotating tokens in Delta Sharing automatically? There is an option to do that using the CLI (Create and manage data recipients for Delta Sharing | Databricks on AWS). But what to do next? Sending the new link to the token via...

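A hedged sketch of automating the rotation with the Unity Catalog recipients REST API; the workspace URL, token, recipient name, and the exact response shape are assumptions to verify against the docs.

```python
import requests

host = "https://<workspace>.cloud.databricks.com"   # placeholder
resp = requests.post(
    f"{host}/api/2.1/unity-catalog/recipients/my_recipient/rotate-token",
    headers={"Authorization": "Bearer <personal-access-token>"},
    json={"existing_token_expire_in_seconds": 3600},  # keep the old token valid for 1h
)
resp.raise_for_status()
# The response is assumed to include the new activation link to send out:
print(resp.json().get("tokens"))
```

Even with the rotation automated, the new activation link still has to be delivered to the recipient out of band, which is the part the post asks about.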
