Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Nick_Pacey
by New Contributor III
  • 394 Views
  • 0 replies
  • 0 kudos

Foreign Catalog error connecting to SQL Server 2008 R2

Hi, Is there a limitation or known issue when creating a foreign catalog to a SQL Server 2008 R2? We are successfully able to connect to this SQL Server through a JDBC connection string. To make this work, we have to switch the Java encrypt flag to fal...

messiah
by New Contributor II
  • 2111 Views
  • 5 replies
  • 0 kudos

How to Create Iceberg Tables in Databricks Using Parquet Files from S3?

Hi Databricks Community, I’m trying to create Apache Iceberg tables in Databricks using Parquet files stored in an S3 bucket. I found a guide from Dremio, but I’m unable to create Iceberg tables using that method. Here’s what I need: Read Parquet files ...

Latest Reply
Raashid_Khan
New Contributor II
  • 0 kudos

How do I create or insert into Databricks tables in Iceberg format? I have Iceberg Parquet files in GCS and want to store them as Iceberg tables in Databricks catalogs.

4 More Replies
MuesLee
by New Contributor
  • 314 Views
  • 1 replies
  • 0 kudos

Merge rewrites many unmodified files

Hello. I want to do a merge on a subset of my Delta table partitions to do incremental upserts to keep two tables in sync. I do not use a whenNotMatchedBySource statement to clean up stale rows in my target because of this GitHub Issue. Because of that...

Latest Reply
Brahmareddy
Honored Contributor III
  • 0 kudos

Hi MuesLee, How are you doing today? As I understand it, yes, your understanding is mostly correct. The reason even unchanged partitions are being rewritten is likely because of how Delta Lake’s merge operation handles partition pruning and upda...

code_vibe
by New Contributor
  • 325 Views
  • 1 replies
  • 0 kudos

Delta lake federated table not working as expected

I’m facing an issue while working with federated Redshift tables in Databricks, and I’m hoping someone here can help me out. I have a source table (material) in Redshift that I’m querying through the Delta lake federation in Databricks. When I run the ...

Latest Reply
Brahmareddy
Honored Contributor III
  • 0 kudos

Hi code_vibe, How are you doing today? As I understand it, the issue might be due to predicate pushdown not happening when querying the federated Redshift table in Databricks. Predicate pushdown helps filter data at the source (Red...

Jorge3
by New Contributor III
  • 350 Views
  • 1 replies
  • 0 kudos

Too many small files in the "landing area"

Hello everyone, I’m currently working on a setup where my unprocessed real-time data arrives as .json files in Azure Data Lake Storage (ADLS). Every x minutes, I use Databricks Autoloader to pick up the new data, run my ETL transformations, and store ...

Latest Reply
koji_kawamura
Databricks Employee
  • 0 kudos

Hi @Jorge3, since you mentioned the "cloudFiles.useNotifications" option, I assume you know about Auto Loader's file detection modes. File notification mode should be the best solution to your situation. Have you tried it already and encountered an issue? If so, please let us kn...
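For readers landing on this thread, here is a minimal sketch of what enabling file notification mode might look like. The helper names, the landing path, and the schema location are hypothetical; only the `cloudFiles.*` option keys come from Auto Loader itself, and the cloud-side permissions that notification mode needs are not shown.

```python
def autoloader_options(schema_location: str, use_notifications: bool = True) -> dict:
    """Build Auto Loader options for JSON files in a landing area.

    `schema_location` is a hypothetical path where Auto Loader tracks the
    inferred schema; adjust it to your own storage layout.
    """
    return {
        "cloudFiles.format": "json",
        # "true" switches from directory listing to file notification mode.
        "cloudFiles.useNotifications": str(use_notifications).lower(),
        "cloudFiles.schemaLocation": schema_location,
    }


def read_landing(spark, landing_path: str, schema_location: str):
    """Return a streaming DataFrame over the landing area.

    Assumes `spark` is an active SparkSession on a Databricks cluster.
    """
    return (
        spark.readStream.format("cloudFiles")
        .options(**autoloader_options(schema_location))
        .load(landing_path)
    )
```

The option builder is kept separate from the reader so the settings can be inspected or unit tested without a cluster.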

ZacayDaushin
by New Contributor
  • 2003 Views
  • 2 replies
  • 0 kudos

How to access system.access.table_lineage

I'm trying to select from system.access.table_lineage, but I can't see the table. What permission do I need?

Latest Reply
Nivethan_Venkat
Contributor
  • 0 kudos

Hi @ZacayDaushin, To query a table in the system catalog, you need SELECT permission on that table to run the query and see the results. Best Regards, Nivethan V
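A rough sketch of the grants an admin might run, generated from Python so they can be passed to `spark.sql`. This assumes the `system.access` schema is already enabled for the metastore and that schema-level USE SCHEMA and SELECT grants are the intended approach; the principal name is hypothetical.

```python
def system_table_grants(principal: str, schema: str = "system.access") -> list[str]:
    """Return GRANT statements giving `principal` read access to a system schema.

    Assumption: schema-level grants are used rather than per-table grants.
    """
    return [
        f"GRANT USE SCHEMA ON SCHEMA {schema} TO `{principal}`",
        f"GRANT SELECT ON SCHEMA {schema} TO `{principal}`",
    ]


# Hypothetical usage in a notebook where `spark` is available:
# for stmt in system_table_grants("analyst@example.com"):
#     spark.sql(stmt)
```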

1 More Replies
Kayla
by Valued Contributor II
  • 456 Views
  • 4 replies
  • 3 kudos

Unity Catalog "Sync" Question

I'm having a little trouble fully following the documentation on the SYNC command. I have a table in hive_metastore that still needs to be able to be updated daily for the next few months, but I also need to define a view in Unity Catalog based on tha...

Latest Reply
Nivethan_Venkat
Contributor
  • 3 kudos

Hi @Kayla, The SYNC command syncs your Hive EXTERNAL table to your Unity Catalog namespace. If the table is external, the UC table will be in sync with your external location. If it is a Hive managed table, you can't use the SYNC command to have your mana...
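As an illustration of the external-table case, the sketch below builds a `SYNC TABLE ... FROM ...` statement and defaults to `DRY RUN` so nothing changes until the result has been reviewed. The three-level table names are hypothetical.

```python
def sync_table_sql(target: str, source: str, dry_run: bool = True) -> str:
    """Build a SYNC TABLE statement from a hive_metastore table to Unity Catalog.

    With dry_run=True the statement only reports what would happen.
    """
    stmt = f"SYNC TABLE {target} FROM {source}"
    return stmt + " DRY RUN" if dry_run else stmt


# Hypothetical usage in a notebook where `spark` is available:
# spark.sql(sync_table_sql("main.sales.orders", "hive_metastore.sales.orders")).show()
```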

3 More Replies
the_dude
by New Contributor II
  • 656 Views
  • 1 replies
  • 0 kudos

Impossibility to have multiple versions of the same Python package installed

Hello, We package our Spark jobs + utilities in a custom package to be used in wheel tasks in Databricks. In my opinion, having several versions of this job (say "production" and "dev") run on the same cluster against different versions of this custo...

Latest Reply
the_dude
New Contributor II
  • 0 kudos

If someone comes across this post: as per the documentation, library/package installation can be notebook-scoped. Thus, to overcome the limitation described in the initial post, we are instead experimenting with notebook tasks whose only respons...

Phani1
by Valued Contributor II
  • 374 Views
  • 1 replies
  • 0 kudos

Reading Multiple Data Formats

Hi All, I'm looking to develop generic code that can read multiple data formats, such as Parquet, Delta, and Iceberg, and save the data as Delta. Can you provide some insights or guidance on how to achieve this? Regards, Phani

Latest Reply
Erika_Fonseca
Databricks Employee
  • 0 kudos

Take a look at these two projects that follow a metadata-driven approach: Lakehouse Engine and DLT Meta.
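Independently of those projects, a bare-bones version of the idea could look like the sketch below: validate the source format, read it with the generic DataFrame reader, and write the result as Delta. The function names and paths are hypothetical, and reading `iceberg` this way assumes an Iceberg-capable runtime or library is available on the cluster.

```python
SUPPORTED_FORMATS = {"parquet", "delta", "iceberg"}


def normalize_format(source_format: str) -> str:
    """Lower-case the format name and reject anything not explicitly supported."""
    fmt = source_format.strip().lower()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported source format: {source_format!r}")
    return fmt


def to_delta(spark, source_path: str, source_format: str,
             target_path: str, mode: str = "append"):
    """Read `source_path` in the given format and write it out as Delta.

    Assumes `spark` is an active SparkSession with Delta Lake available.
    """
    df = spark.read.format(normalize_format(source_format)).load(source_path)
    df.write.format("delta").mode(mode).save(target_path)
    return df
```

A metadata-driven setup would then loop over a list of (path, format, target) entries and call `to_delta` for each.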

zmsoft
by Contributor
  • 544 Views
  • 1 replies
  • 0 kudos

How to copy file from UC volume to external location folder

Hi there, How do I copy a file from a UC volume to an external location folder? Thanks & Regards, zmsoft

Latest Reply
Advika_
Databricks Employee
  • 0 kudos

Hello @zmsoft! To copy a file from a UC volume to an external location, you can use: dbutils.fs.cp("UC_volume_path", "external_location_path"). Ensure the external location is preconfigured in Unity Catalog and you have the necessary permission...
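A slightly fuller sketch of that call, assuming the usual `/Volumes/<catalog>/<schema>/<volume>/...` path layout for volumes. The helper names and the example catalog, schema, volume, and storage URL are hypothetical; `fs` is expected to be `dbutils.fs` when run in a Databricks notebook.

```python
def volume_path(catalog: str, schema: str, volume: str, relative_path: str) -> str:
    """Build the path of a file inside a Unity Catalog volume."""
    return f"/Volumes/{catalog}/{schema}/{volume}/{relative_path.lstrip('/')}"


def copy_from_volume(fs, src: str, dst: str) -> str:
    """Copy one file using an object that exposes cp(src, dst), e.g. dbutils.fs."""
    fs.cp(src, dst)
    return dst


# Hypothetical usage in a notebook:
# src = volume_path("main", "raw", "landing", "exports/report.csv")
# dst = "abfss://container@account.dfs.core.windows.net/exports/report.csv"
# copy_from_volume(dbutils.fs, src, dst)
```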

Eduard
by New Contributor II
  • 115582 Views
  • 3 replies
  • 1 kudos

Cluster xxxxxxx was terminated during the run.

Hello, I have a problem with the autoscaling of a cluster. Every time the autoscaling is activated I get this error. Does anyone have any idea why this could be? "Cluster xxxxxxx was terminated during the run (cluster state message: Lost communication ...

Latest Reply
louisgarza
New Contributor II
  • 1 kudos

Hello Databricks Community, The error message indicates that the driver node was lost, which can happen due to network issues or malfunctioning instances. Here are a few possible reasons and solutions: Instance Instability: If your cloud provider has u...

2 More Replies
Fatimah-Tariq
by New Contributor III
  • 335 Views
  • 4 replies
  • 0 kudos

Schema update Issue in DLT

I have a pipeline in Databricks with this flow: SQL Server (Source) -> Staging (Parquet) -> Bronze (DLT) -> Silver (DLT) -> Gold (DLT). The pipeline has been running smoothly for months, but recently there was a schema update at my source level and one o...

Latest Reply
Fatimah-Tariq
New Contributor III
  • 0 kudos

Hi @Alberto_Umana, is there any word on how to fix my data and bring all the records back to the pipeline schema?

3 More Replies
mjar
by New Contributor III
  • 4226 Views
  • 9 replies
  • 3 kudos

ModuleNotFoundError when using foreachBatch on runtime 14 with Unity

Recently we ran into an issue using foreachBatch after upgrading our Databricks cluster on Azure to runtime version 14 with Spark 3.5, Shared access mode, and Unity Catalog. The issue manifested as a ModuleNotFoundError being throw...

Latest Reply
Abond
New Contributor II
  • 3 kudos

Hi, Any news regarding this issue? I have the same one on a job cluster with 15.4 LTS when using asset bundles with foreachBatch in a .py file that is called from a notebook. When the same code is located in the notebook, it works fine. (prep_silver_df(bronze_ta...

8 More Replies
cpayne_vax
by New Contributor III
  • 20035 Views
  • 15 replies
  • 9 kudos

Resolved! Delta Live Tables: dynamic schema

Does anyone know if there's a way to specify an alternate Unity Catalog schema in a DLT workflow using the @dlt.table syntax? In my case, I’m looping through folders in Azure Data Lake Storage to ingest data. I’d like those folders to get created in different...

Latest Reply
abhishek_02
New Contributor II
  • 9 kudos

Hi @kuldeep-in, Could you please tell me exactly where to disable the DPM option? I was not able to locate it in the pipeline settings or the Databricks settings. Thank you

14 More Replies
zmsoft
by Contributor
  • 366 Views
  • 1 replies
  • 0 kudos

Resolved! How do I use the azure databricks dlt pipeline to consume azure Event Center data

Hi there, How do I use the Azure Databricks DLT pipeline to consume Azure Event Hub data? Code: TOPIC = "myeventhub" KAFKA_BROKER = "" GROUP_ID = "group_dev" raw_kafka_events = (spark.readStream .format("kafka") .option("subscribe", EH_NAME) .opt...

Latest Reply
ashraf1395
Honored Contributor
  • 0 kudos

Hi there @zmsoft, have you looked at this reference doc? https://docs.databricks.com/aws/en/dlt/event-hubs This might help.
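For context, Event Hubs exposes a Kafka-compatible endpoint, so the Kafka source can usually be pointed at it with SASL settings along the lines below. This is a sketch under assumptions, not the poster's final code: the function name and arguments are hypothetical, the `kafkashaded.` class prefix is assumed to apply on Databricks runtimes, and in practice the connection string should come from a secret scope rather than a literal.

```python
def event_hubs_kafka_options(namespace: str, event_hub: str, connection_string: str) -> dict:
    """Build Kafka source options for an Event Hubs namespace.

    `namespace` is the Event Hubs namespace name and `event_hub` plays the
    role of the Kafka topic.
    """
    jaas = (
        "kafkashaded.org.apache.kafka.common.security.plain.PlainLoginModule "
        f'required username="$ConnectionString" password="{connection_string}";'
    )
    return {
        "kafka.bootstrap.servers": f"{namespace}.servicebus.windows.net:9093",
        "subscribe": event_hub,
        "kafka.security.protocol": "SASL_SSL",
        "kafka.sasl.mechanism": "PLAIN",
        "kafka.sasl.jaas.config": jaas,
    }


# Hypothetical usage inside a DLT pipeline (assumes `import dlt` and `spark`):
# @dlt.table
# def raw_events():
#     opts = event_hubs_kafka_options("my-namespace", "myeventhub", conn_str)
#     return spark.readStream.format("kafka").options(**opts).load()
```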

