Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

amitpm
by New Contributor
  • 820 Views
  • 1 replies
  • 0 kudos

Lakeflow Connect - Column filtering

Hi community, I am interested in learning more about the feature mentioned at the recent summit about query pushdown in Lakeflow Connect for SQL Server. I believe this feature will allow selecting only the required columns from source tables. I...

Latest Reply
Isi
Honored Contributor III
  • 0 kudos

Hey @amitpm, According to the documentation, this feature is currently in Public Preview, so if your Databricks account has access to public preview features, you can reach out to support to enable it and start testing performance. Setup guide for Lake...

SenthilJ
by New Contributor III
  • 6482 Views
  • 2 replies
  • 1 kudos

Databricks Deep Clone

Hi, I am working on a DR design for Databricks in Azure. The recommendation from Databricks is to use Deep Clone to clone the Unity Catalog tables (within or across catalogs). My design is to ensure that DR is managed across different regions, i.e. pri...

Data Engineering
Disaster Recovery
Unity Catalog
Latest Reply
Isi
Honored Contributor III
  • 1 kudos

Hi, In my opinion, Databricks Deep Clone does not currently support cloning Unity Catalog tables natively across different metastores (each region having its own metastore). Deep Clone requires that both source and target belong to the same metastore ...
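
To make the same-metastore constraint concrete, here is a minimal sketch (hypothetical catalog, schema, and table names) of the typical Deep Clone pattern between two Unity Catalog tables registered in the same metastore; across regional metastores you would need Delta Sharing or a storage-level replication strategy instead:

# Re-running this statement performs an incremental deep clone: only files
# changed since the previous run are copied to the target table.
spark.sql("""
  CREATE OR REPLACE TABLE dr_catalog.finance.transactions
  DEEP CLONE prod_catalog.finance.transactions
""")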

1 More Replies
arun_6482
by New Contributor
  • 2858 Views
  • 1 replies
  • 0 kudos

NPIP_TUNNEL_SETUP_FAILURE

Hello Databricks team, I have configured Databricks in AWS, but while creating a cluster I am getting the error below. Could you please help fix this issue? Error: VM setup failed due to Ngrok setup timeout. Please check your network configuration and try again o...

Latest Reply
mani_22
Databricks Employee
  • 0 kudos

@arun_6482 The error you have shared suggests that there is a network issue in your Databricks deployment within your AWS account. Please review the documentation provided below and ensure that all your routes and ports are configured correctly. Doc:...

kavithai
by New Contributor II
  • 1288 Views
  • 3 replies
  • 2 kudos
Latest Reply
Isi
Honored Contributor III
  • 2 kudos

Hey @kavithai Sometimes there are limitations in the laws of each country regarding "sharing" data outside private clouds or regions, which make it impossible to transmit data outside of your private networks. This is especially true for banks, which...

2 More Replies
noorbasha534
by Valued Contributor II
  • 765 Views
  • 1 replies
  • 0 kudos

Global INIT script on sql warehouse

Dear all, Is it possible to configure a global init script on a SQL warehouse? If not, how can I achieve the requirement below? For example, this script will have 2 key-value pairs defined: src_catalog_name=ABC, tgt_catalog_name=DEF. I want these 2 to be ref...

Latest Reply
Isi
Honored Contributor III
  • 0 kudos

Hello @noorbasha534, Unfortunately, in SQL warehouses you can't attach an init script that automatically runs when the warehouse starts (similar to what you can do with clusters). However, there are a few alternatives you can consider: Session Variable...
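
As a minimal sketch of the session-variable alternative (assuming DBR 14.1+ or a current SQL warehouse; the schema and table names are hypothetical), the two values from the question could be declared and referenced like this. On a SQL warehouse the statements would be run as plain SQL; spark.sql() is used here only for illustration:

# Declare per-session variables holding the source and target catalog names.
spark.sql("DECLARE OR REPLACE VARIABLE src_catalog_name STRING DEFAULT 'ABC'")
spark.sql("DECLARE OR REPLACE VARIABLE tgt_catalog_name STRING DEFAULT 'DEF'")

# Resolve the catalog dynamically with the IDENTIFIER() clause instead of hard-coding it.
df = spark.sql("SELECT * FROM IDENTIFIER(src_catalog_name || '.sales.orders')")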

AliviaB
by New Contributor
  • 732 Views
  • 1 replies
  • 0 kudos

Authorization Issue while creating first Unity catalog table

Hi All, We are setting up our new UC-enabled Databricks workspace. We have completed the metastore setup for our workspace and created a new catalog and schema. But while creating a table we are getting an authorization issue. Below is the table s...

Latest Reply
cgrant
Databricks Employee
  • 0 kudos

Are there locations specified for the catalog/table/schema? Or do you keep these at defaults?  Also, do you have a storage credential and external location set for mystorageaccount/mycontainer?
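
For reference, a hedged sketch of what that setup might look like (the external location and credential names are hypothetical; the storage account and container come from the thread):

# Register an external location on top of an existing storage credential, then grant
# the privilege needed before tables can be created against that path.
spark.sql("""
  CREATE EXTERNAL LOCATION IF NOT EXISTS my_ext_location
  URL 'abfss://mycontainer@mystorageaccount.dfs.core.windows.net/'
  WITH (STORAGE CREDENTIAL my_storage_credential)
""")
spark.sql("GRANT CREATE EXTERNAL TABLE ON EXTERNAL LOCATION my_ext_location TO `data_engineers`")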

lucami
by Contributor
  • 2738 Views
  • 1 replies
  • 0 kudos

Resolved! Understanding dropDuplicates in Delta Live Tables (DLT) with Photon

Hi everyone, I've been working with Delta Live Tables (DLT) in Databricks, and I'm particularly interested in understanding how the dropDuplicates function works when using the Photon engine. Photon is known for its columnar data processing capabiliti...

Latest Reply
cgrant
Databricks Employee
  • 0 kudos

FIRST() never stitches together values from different rows. When Photon executes dropDuplicates, it deterministically chooses one complete row for each set of duplicate keys and returns every column from that same row. If you ever encounter a result w...
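
A small PySpark illustration of that behaviour (toy data, not the original pipeline):

from pyspark.sql import Row

df = spark.createDataFrame([
    Row(id=1, name="alice",    score=10),
    Row(id=1, name="alice_v2", score=99),   # duplicate key, different payload
    Row(id=2, name="bob",      score=5),
])

# Exactly one of the two id=1 rows survives, with its name and score taken together
# from that single row -- columns are never mixed across duplicates.
df.dropDuplicates(["id"]).show()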

surajitDE
by Contributor
  • 1252 Views
  • 2 replies
  • 1 kudos

Resolved! How to Enable Sub-300 Millisecond Real-Time Mode in Delta Live Tables (DLT)

Hi folks, During the recent Data + AI Summit, there was a mention of a new real-time streaming mode in Delta Live Tables (DLT) that enables sub-300 millisecond latency. This sounds really promising! Could someone please guide me on: How do we enable thi...

Latest Reply
cgrant
Databricks Employee
  • 1 kudos

Real-time mode, right now, is in private preview. Reach out to your account team for enablement. It's separate from pipelines.trigger.interval. The engine is the same, just a different mode within it.
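
For contrast, the existing pipelines.trigger.interval setting only controls micro-batch cadence in today's pipelines and is unrelated to the private-preview real-time mode. A minimal sketch (table and source names are hypothetical):

import dlt

@dlt.table(
    spark_conf={"pipelines.trigger.interval": "10 seconds"}  # per-table cadence override
)
def events_bronze():
    # Stream from an existing source table; replace with your own source.
    return spark.readStream.table("raw.events")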

1 More Replies
pargit2
by New Contributor II
  • 1504 Views
  • 5 replies
  • 0 kudos

dlt vs delta table

Hi, I'm building the gold and silver layers. In bronze I ingest using Auto Loader. Data is getting updated once a month. Should I save the df in the silver notebooks using a Delta Live Table or a Delta table? In the past I simply used: df.write.save("s3.."...

Latest Reply
nayan_wylde
Esteemed Contributor II
  • 0 kudos

I would say if the data is not complex and you are not handling any DQ checks in the pipeline, then go for a regular Databricks workflow and save it as a Delta table, since you are refreshing the data every month and it is not a streaming workload.
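
A minimal sketch of that simpler approach (hypothetical catalog, schema, and table names): a plain monthly batch job that overwrites a managed Delta table in Unity Catalog rather than writing to a raw S3 path with df.write.save("s3://..."):

# Read the bronze data produced by Auto Loader and publish it as a silver table.
df = spark.read.table("main.bronze.monthly_feed")

(df.write
   .format("delta")
   .mode("overwrite")
   .saveAsTable("main.silver.monthly_feed"))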

4 More Replies
taruntarun1345
by New Contributor
  • 2937 Views
  • 1 replies
  • 0 kudos

cluster creation

Hey all, I am facing an issue in creating a cluster. I can only see the SQL warehouse and its server creation. But I need to create a cluster to work on a data engineering project.

Latest Reply
jameshughes
Databricks Partner
  • 0 kudos

A couple of things to explore here, as it can be solved a couple of different ways. 1. A workspace admin needs to update your entitlements to allow for cluster creation. This is generally not a best practice, as it can lead to unmanaged cluster sprawl....
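
For the entitlement route, a hedged sketch (workspace URL, token, and user ID are placeholders) of how an admin might grant allow-cluster-create through the SCIM Users API; as the reply notes, this can lead to sprawl, so a governed cluster policy is usually the safer option:

import requests

host = "https://<your-workspace>.cloud.databricks.com"   # placeholder
token = "<admin personal access token>"                   # placeholder
user_id = "<numeric SCIM user id>"                        # placeholder

resp = requests.patch(
    f"{host}/api/2.0/preview/scim/v2/Users/{user_id}",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "schemas": ["urn:ietf:params:scim:api:messages:2.0:PatchOp"],
        "Operations": [{
            "op": "add",
            "path": "entitlements",
            "value": [{"value": "allow-cluster-create"}],
        }],
    },
)
resp.raise_for_status()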

Monteiro_12
by New Contributor II
  • 784 Views
  • 1 replies
  • 0 kudos

How to Add a Certified Tag to a Table Using a DLT Pipeline

Is there a table property or configuration that allows me to add a certified tag directly to a table when using a Delta Live Tables pipeline?

Latest Reply
SP_6721
Honored Contributor II
  • 0 kudos

Hi @Monteiro_12, As far as I know, a DLT pipeline doesn't support adding a certified tag directly through table properties or pipeline configurations. Tags like system.Certified need to be applied manually via SQL after the table is created.
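
A minimal sketch of that manual step (the table name is hypothetical; the tag key follows the system.Certified convention mentioned above):

# Run after the DLT pipeline has materialized the table.
spark.sql("""
  ALTER TABLE main.gold.daily_revenue
  SET TAGS ('system.Certified' = 'true')
""")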

samgon
by New Contributor III
  • 4682 Views
  • 4 replies
  • 4 kudos

Resolved! study materials for Certified Data Engineer Professional Certification?

Can anyone recommend high-quality study materials or resources (courses, documentation, practice exams, etc.) that helped you prepare for the Professional-level exam?

Data Engineering
dataengineering
Latest Reply
samgon
New Contributor III
  • 4 kudos

Thanks a lot for the suggestion, much appreciated. I already passed the associate exam!

3 More Replies
HariPrasad1
by Databricks Partner
  • 1191 Views
  • 2 replies
  • 0 kudos

Unable to create log files using logging.basicConfig()

When I run the code below, I am not able to see the file under the specified path: import logging; logger = logging.getLogger(__name__); logging.basicConfig(filename='/Volumes/d_use1_ach_dbw_databricks1/default/ach_elegibility_raw/logs/example.log', enco...

Latest Reply
Yogesh_Verma_
Contributor II
  • 0 kudos

The issue is happening because you're calling logging.getLogger(__name__) before setting up logging.basicConfig(). When the logger is created too early, it doesn't know about the file handler, so it doesn't write to the file. To fix this, make sure yo...
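
A minimal sketch of that fix, reusing the Volumes path from the original post; force=True is an extra assumption for notebook environments where the root logger may already have handlers attached:

import logging

logging.basicConfig(
    filename="/Volumes/d_use1_ach_dbw_databricks1/default/ach_elegibility_raw/logs/example.log",
    encoding="utf-8",
    level=logging.INFO,
    force=True,  # reconfigure even if a root handler already exists (Python 3.8+)
)
logger = logging.getLogger(__name__)  # created AFTER basicConfig

logger.info("logger initialised; this line should land in example.log")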

1 More Replies
Datamate
by New Contributor
  • 974 Views
  • 2 replies
  • 0 kudos

Databricks Connecting to ADLS Gen2 vs Azure SQL

What is the best approach to connect Databricks with Azure SQL or with ADLS Gen2? I am designing a system where I plan to integrate Databricks with Azure. Could someone share their experience, the pros and cons of each approach, and best practic...

Latest Reply
kavithai
New Contributor II
  • 0 kudos

Use the Azure SQL Spark Connector. This method allows Databricks to read from and write to Azure SQL Database efficiently, supporting both bulk operations and secure authentication. Azure SQL: install the connector, configure JDBC, use Key Vault, set permiss...
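
A hedged sketch of the JDBC path (server, database, secret scope, and key names are hypothetical): read a table from Azure SQL with credentials pulled from a Key Vault-backed secret scope instead of hard-coding them:

jdbc_url = (
    "jdbc:sqlserver://myserver.database.windows.net:1433;"
    "database=mydb;encrypt=true;loginTimeout=30;"
)

df = (spark.read
      .format("jdbc")
      .option("url", jdbc_url)
      .option("dbtable", "dbo.customers")
      .option("user", dbutils.secrets.get(scope="kv-scope", key="sql-user"))
      .option("password", dbutils.secrets.get(scope="kv-scope", key="sql-password"))
      .load())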

1 More Replies