Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

alj_a
by New Contributor III
  • 1713 Views
  • 1 reply
  • 0 kudos

source db and target db in DLT

Hi, thanks in advance. I am new to DLT. The scenario: I need to read data from cloud storage (ADLS) and load it into my bronze table, then read from the bronze table, do some DQ checks, and load the cleaned data into my silver table. Finally, populat...

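For the bronze-to-silver pattern the question describes, here is a minimal DLT sketch (runs only inside a Delta Live Tables pipeline; the ADLS path, file format, and the columns `id`/`amount` in the expectations are hypothetical placeholders to adapt):

```python
import dlt
from pyspark.sql import functions as F

# Bronze: ingest raw files from ADLS with Auto Loader.
@dlt.table(comment="Raw events loaded from ADLS")
def bronze_events():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        # Hypothetical path; point at your own container.
        .load("abfss://container@account.dfs.core.windows.net/events/")
    )

# Silver: apply DQ checks and light cleanup; rows failing an
# expect_or_drop expectation are dropped and counted in the event log.
@dlt.table(comment="Cleaned events")
@dlt.expect_or_drop("valid_id", "id IS NOT NULL")
@dlt.expect_or_drop("positive_amount", "amount > 0")
def silver_events():
    return (
        dlt.read_stream("bronze_events")
        .withColumn("ingested_at", F.current_timestamp())
    )
```

A gold table would follow the same shape, reading `dlt.read_stream("silver_events")`.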
marianopenn
by New Contributor III
  • 3948 Views
  • 2 replies
  • 1 kudos

Databricks VSCode Extension Sync Timeout

I am using the Databricks VSCode extension to sync my local repository to Databricks Workspaces. I have everything configured such that smaller syncs work fine, but a full sync of my repository leads to the following error: Sync Error: Post "https://<...

Data Engineering
dbx sync
Repos
VSCode
Workspaces
Latest Reply
kimongrigorakis
New Contributor II
  • 1 kudos

Same issue here. Can someone please help?

  • 1 kudos
1 More Replies
Phani1
by Databricks MVP
  • 1206 Views
  • 0 replies
  • 0 kudos

Unity catalog accounts

Hi Team, we have a requirement to keep the metadata (Unity Catalog) in one AWS account and the data storage (Delta tables) in another account. Is it possible to do that? Would we face any technical/security issues?

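This split is a supported pattern: Unity Catalog reaches the other account through a storage credential (an IAM role in the storage account that the metastore can assume), which an external location then binds to a path. A sketch, assuming the credential already exists (names, bucket, and the `data_engineers` group below are hypothetical):

```python
# The storage credential itself is typically created in Catalog Explorer
# or via the account API. Once it exists, bind it to the cross-account path:
spark.sql("""
  CREATE EXTERNAL LOCATION IF NOT EXISTS cross_account_data
  URL 's3://other-account-bucket/delta'
  WITH (STORAGE CREDENTIAL cross_account_role)
""")

# Grant access so tables/volumes can be created under that location.
spark.sql("""
  GRANT READ FILES, WRITE FILES
  ON EXTERNAL LOCATION cross_account_data TO `data_engineers`
""")
```

The main operational caveat is IAM: the role's trust policy must allow the Unity Catalog master role in the metastore account to assume it.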
278875
by New Contributor
  • 22970 Views
  • 4 replies
  • 1 kudos

How do I figure out the cost breakdown for Databricks

I'm trying to figure out the cost breakdown of Databricks usage for my team. When I go into the Databricks administration console, click Usage, and select to show the usage by SKU, it just displays the type of cluster but not its name. ...

Latest Reply
MuthuLakshmi
Databricks Employee
  • 1 kudos

Please check the docs below for usage-related information. Billable usage logs: https://docs.databricks.com/en/administration-guide/account-settings/usage.html You can filter them using tags for the more precise information which you are looking for...

  • 1 kudos
3 More Replies
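The billable usage export linked in the reply is a CSV with per-cluster rows, so the per-cluster-name breakdown the admin console's by-SKU view hides can be computed with pandas. A sketch using a synthetic frame standing in for `pd.read_csv("billable_usage.csv")` (the column names `clusterName`, `sku`, `dbus` follow the usage-log schema, but verify them against your own export):

```python
import pandas as pd

# Synthetic stand-in for the downloaded billable-usage CSV.
usage = pd.DataFrame({
    "clusterName": ["etl-prod", "etl-prod", "adhoc"],
    "sku": ["JOBS_COMPUTE", "JOBS_COMPUTE", "ALL_PURPOSE_COMPUTE"],
    "dbus": [12.5, 7.5, 4.0],
})

# Total DBUs per named cluster and SKU.
breakdown = usage.groupby(["clusterName", "sku"], as_index=False)["dbus"].sum()
print(breakdown)
```

Multiply the DBU totals by your contract's per-SKU rate to get dollar figures.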
Rafal9
by New Contributor III
  • 9283 Views
  • 0 replies
  • 0 kudos

Issue during testing SparkSession.sql() with pytest.

Dear Community, I am testing PySpark code via pytest using VS Code and Databricks Connect. The SparkSession is initiated from Databricks Connect: from databricks.connect import DatabricksSession; spark = DatabricksSession.builder.getOrCreate(). I am receiving...

svrdragon
by New Contributor
  • 3107 Views
  • 0 replies
  • 0 kudos

optimizeWrite takes too long

Hi, we have a Spark job that writes data into a Delta table for the last 90 date partitions. We have enabled spark.databricks.delta.autoCompact.enabled and delta.autoOptimize.optimizeWrite. The job takes 50 min to complete; the logic takes 12 min and optimizewri...

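One common mitigation, offered as a sketch rather than a definitive fix: keep optimizeWrite for reasonable file sizes, but take auto compaction off the latency-critical write path and compact the recently written partitions in a separate scheduled job. Table and partition-column names are hypothetical:

```python
# In the write job: skip auto compaction so the write finishes sooner.
spark.conf.set("spark.databricks.delta.autoCompact.enabled", "false")

# In a separate, scheduled maintenance job: compact only the partitions
# the write job touches.
spark.sql("""
  OPTIMIZE my_db.events
  WHERE event_date >= current_date() - INTERVAL 90 DAYS
""")
```

Checking the file-size distribution of the 90 partitions first (many tiny files vs. a few large ones) tells you whether compaction is the real bottleneck.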
erigaud
by Honored Contributor
  • 11568 Views
  • 3 replies
  • 0 kudos

Merge DLT with Delta Table

Is there any way to accomplish this? I have an existing Delta Table and a separate Delta Live Tables pipeline, and I would like to merge data from the DLT into my existing Delta Table. Is this doable or completely impossible?

Latest Reply
LeifBruen
New Contributor II
  • 0 kudos

Merging data from a Delta Live Table (DLT) into an existing Delta Table is possible with careful planning. Transition data from DLT to Delta Table through batch processing, data transformation, and ETL processes, ensuring schema compatibility. 

  • 0 kudos
2 More Replies
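Concretely, since a DLT pipeline publishes ordinary tables to the catalog, a downstream (non-DLT) job can read the published table and MERGE it into the pre-existing Delta table. A sketch assuming hypothetical table names and a shared `id` key:

```python
from delta.tables import DeltaTable

# `lakehouse.dlt_output` is the table the DLT pipeline publishes;
# `lakehouse.target` is the pre-existing Delta table (both hypothetical).
source = spark.read.table("lakehouse.dlt_output")

(DeltaTable.forName(spark, "lakehouse.target").alias("t")
    .merge(source.alias("s"), "t.id = s.id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute())
```

The constraint to respect is ownership: DLT must stay the sole writer of its own managed tables, so the merge target should be a table the pipeline does not manage.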
NotARobot
by New Contributor III
  • 2049 Views
  • 0 replies
  • 2 kudos

Force DBR/Spark Version in Delta Live Tables Cluster Policy

Is there a way to use Compute Policies to force Delta Live Tables to use specific Databricks Runtime and PySpark versions? While trying to leverage some of the functions in PySpark 3.5.0, I don't seem to be able to get Delta Live Tables to use Databr...

Data Engineering
Compute Policies
Delta Live Tables
Graphframes
pyspark
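Worth noting (per the DLT documentation, though verify against the current release): DLT manages its own runtime, so a fixed `spark_version` in a compute policy is generally not honored by pipeline clusters. The supported lever for newer runtimes is the pipeline's release channel in its settings, e.g. this hedged JSON fragment (pipeline name and notebook path are hypothetical):

```json
{
  "name": "my_pipeline",
  "channel": "PREVIEW",
  "libraries": [{"notebook": {"path": "/Repos/me/pipeline"}}]
}
```

If PySpark 3.5.0 features are the goal, checking which runtime the `PREVIEW` channel currently maps to is the first thing to verify.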
JohnJustus
by New Contributor III
  • 14758 Views
  • 1 reply
  • 0 kudos

Accessing Excel file from Databricks

Hi, I am trying to access an Excel file stored in Azure Blob Storage via Databricks. In my understanding, it is not possible to access it using PySpark, so accessing it through pandas is the option. Here is my code: %pip install openpyxl import pandas as p...

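The usual pitfall here is the path: `pd.read_excel` needs a local-style path (e.g. a `/dbfs/...` mount or a Volumes path), not an `abfss://` URL. A sketch that writes and reads a small workbook locally just to show the call shape, assuming `openpyxl` is installed; on Databricks you would substitute the mounted Blob path:

```python
import pandas as pd

# Create a small workbook standing in for the file in Blob storage.
pd.DataFrame({"name": ["a", "b"], "value": [1, 2]}).to_excel(
    "sample.xlsx", index=False, engine="openpyxl"
)

# On Databricks, point this at a mounted path instead,
# e.g. /dbfs/mnt/<container>/file.xlsx (hypothetical mount).
df = pd.read_excel("sample.xlsx", engine="openpyxl")
print(df.shape)
```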
databicky
by Contributor II
  • 6865 Views
  • 3 replies
  • 1 kudos

No handler for udf/udaf/udtf for function

I created a function using a jar file that is present in the cluster location, but when executing the Hive query it shows the error "no handler for udf/udaf/udtf". The query runs fine in HDInsight clusters, but when running in Databricks...

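This error often means the function was registered without its jar being attached on the Databricks side. One thing to try, sketched here with a hypothetical class name and jar path: register the function with the jar explicitly via Spark SQL's `CREATE FUNCTION ... USING JAR`:

```python
# Register the Hive UDF together with its jar (class/path hypothetical).
spark.sql("""
  CREATE OR REPLACE FUNCTION my_udf
  AS 'com.example.udf.MyUDF'
  USING JAR 'dbfs:/FileStore/jars/my_udf.jar'
""")

spark.sql("SELECT my_udf(col) FROM my_table").show()
```

If the class still fails to resolve, checking that the jar is built against a Hive/Spark version compatible with the cluster's runtime is the next step.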
dbuser1234
by New Contributor
  • 3355 Views
  • 0 replies
  • 0 kudos

How to readstream from multiple sources?

Hi, I am trying to readStream from 2 sources and join them into a target table. How can I do this in PySpark? E.g. t1 + t2 as my bronze tables: I want to readStream from t1 and t2 and merge the changes into t3 (my silver table).

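A common shape for this: stream both bronze tables, join them, and upsert each micro-batch into the silver table via `foreachBatch` with a Delta MERGE. A sketch with hypothetical table names, a shared `id` key, and a hypothetical checkpoint path:

```python
from delta.tables import DeltaTable

# Hypothetical bronze tables t1/t2 sharing a join key `id`.
s1 = spark.readStream.table("bronze.t1")
s2 = spark.readStream.table("bronze.t2")

# Stream-stream join; real pipelines usually add watermarks so join
# state can be cleaned up.
joined = s1.join(s2, "id")

def upsert_to_silver(batch_df, batch_id):
    (DeltaTable.forName(spark, "silver.t3").alias("t")
        .merge(batch_df.alias("s"), "t.id = s.id")
        .whenMatchedUpdateAll()
        .whenNotMatchedInsertAll()
        .execute())

(joined.writeStream
    .foreachBatch(upsert_to_silver)
    .option("checkpointLocation", "/tmp/checkpoints/t3")  # hypothetical path
    .start())
```

An alternative when the join is not needed row-for-row is to run two independent streams that each MERGE into t3.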