Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

Faisal
by Contributor
  • 1180 Views
  • 1 reply
  • 0 kudos

DLT bronze tables

I am trying to ingest incremental Parquet files into a bronze streaming table. How much history data should ideally be retained in the bronze layer, as a general best practice, considering I will only be using bronze to ingest source data and move it to s...

Latest Reply
MuthuLakshmi
Databricks Employee
  • 0 kudos

The amount of history data that should be retained in the bronze layer depends on your specific use case and requirements. As a general best practice, you should retain enough history data to support your downstream analytics and machine learning wor...

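MuthuLakshmi's advice (retain only as much bronze history as downstream workloads need) can be bounded explicitly with Delta table properties. A minimal sketch — the table name and intervals below are placeholders, not from the thread; `delta.logRetentionDuration` and `delta.deletedFileRetentionDuration` are standard Delta table properties:

```python
def retention_ddl(table, log_retention="interval 30 days",
                  file_retention="interval 30 days"):
    """Build the ALTER TABLE statement that bounds how much Delta
    history a bronze table keeps. Table name and intervals are
    placeholders; tune them to your downstream requirements."""
    return (
        f"ALTER TABLE {table} SET TBLPROPERTIES ("
        f"'delta.logRetentionDuration' = '{log_retention}', "
        f"'delta.deletedFileRetentionDuration' = '{file_retention}')"
    )

# On a cluster you would then run, e.g.:
# spark.sql(retention_ddl("bronze.sales_raw"))
```

Old files past the retention window are then removed by `VACUUM` rather than accumulating indefinitely.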
CaptainJack
by New Contributor III
  • 5818 Views
  • 1 reply
  • 0 kudos

Workspace API

Hello friends. I am having a problem with the Workspace API. I have many folders inside my /Workspace (200+) into which I would like to copy my whole Program folder, which includes 20 Spark scripts as Databricks notebooks. I tried the Workspace API and I ...

Latest Reply
CaptainJack
New Contributor III
  • 0 kudos

I am using this API endpoint: /api/2.0/workspace/import

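For reference, `/api/2.0/workspace/import` expects the file content base64-encoded and imports one file per call, so copying a whole Program folder means looping over the scripts (the Databricks CLI also offers a directory-level import command that wraps this). A minimal sketch of building the request body — the path, host, and token below are placeholders:

```python
import base64

def import_payload(workspace_path, source_bytes, language="PYTHON",
                   fmt="SOURCE", overwrite=True):
    """Build the JSON body for POST /api/2.0/workspace/import.
    The API requires the notebook content to be base64-encoded."""
    return {
        "path": workspace_path,
        "format": fmt,
        "language": language,
        "content": base64.b64encode(source_bytes).decode("ascii"),
        "overwrite": overwrite,
    }

# Sending it (sketch; host and token are placeholders):
# import requests
# requests.post(
#     "https://<workspace-host>/api/2.0/workspace/import",
#     headers={"Authorization": "Bearer <token>"},
#     json=import_payload("/Workspace/Program/etl_01", b"print('hi')"),
# )
```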
Immassive
by New Contributor II
  • 1648 Views
  • 1 reply
  • 0 kudos

Reading information_schema tables through JDBC connection

Hi, I am using Unity Catalog as storage for data. I have an external system that establishes a connection to Unity Catalog via JDBC using the Databricks driver: Configure the Databricks ODBC and JDBC drivers - Azure Databricks | Microsoft L...

Latest Reply
Immassive
New Contributor II
  • 0 kudos

Note: I can see the tables of the system.information_schema in the Databricks UI and can read them there.

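When a generic JDBC client struggles with the metadata views, the same information can be read with an ordinary SQL query against the three-level namespace (`<catalog>.information_schema.tables`, or `system.information_schema` for the account-wide views). A sketch — the connection values are placeholders:

```python
def list_tables_query(catalog="system", schema_filter=None):
    """Query text for reading table metadata from information_schema
    using the three-level namespace <catalog>.information_schema.tables."""
    q = (f"SELECT table_catalog, table_schema, table_name "
         f"FROM {catalog}.information_schema.tables")
    if schema_filter:
        q += f" WHERE table_schema = '{schema_filter}'"
    return q

# With the databricks-sql-connector package (placeholder credentials):
# from databricks import sql
# with sql.connect(server_hostname="<host>", http_path="<path>",
#                  access_token="<token>") as conn:
#     with conn.cursor() as cur:
#         cur.execute(list_tables_query("system"))
#         print(cur.fetchall())
```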
JonLaRose
by New Contributor III
  • 4852 Views
  • 2 replies
  • 0 kudos

Resolved! Max amount of tables

Hi! What is the maximum number of tables that it is possible to create in a Unity Catalog? Is there any difference between managed and external tables? If so, what is the limit for external tables? Thanks, Jonathan.

Latest Reply
JonLaRose
New Contributor III
  • 0 kudos

The answer is here: https://docs.databricks.com/en/data-governance/unity-catalog/index.html#resource-quotas

1 More Replies
coltonflowers
by New Contributor III
  • 2729 Views
  • 0 replies
  • 0 kudos

MLFlow Spark UDF Error

After trying to run spark_udf = mlflow.pyfunc.spark_udf(spark, model_uri=logged_model, env_manager="virtualenv"), we get the following error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 145.0 failed 4 times, most re...

alj_a
by New Contributor III
  • 1163 Views
  • 1 reply
  • 0 kudos

source db and target db in DLT

Hi, thanks in advance. I am new to DLT. The scenario is: I need to read data from cloud storage (ADLS) and load it into my bronze table, then read it from the bronze table, do some DQ checks, and load the cleaned data into my silver table. Finally, populat...

Latest Reply
" src="" />
This widget could not be displayed.
This widget could not be displayed.
This widget could not be displayed.
  • 0 kudos

This widget could not be displayed.
Hi,Thanks in advance.I am new in DLT, the scenario is i need to read the data from cloud storage(ADLS) and load it into my bronze table. and read it from bronz table -> do some DQ checks and load the cleaned data into my silver table. finally populat...

This widget could not be displayed.
  • 0 kudos
This widget could not be displayed.
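The bronze-to-silver DQ step described above is what DLT expectations are for: they take a dict of rule-name to SQL condition. A minimal sketch — the rules and table names below are hypothetical, not from the thread:

```python
# Hypothetical DQ rules in the {name: SQL-condition} shape that
# dlt.expect_all / dlt.expect_all_or_drop accept.
RULES = {
    "valid_id": "id IS NOT NULL",
    "valid_amount": "amount >= 0",
    "valid_date": "event_date IS NOT NULL",
}

def as_where_clause(rules):
    """Combine expectation conditions into a single WHERE clause,
    handy for ad-hoc checks of how many rows silver would drop."""
    return " AND ".join(f"({cond})" for cond in rules.values())

# Inside a DLT pipeline this would look like (sketch):
# import dlt
# @dlt.table
# @dlt.expect_all_or_drop(RULES)
# def silver_events():
#     return dlt.read_stream("bronze_events")
```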
marianopenn
by New Contributor III
  • 2527 Views
  • 2 replies
  • 1 kudos

Databricks VSCode Extension Sync Timeout

I am using the Databricks VSCode extension to sync my local repository to Databricks Workspaces. I have everything configured such that smaller syncs work fine, but a full sync of my repository leads to the following error: Sync Error: Post "https://<...

Data Engineering
dbx sync
Repos
VSCode
Workspaces
Latest Reply
kimongrigorakis
New Contributor II
  • 1 kudos

Same issue here. Can someone please help?

1 More Replies
Phani1
by Valued Contributor II
  • 834 Views
  • 0 replies
  • 0 kudos

Unity catalog accounts

Hi Team, we have a requirement to keep metadata (Unity Catalog) in one AWS account and data storage (Delta tables) in another account. Is it possible to do that? Would we face any technical or security issues?

278875
by New Contributor
  • 10072 Views
  • 4 replies
  • 1 kudos

How do I figure out the cost breakdown for Databricks

I'm trying to figure out the cost breakdown for the Databricks usage for my team.When I go into the Databricks administration console and click Usage when I select to show the usage By SKU it just displays the type of cluster but not the name of it. ...

Latest Reply
MuthuLakshmi
Databricks Employee
  • 1 kudos

Please check the docs below for usage-related information. Billable usage logs: https://docs.databricks.com/en/administration-guide/account-settings/usage.html You can filter them using tags for the more precise information you are looking for...

3 More Replies
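The tag-based filtering MuthuLakshmi mentions can also be done offline on a downloaded billable-usage export. A minimal sketch — the column names (`dbus`, `Tag:team`) are assumptions; exports vary, so adjust them to your file:

```python
import csv
import io
from collections import defaultdict

def dbus_by_tag(csv_text, tag_column, dbu_column="dbus"):
    """Sum the DBU column of a billable-usage export per value of one
    tag column. Rows with an empty tag are grouped as '(untagged)'."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = row.get(tag_column, "") or "(untagged)"
        totals[key] += float(row[dbu_column])
    return dict(totals)

# sample = "sku,dbus,Tag:team\nJOBS_COMPUTE,10.5,etl\nALL_PURPOSE,2.0,\n"
# dbus_by_tag(sample, "Tag:team")
```

Grouping by a cluster-name tag instead of SKU gives the per-cluster breakdown the original poster was missing in the admin console.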
dave_d
by New Contributor II
  • 5109 Views
  • 2 replies
  • 0 kudos

What is the "Columnar To Row" node in this simple Databricks SQL query profile?

I am running a relatively simple SQL query that writes back to a table on a Databricks serverless SQL warehouse, and I'm trying to understand why there is a "Columnar To Row" node in the query profile that is consuming the vast majority of the time s...

Latest Reply
Annapurna_Hiriy
Databricks Employee
  • 0 kudos

@dave_d We do not have a document with a list of operations that would bring up the ColumnarToRow node. This node provides a common executor to translate an RDD of ColumnarBatch into an RDD of InternalRow. It is inserted whenever such a transition is de...

1 More Replies
Rafal9
by New Contributor II
  • 8346 Views
  • 0 replies
  • 0 kudos

Issue during testing SparkSession.sql() with pytest.

Dear Community, I am testing PySpark code via pytest using VS Code and Databricks Connect. The SparkSession is initiated from Databricks Connect: from databricks.connect import DatabricksSession; spark = DatabricksSession.builder.getOrCreate(). I am receiving...

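The truncated error makes the root cause unguessable, but a common pattern for pytest suites in this situation is to put session creation behind one seam so tests can fall back from Databricks Connect to another provider (or a fake). A sketch, not a fix for the specific error:

```python
import importlib

def get_spark(candidates=None):
    """Return a Spark session from the first importable provider.
    Default order: Databricks Connect, then plain PySpark, so the same
    test suite can run with or without a remote workspace configured."""
    candidates = candidates or [
        ("databricks.connect", "DatabricksSession"),
        ("pyspark.sql", "SparkSession"),
    ]
    for module_name, class_name in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue
        return getattr(module, class_name).builder.getOrCreate()
    raise RuntimeError("no Spark session provider found")
```

In a pytest fixture you would call `get_spark()` once per session; in unit tests you can pass a fake provider through `candidates` and avoid any cluster at all.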
svrdragon
by New Contributor
  • 1855 Views
  • 0 replies
  • 0 kudos

optimizeWrite takes too long

Hi, we have a Spark job that writes data into a Delta table for the last 90 date partitions. We have enabled spark.databricks.delta.autoCompact.enabled and delta.autoOptimize.optimizeWrite. The job takes 50 mins to complete; the logic takes 12 mins and optimizewri...

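One option worth weighing when write-time compaction dominates a job is moving compaction out of the write path into a separately scheduled `OPTIMIZE`, scoped to recent partitions with a partition predicate. A sketch — table and column names are placeholders:

```python
def optimize_stmt(table, partition_predicate=None):
    """Build an OPTIMIZE statement so compaction can run as its own
    scheduled job instead of inside the write. OPTIMIZE ... WHERE
    accepts partition-column predicates only."""
    stmt = f"OPTIMIZE {table}"
    if partition_predicate:
        stmt += f" WHERE {partition_predicate}"
    return stmt

# On a schedule, e.g. once per day:
# spark.sql(optimize_stmt("sales", "date >= current_date() - INTERVAL 90 DAYS"))
```

Whether this nets out faster depends on how often readers need freshly compacted files, so treat it as something to benchmark rather than a definitive answer.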
erigaud
by Honored Contributor
  • 3376 Views
  • 3 replies
  • 0 kudos

Merge DLT with Delta Table

Is there any way to accomplish this? I have an existing Delta table and a separate Delta Live Tables pipeline, and I would like to merge data from the DLT into my existing Delta table. Is this doable or completely impossible?

Latest Reply
LeifBruen
New Contributor II
  • 0 kudos

Merging data from a Delta Live Table (DLT) into an existing Delta Table is possible with careful planning. Transition data from DLT to Delta Table through batch processing, data transformation, and ETL processes, ensuring schema compatibility. 

2 More Replies
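Since a DLT output table is itself a readable table, the merge LeifBruen describes usually ends up as a `MERGE INTO` run from an ordinary job. A sketch of the upsert statement — all table and key names are placeholders, and the schemas must be compatible for `UPDATE SET * / INSERT *`:

```python
def merge_stmt(target, source, key):
    """Build a MERGE INTO statement for upserting a DLT output table
    into an existing Delta table, keyed on one column."""
    return (
        f"MERGE INTO {target} AS t "
        f"USING {source} AS s "
        f"ON t.{key} = s.{key} "
        f"WHEN MATCHED THEN UPDATE SET * "
        f"WHEN NOT MATCHED THEN INSERT *"
    )

# From a scheduled notebook or job:
# spark.sql(merge_stmt("main.core.customers", "main.dlt.customers_clean", "customer_id"))
```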

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group