Warehousing & Analytics
Engage in discussions on data warehousing, analytics, and BI solutions within the Databricks Community. Share insights, tips, and best practices for leveraging data for informed decision-making.

Forum Posts

MadelynM
by Databricks Employee
  • 3078 Views
  • 0 replies
  • 0 kudos

[Recap] Data + AI Summit 2024 - Warehousing & Analytics | Improve performance and increase insights

Here's your Data + AI Summit 2024 - Warehousing & Analytics recap as you use intelligent data warehousing to improve performance and increase your organization’s productivity with analytics, dashboards and insights.  Keynote: Data Warehouse presente...

Warehousing & Analytics
AI BI Dashboards
AI BI Genie
Databricks SQL
patilsuhasv
by New Contributor
  • 3048 Views
  • 1 replies
  • 0 kudos

Delta Table and history

Hi All, How can I maintain 7 years of transactional data in a Delta table? Can I have a log retention of 7 days but a data retention of 7 years? I appreciate your response. Thanks and regards, Suhas

Latest Reply
Louis_Frolio
Databricks Employee
  • 0 kudos

Hey @patilsuhasv ,  Yes, you absolutely can maintain 7 years of transactional data in a Delta table while having only 7 days of log retention. These are two separate concepts that work independently. Understanding the Difference Log Retention control...
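To make the two settings concrete, here is a minimal sketch (not taken from the reply itself) that builds an ALTER TABLE statement for the Delta table properties `delta.logRetentionDuration` and `delta.deletedFileRetentionDuration`. The table name `sales.transactions` and the helper `retention_sql` are hypothetical. Rows in the current table version stay until you delete them; these properties only govern how much history is kept.

```python
def retention_sql(table: str, log_days: int, deleted_file_days: int) -> str:
    """Build an ALTER TABLE statement that sets Delta retention properties.

    `table` is a hypothetical table name; both durations are given in days.
    """
    return (
        f"ALTER TABLE {table} SET TBLPROPERTIES ("
        f"'delta.logRetentionDuration' = 'interval {log_days} days', "
        f"'delta.deletedFileRetentionDuration' = 'interval {deleted_file_days} days')"
    )


# Assumption: 7 days of history; the 7 years of data live in the current table version.
stmt = retention_sql("sales.transactions", 7, 7)
print(stmt)
```

In a notebook you would typically pass the statement to `spark.sql(stmt)`; that call is left out so the sketch runs without a cluster.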

AndreasWagner
by New Contributor II
  • 6481 Views
  • 2 replies
  • 1 kudos

PowerBI connected to SAP Databricks

Hi everyone, does anybody have experience with connecting Power BI to SAP Databricks in the BDC? I have quite a few SAP customers interested in that ... Many thanks, Andreas

Latest Reply
DanielFroehler
New Contributor
  • 1 kudos

Hi @WiliamRosa, do you know which feature is missing in SAP Databricks that keeps it from working? As Andreas said, everybody is asking that question. KR Daniel

1 More Replies
CEH
by New Contributor II
  • 453 Views
  • 5 replies
  • 2 kudos

Union of tiny dataframes exhausts resource, memory error

As part of a function I create df1 and df2 and aim to stack them and output the results. But the results do not display within the function, nor if I output the results and display them after. results = df1.unionByName(df2, allowMissingColumns=False) displ...

Latest Reply
Advika
Databricks Employee
  • 2 kudos

Hello @CEH! Did any of the suggestions above help resolve the issue? If so, please mark the most helpful reply as the accepted solution. Or, if you found another fix, please share it with the community so others can benefit as well.

4 More Replies
Akshay_Petkar
by Valued Contributor
  • 7483 Views
  • 9 replies
  • 7 kudos

Resolved! Need a Sample MERGE INTO Query for SCD Type 2 Implementation

Can anyone provide a sample MERGE INTO SQL query for implementing SCD Type 2 in Databricks using Delta Tables?

Latest Reply
jeffreyaven
Databricks Employee
  • 7 kudos

Here is a simple example using an upstream Delta table with ChangeDataFeed enabled, using table_changes() to get the records with their corresponding operation. This is a 2-step process: you need to close out modified or deleted records, then add new rows (i...
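The two steps can be illustrated without a cluster. The sketch below is not the MERGE INTO query from the reply; it is an in-memory Python illustration of the same close-out-then-insert logic on a list of dicts. The function `apply_scd2` and the column names (`key`, `value`, `start_date`, `end_date`, `is_current`) are hypothetical; the `op` values mirror the change types that the Change Data Feed reports.

```python
from datetime import date


def apply_scd2(dimension, changes, effective):
    """Apply change records to an SCD Type 2 dimension held as a list of dicts.

    All column names are hypothetical; `op` follows the Change Data Feed
    change types (insert, update_preimage, update_postimage, delete).
    """
    # Keys whose current row must be closed out: modified or deleted records.
    closed_keys = {c["key"] for c in changes
                   if c["op"] in ("update_postimage", "delete")}

    # Step 1: close out the current row for every modified or deleted key.
    for row in dimension:
        if row["is_current"] and row["key"] in closed_keys:
            row["is_current"] = False
            row["end_date"] = effective

    # Step 2: add a new current row for inserts and for the new image of updates.
    # update_preimage and delete records produce no new row.
    for c in changes:
        if c["op"] in ("insert", "update_postimage"):
            dimension.append({
                "key": c["key"],
                "value": c["value"],
                "start_date": effective,
                "end_date": None,
                "is_current": True,
            })
    return dimension


dim = [
    {"key": 1, "value": "a", "start_date": date(2024, 1, 1),
     "end_date": None, "is_current": True},
]
changes = [{"key": 1, "op": "update_postimage", "value": "b"}]
for row in apply_scd2(dim, changes, date(2024, 6, 1)):
    print(row)
```

In SQL the same split usually appears as one statement that updates the matched current rows and a second that inserts the new versions, which is why the reply describes it as a 2-step process.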

8 More Replies
Kaz
by New Contributor II
  • 9980 Views
  • 4 replies
  • 2 kudos

Show full logs on job log

Is it possible to show the full logs of a Databricks job? Currently, the logs are skipped with: *** WARNING: max output size exceeded, skipping output. *** However, I don't believe our log files are more than 20 MB. I know you can press the logs button...

Latest Reply
jkb7
New Contributor III
  • 2 kudos

Any news on this topic? Have the limits on the notebook-cell-log output been resolved?

3 More Replies
tarunnagar
by New Contributor II
  • 130 Views
  • 3 replies
  • 2 kudos

Tips for Streamlining Spark Job Development and Debugging in Databricks

Hi everyone, I’m looking to improve the efficiency of developing and debugging Spark jobs within Databricks and wanted to get insights from the community. Spark is incredibly powerful, but as projects grow in complexity, it can become challenging to m...

Latest Reply
Suheb
New Contributor
  • 2 kudos

Developing and debugging Spark jobs in Databricks can be challenging due to the distributed nature of Spark and the volume of data processed. To streamline your workflow: Leverage Notebooks for Iterative Development: Use Databricks notebooks to write a...

2 More Replies
mausch
by New Contributor
  • 3198 Views
  • 1 replies
  • 0 kudos

CalledProcessError when running dbt

I've been trying to run a dbt project (sourced in Azure DevOps) in Databricks Workflows, but I get this error message:  CalledProcessError: Command 'b'\nmkdir -p "/tmp/tmp-dbt-run-1124228490001263"\nunexpected_errors="$(cp -a -u "/Workspace/Repos/.in...

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

The error you encountered when running your dbt project in Databricks Workflows comes from Databricks trying to copy the entire repository, including the virtual environment (venv) folder and its cached bytecode files (__pycache__), into a temporary ...

rajanator
by New Contributor
  • 163 Views
  • 1 replies
  • 1 kudos

Resolved! Intermittent 400 Error with Power BI Desktop - ODBC Connection to SQL Warehouse

Hi all, I'm experiencing an intermittent connection issue between Power BI Desktop and our Azure Databricks SQL Warehouse and am looking for help troubleshooting. Error Message: ODBC: ERROR [HY000] [Microsoft][ThriftExtension] (14) Unexpected response from ...

Latest Reply
mark_ott
Databricks Employee
  • 1 kudos

The intermittent ODBC error you’re seeing in Power BI when connecting to Azure Databricks is a recognized issue related to SSL validation interruptions or proxy interference in the Simba ThriftExtension layer. The behavior—random occurrences, tempora...

Bakkie
by New Contributor III
  • 3800 Views
  • 3 replies
  • 4 kudos

Resolved! Databricks Apps based on Streamlit could not find a valid JAVA_HOME installation

We are launching our first Databricks Apps based on Streamlit. The App works when simply running the notebook in our workspace, but fails after deployment due to "could not find a valid JAVA_HOME installation" when running in the system environment. We...

Latest Reply
NandiniN
Databricks Employee
  • 4 kudos

Databricks Apps (which use a lightweight, container-based runtime) do not automatically include a JVM. It is best to use the databricks package to avoid dependency issues.

2 More Replies
playnicekids
by New Contributor II
  • 433 Views
  • 2 replies
  • 1 kudos

Resolved! Metric Views

Hi, I think I’ve found a reproducible bug, or am misunderstanding some syntax or capabilities of Metric Views, when joining a calendar scaffold to an SCD2 table. The same SQL query works perfectly, but the Metric View always returns a constant 1 per mont...

Latest Reply
Louis_Frolio
Databricks Employee
  • 1 kudos

Hey @playnicekids, I did some digging and have come up with some helpful hints/tips to get you past your issue: This behavior is due to how metric view joins are defined and executed. Diagnosis: The join in your metric view is a many-to-many te...

1 More Replies
Giktator
by New Contributor II
  • 1450 Views
  • 4 replies
  • 1 kudos

Error FAILED_READ_FILE.NO_HINT When Reading File from R2 Storage

Hi there, I'm encountering the following error while attempting to read a file from R2 storage: [FAILED_READ_FILE.NO_HINT] Error while reading file r2:REDACTED_LOCAL_PART@user_id.r2.cloudflarestorage.com/data/20250128_160228_54805_wpkza_e38751cf-969e-...

Latest Reply
NandiniN
Databricks Employee
  • 1 kudos

But the error is from AWS and is seen when the payload size is incorrectly defined in the contentLength parameter. Caused by: com.amazonaws.SdkClientException: Data read has a different length than the expected: This sounds like a bug. Was this resolved...

3 More Replies
loic
by Contributor
  • 995 Views
  • 2 replies
  • 1 kudos

Resolved! Databricks workspace default catalog not working anymore with JDBC driver

Hello, We recently detected an issue in our product deployment with Terraform. At some point, we have some Java code that creates a schema in the "hive_metastore" catalog. Since the "hive_metastore" catalog is the default one, there should not be any need to sp...

Latest Reply
loic
Contributor
  • 1 kudos

The exact error reported by Databricks is: [RequestId=f27975cd-7589-4463-8c03-6015893ee133 ErrorClass=INVALID_PARAMETER_VALUE] Invalid input: RPC CreateSchema Field managedcatalog.SchemaInfo.catalog_name: name "" is not a valid name

1 More Replies
meljung
by New Contributor II
  • 1002 Views
  • 1 replies
  • 0 kudos

Resolved! Moving average calculation in Databricks AI/BI dashboard

So, I can't figure out how to do a moving average as a custom calculation in a Databricks dashboard. I'm applying many different filters and the denominator of the metric has to change dynamically based on the chosen filters. So, in this case using `Custom...

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

Currently, Databricks dashboards do not support applying a moving average “custom calculation” on top of another custom metric that itself is dynamic with respect to the filters. Workarounds Segmented SQL Datasets: Pre-compute the filtered sets (as ...

der
by Contributor
  • 663 Views
  • 4 replies
  • 7 kudos

Resolved! Dashboard choropleth map with geometry

We are currently building our dashboards in Apache Superset. With the Git integration in Databricks AI/BI Dashboards, the development process has improved a lot. So we are thinking about switching to Databricks AI/BI Dashboards. One pain point in Databric...

Latest Reply
NandiniN
Databricks Employee
  • 7 kudos

Closing the loop here: Update: I have updated the PMs, who are now aware of the use case and this request (it will help hasten prioritization). Thanks!

3 More Replies
leo-machado
by New Contributor III
  • 4755 Views
  • 10 replies
  • 5 kudos

(Big) Problem with SQL Warehouse Auto stop

Long story short, I'm not sure if this is an already known problem, but the Auto Stop feature on SQL Warehouses after minutes of inactivity is not working properly. We started using SQL Warehouses more aggressively this December when we scaled up one ...

Latest Reply
HNguyen
New Contributor II
  • 5 kudos

This is a good catch. Auto termination is something you tend to set and trust to do the right thing. Wondering if the Databricks team managed to fix this, seeing it has been half a year since the problem was raised. This is also important for ou...

9 More Replies