Warehousing & Analytics

Forum Posts

pankaj2264
by New Contributor II
  • 1003 Views
  • 2 replies
  • 1 kudos

Using profile_metrics and drift_metrics

Is there any business use case where profile_metrics and drift_metrics are used by Databricks customers? If so, kindly provide the scenario where to leverage this feature, e.g. data lineage, table metadata updates.

Latest Reply
MohsenJ
New Contributor III
  • 1 kudos

Hey @pankaj2264. Both the profile metric and drift metric tables are created and used by Lakehouse Monitoring to assess the performance of your model and data over time or relative to a baseline table. You can find all the relevant information here Intro...

  • 1 kudos
1 More Replies
techuser
by New Contributor III
  • 4047 Views
  • 10 replies
  • 1 kudos

Resolved! Databricks Liquid Cluster

Hi, is it possible to convert an existing partitioned Delta table that already holds data to clustering? If so, can you please suggest the steps required? I tried and searched but couldn't find any. Is it that liquid clustering can be done only for new Delta table...

Latest Reply
Raja_Databricks
New Contributor II
  • 1 kudos

Does Liquid Clustering accept MERGE, and how can an upsert be done efficiently with a liquid-clustered Delta table?

  • 1 kudos
9 More Replies
rocky5
by New Contributor III
  • 428 Views
  • 1 reply
  • 0 kudos

Resolved! Incorrect results of row_number() function

I wrote simple code:
from pyspark.sql import SparkSession
from pyspark.sql.window import Window
from pyspark.sql.functions import row_number, max
import pyspark.sql.functions as F
streaming_data = spark.read.table("x")
window = Window.partitionBy("BK...

Latest Reply
ThomazRossito
New Contributor II
  • 0 kudos

Hi, in my opinion the result is correct. What needs to be noted in the result is that it is sorted by the "Onboarding_External_LakehouseId" column, so if there is a "BK_AccountApplicationId" with the same code, it will be partitioned into 2 row_numbers. Just...
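
The point in this reply can be illustrated without Spark. Below is a minimal plain-Python sketch of what row_number() does over a window. It assumes the window partitions by "BK_AccountApplicationId" and orders by "Onboarding_External_LakehouseId"; the original window definition is truncated in the post, so that is a guess. The helper name and sample rows are hypothetical.

```python
from itertools import groupby
from operator import itemgetter


def row_number(rows, partition_key, order_key):
    """Assign 1-based row numbers within each partition, ordered by order_key."""
    # Sort by partition first, then by the ordering column inside each partition.
    ordered = sorted(rows, key=itemgetter(partition_key, order_key))
    numbered = []
    for _, group in groupby(ordered, key=itemgetter(partition_key)):
        # Numbering restarts at 1 for every new partition value.
        for n, row in enumerate(group, start=1):
            numbered.append({**row, "row_number": n})
    return numbered


# Hypothetical sample data, not taken from the original post.
rows = [
    {"BK_AccountApplicationId": "A1", "Onboarding_External_LakehouseId": 20},
    {"BK_AccountApplicationId": "A1", "Onboarding_External_LakehouseId": 10},
    {"BK_AccountApplicationId": "B7", "Onboarding_External_LakehouseId": 5},
]

result = row_number(rows, "BK_AccountApplicationId", "Onboarding_External_LakehouseId")
for row in result:
    print(row)
```

Two rows sharing the same "BK_AccountApplicationId" receive row numbers 1 and 2, in the order given by the ordering column, which matches the behaviour described in the reply.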

  • 0 kudos
jcozar
by Contributor
  • 624 Views
  • 2 replies
  • 0 kudos

Join multiple streams with watermarks

Hi! I receive three streams from a Postgres CDC. These three tables (invoices, users and products) need to be joined. I want to use a left join with respect to the invoices stream. In order to compute correct results and release old states, I use watermarks a...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @jcozar, It seems you’re encountering an issue with multiple event time columns in your Spark Structured Streaming join. Let’s break down the problem and find a solution. Event Time Columns: In Spark Structured Streaming, event time is crucia...

  • 0 kudos
1 More Replies
jcozar
by Contributor
  • 775 Views
  • 2 replies
  • 0 kudos

Read Structured Streaming state information

Hi! I am exploring the read state functionality in Spark streaming: https://docs.databricks.com/en/structured-streaming/read-state.html When I start a streaming query like this: ( ... .writeStream .option("checkpointLocation", f"{CHECKPOIN...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @jcozar, execute the streaming query again to construct the state schema. Ensure that the checkpoint location (dbfs:/tmp/checkpoints/experiment_2_2) is correct and accessible.

  • 0 kudos
1 More Replies
rocky5
by New Contributor III
  • 146 Views
  • 1 reply
  • 0 kudos

Stream static join with aggregation

Hi, I am trying to make a stream-static join with aggregation, with no luck. I have a streaming table where I am getting events with two nested arrays:

ID   Array1   Array2
1    [1,2]    [3,4]

I need to make two joins to static dictionary tables (without an...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @rocky5, you want to perform a stream-static join with aggregation in Databricks SQL, where you have a streaming table with nested arrays and need to join it with static dictionary tables based on IDs contained in those arrays. Here are the ...

  • 0 kudos
Hubert-Dudek
by Esteemed Contributor III
  • 141 Views
  • 1 reply
  • 0 kudos

1 min auto termination

A SQL warehouse can auto-terminate after 1 minute, not 5 as in the UI. Just run a simple CLI command. Of course, with such a low auto-termination setting, you lose the benefit of CACHE, but for some ad-hoc queries, it is the perfect setup when combined with serve...

Latest Reply
Ayushi_Suthar
Honored Contributor
  • 0 kudos

Hi @Hubert-Dudek , Hope you are doing well!  Could you please clarify more on your ask here?  However, from the above details, the SQL warehouse mentioned is auto-terminating after 1 minute of inactivity because the Auto stop is set to 1 minute. Howe...

  • 0 kudos
Noortje
by New Contributor II
  • 441 Views
  • 3 replies
  • 0 kudos

Databricks Looker Studio connector

Hi all! The Databricks Looker Studio connector has now been available for a few weeks. Tested the connector but running into several issues: I am used to working with dynamic queries, so I am able to use date parameters (similar to BigQuery Looker St...

Warehousing & Analytics
BI tool connector
Looker Studio
Latest Reply
Noortje
New Contributor II
  • 0 kudos

Hi @Kaniz Hope you're doing well! I am very curious about the following thing: However, there might be workarounds or alternative approaches to achieve similar functionality. You could explore using Looker’s native features for dynamic filtering or c...

  • 0 kudos
2 More Replies
Laurens
by New Contributor II
  • 716 Views
  • 3 replies
  • 0 kudos

Setting up a snowflake catalog via spark config next to unity catalog

I'm trying to set up a connection to Iceberg on S3 via Snowflake as described in https://medium.com/snowflake/how-to-integrate-databricks-with-snowflake-managed-iceberg-tables-7a8895c2c724 and https://docs.snowflake.com/en/user-guide/tables-iceberg-catal...

Warehousing & Analytics
catalog
config
snowflake
spark
Unity Catalog
Latest Reply
Laurens
New Contributor II
  • 0 kudos

Hi @Kaniz, we've been working on setting up Glue as the catalog, which is working fine so far. However, Glue takes the place of the hive_metastore, which appears to be a legacy way of setting this up. Is the way proposed here the recommended way to set it up...

  • 0 kudos
2 More Replies
jensen22
by Contributor
  • 145 Views
  • 1 reply
  • 0 kudos

NEW CERTIFICATION AI FOR DATABRICKS

I'm just curious whether, in the future, Databricks will offer a certification for AI, GenAI, or any other AI-related fields. I'm very interested and looking forward to it.

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @jensen22, Thank you for posting your concern on Community! To expedite your request, please list your concerns on our ticketing portal. Our support staff would be able to act faster on the resolution (our standard resolution time is 24-48 hours).

  • 0 kudos
Carsten03
by New Contributor III
  • 470 Views
  • 2 replies
  • 0 kudos

Permission Error When Running DELETE FROM

Hi, I want to remove duplicate rows from my managed Delta table in Unity Catalog. I use a query on a SQL warehouse similar to this: WITH cte AS ( SELECT id, ROW_NUMBER() OVER (PARTITION BY id,##,##,## ORDER BY ts) AS row_num FROM catalog.sch...

Latest Reply
Carsten03
New Contributor III
  • 0 kudos

I first tried to use _metadata.row_index to delete the correct rows, but this also resulted in an error. My solution was to use Spark and overwrite the table.
table_name = "catalog.schema.table"
df = spark.read.table(table_name)
count_df = df....
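
The deduplication idea behind this thread (keep one row per key, chosen by timestamp, then write the result back) can be sketched in plain Python. This is not the poster's Spark code, which is truncated above. The function name, column names and sample rows are hypothetical, and the tie-breaking rule (first row seen wins) is an assumption.

```python
def keep_first_per_key(rows, key_cols, order_col):
    """Return one row per key: the row with the smallest order_col value."""
    best = {}
    for row in rows:
        key = tuple(row[c] for c in key_cols)
        # Keep the row seen so far unless this one sorts strictly earlier.
        if key not in best or row[order_col] < best[key][order_col]:
            best[key] = row
    return list(best.values())


# Hypothetical sample data: two rows share id 1, only the earliest should survive.
rows = [
    {"id": 1, "ts": 3, "value": "late"},
    {"id": 1, "ts": 1, "value": "early"},
    {"id": 2, "ts": 2, "value": "only"},
]

deduped = keep_first_per_key(rows, ["id"], "ts")
for row in deduped:
    print(row)
```

This mirrors filtering on row_num = 1 after a ROW_NUMBER() window ordered by ts. When two rows tie on ts, the sketch keeps the first one it encounters, whereas a SQL window makes no such ordering guarantee.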

  • 0 kudos
1 More Replies
Priyam1
by New Contributor III
  • 266 Views
  • 1 reply
  • 0 kudos

Databricks notebook cell intermittently doesn't show the output

Recently, it seems that there has been an intermittent issue where the output of a notebook cell doesn't display, even though the code within the cell executes successfully. For instance, there are times when simply printing a dataframe yields no out...

Latest Reply
Lakshay
Esteemed Contributor
  • 0 kudos

Do you see the output in the stdout log file in such a scenario?

  • 0 kudos