Engage in discussions on data warehousing, analytics, and BI solutions within the Databricks Community. Share insights, tips, and best practices for leveraging data for informed decision-making.
Here's your Data + AI Summit 2024 Warehousing & Analytics recap: use intelligent data warehousing to improve performance and increase your organization’s productivity with analytics, dashboards, and insights.
Keynote: Data Warehouse presente...
Hi, I’m new to data modelling so could use some help. I’m building a personal project using a fairly standard 3NF sales database as the source data. So far I have a pipeline that incrementally extracts data from the source system each day into a Raw sto...
Thank you. From a high level it makes sense. It's the details I'm a bit unclear on. So when a single pipeline runs (let's say today - 10/05/2024) for your architecture it might: Extract records from source system tables that have changed since last ti...
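A minimal sketch of one way such an incremental run can work, assuming the source exposes a last_modified audit column and the Raw layer is a Delta table (all table and column names here are illustrative, not from the thread):

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# High watermark: the newest change timestamp already landed in Raw.
# Falls back to a full load if the Raw table is still empty.
last_load = (
    spark.read.table("raw.sales_orders")
    .agg(F.max("last_modified"))
    .collect()[0][0]
)

source = spark.read.table("source.sales_orders")
incoming = source if last_load is None else source.where(
    F.col("last_modified") > F.lit(last_load)
)

# Append only the changed records into the Raw layer.
incoming.write.mode("append").saveAsTable("raw.sales_orders")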
Hello, I'm wondering if there's a method or workaround to execute JDBC table queries in a similar manner to other cluster types. Currently, attempting to do so results in an error stating that only text-based files (such as JSON, Parquet, Delta, etc.)...
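For reference, this is what the generic Spark JDBC reader looks like on a standard cluster; the host, database, table, and credentials below are placeholders:

# Generic JDBC read via the Spark "jdbc" data source.
df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:postgresql://<host>:5432/<database>")
    .option("dbtable", "public.orders")
    .option("user", "<user>")
    .option("password", "<password>")
    .load()
)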
We have created a Unity Catalog instance on top of our Lakehouse (built entirely with Azure Databricks). We are using Power BI to develop and serve our analytics and reporting needs. I've granted the "Account Users" group the appropriate privileges f...
Thanks for explaining this! This doesn't do exactly what I was hoping—it doesn't block all access to the workspace. Users can still login and access their own workspace and run SQL queries, explore the catalog, etc. But they ARE blocked from accessin...
In relational data warehouse systems it was best practice to represent date values as YYYYMMDD integer values in tables. Date comparison could be done easily without using date functions and with low performance impact. Is this still the recomme...
Hi @DataFarmer, in Databricks I would advise you to use the date type instead of int; this will make your life much simpler when working with date data.
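A quick illustration of why the native DATE type keeps queries simple (the table and order_date column are hypothetical):

from pyspark.sql import functions as F

df = spark.read.table("sales")  # assumes an order_date column of type DATE

# Comparisons read naturally and still benefit from data skipping,
# with no YYYYMMDD integer round-tripping needed.
recent = df.where(F.col("order_date") >= F.lit("2024-01-01").cast("date"))
last_30_days = df.where(F.col("order_date") >= F.date_sub(F.current_date(), 30))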
When trying to connect to a SQL warehouse using the JDBC connector with Spark, the error below is thrown. Note that connecting directly to a cluster with similar connection parameters works without issue; the error only occurs with SQL warehouses. py4j...
Same error here. I am trying to save a Spark dataframe to Delta Lake using the JDBC driver and PySpark with this code:
# Spark session
from pyspark.sql import SparkSession

spark_session = SparkSession.builder \
    .appName("RCT-API") \
    .config("spark.metrics.namespace", "rct-a...
In order to create a CI/CD pipeline to deliver dashboards (here, monitoring dashboards), how can you export/import a dashboard created in Databricks SQL from one workspace to another? Thanks
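One possible approach, sketched with the Databricks Python SDK under the assumption that the dashboard is a Lakeview dashboard stored as a workspace file; the path and profile names are hypothetical:

from databricks.sdk import WorkspaceClient
from databricks.sdk.service.workspace import ExportFormat, ImportFormat

src = WorkspaceClient(profile="dev")   # source workspace
dst = WorkspaceClient(profile="prod")  # target workspace

# Export returns base64-encoded content, which import_ accepts as-is.
exported = src.workspace.export("/Shared/monitoring.lvdash.json",
                                format=ExportFormat.AUTO)

dst.workspace.import_(
    "/Shared/monitoring.lvdash.json",
    content=exported.content,
    format=ImportFormat.AUTO,
    overwrite=True,
)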
Hi, is there a command you could use to list all compute resources configured in your workspace (active and non-active)? This would be really helpful for anyone managing the platform to pull all the metadata (tags, etc.) and quickly evaluate all the configura...
@Kaizen You've got three ways of doing this:
- Using the REST API (https://docs.databricks.com/api/workspace/clusters/list)
- Using the CLI (https://github.com/databricks/cli/blob/main/docs/commands.md#databricks-clusters-list---list-all-clusters)
- Using Pyth...
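The Python SDK route, for instance, is only a few lines (authentication is picked up from the environment or ~/.databrickscfg):

from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

# clusters.list() returns all clusters in the workspace, including
# terminated ones, along with their metadata.
for c in w.clusters.list():
    print(c.cluster_id, c.cluster_name, c.state, c.custom_tags)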
Is there any business use case where profile_metrics and drift_metrics are used by Databricks customers? If so, kindly provide a scenario for leveraging this feature, e.g. data lineage or table metadata updates.
Hey @pankaj2264, both the profile metrics and drift metrics tables are created and used by Lakehouse Monitoring to assess the performance of your model and data over time, or relative to a baseline table. You can find all the relevant information here: Intro...
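As a starting point, the metric tables can be queried like any other Delta tables; the three-level names below follow the <table>_profile_metrics / <table>_drift_metrics naming convention and are illustrative:

# Inspect the metrics Lakehouse Monitoring has written for a monitored table.
profile = spark.read.table("main.sales.orders_profile_metrics")
drift = spark.read.table("main.sales.orders_drift_metrics")

profile.printSchema()  # see which metric columns are available
drift.printSchema()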
I wrote some simple code:
from pyspark.sql import SparkSession
from pyspark.sql.window import Window
from pyspark.sql.functions import row_number, max
import pyspark.sql.functions as F

streaming_data = spark.read.table("x")
window = Window.partitionBy("BK...
Hi, in my opinion the result is correct. What needs to be noted is that the result is sorted by the "Onboarding_External_LakehouseId" column, so if there are rows with the same "BK_AccountApplicationId" code, they will be partitioned into two row numbers. Just...
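For the common latest-record-per-key variant of this pattern, a sketch (the ordering column is illustrative): order by a change timestamp inside each business-key partition so that row_number() = 1 is always the newest version:

from pyspark.sql import functions as F
from pyspark.sql.window import Window

w = Window.partitionBy("BK_AccountApplicationId").orderBy(F.col("updated_at").desc())

latest = (
    spark.read.table("x")
    .withColumn("rn", F.row_number().over(w))
    .where("rn = 1")
    .drop("rn")
)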
Hi! I receive three streams from a Postgres CDC. These 3 tables, invoices, users, and products, need to be joined. I want to use a left join with respect to the invoices stream. In order to compute correct results and release old state, I use watermarks a...
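A minimal sketch of the invoices-to-users leg, with illustrative table and timestamp names: both streams carry a watermark, and the join condition bounds event time so Spark can expire old state:

from pyspark.sql import functions as F

invoices = (spark.readStream.table("cdc.invoices")
            .withWatermark("invoice_ts", "1 hour")
            .alias("i"))
users = (spark.readStream.table("cdc.users")
         .withWatermark("user_ts", "2 hours")
         .alias("u"))

# Left outer stream-stream join; the time-interval condition is what lets
# Spark drop state older than the watermark.
joined = invoices.join(
    users,
    F.expr("i.user_id = u.user_id AND "
           "u.user_ts BETWEEN i.invoice_ts - INTERVAL 2 HOURS AND i.invoice_ts"),
    "leftOuter",
)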
Hi! I am exploring the read state functionality in Spark Structured Streaming: https://docs.databricks.com/en/structured-streaming/read-state.html When I start a streaming query like this:
(
    ...
    .writeStream
    .option("checkpointLocation", f"{CHECKPOIN...
Hi, I am trying to make a stream-static join with aggregation, with no luck. I have a streaming table where I am getting events with two nested arrays:
ID  Array1  Array2
1   [1,2]   [3,4]
I need to make two joins to static dictionary tables (without an...
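One way to structure this, as a hedged sketch with illustrative names (dim1 keyed by k1, dim2 keyed by k2): explode each array, join to the static dictionaries, then aggregate back per event. Note that a streaming aggregation also needs a watermark for append output:

from pyspark.sql import functions as F

events = spark.readStream.table("events")
dim1 = spark.read.table("dim1")  # static lookup with columns (k1, name1)
dim2 = spark.read.table("dim2")  # static lookup with columns (k2, name2)

resolved = (
    events
    .withColumn("k1", F.explode("Array1"))
    .join(dim1, "k1")
    .withColumn("k2", F.explode("Array2"))
    .join(dim2, "k2")
    .groupBy("ID")
    .agg(F.collect_set("name1").alias("names1"),
         F.collect_set("name2").alias("names2"))
)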
A SQL warehouse can auto-terminate after 1 minute, not the 5 shown in the UI; just run a simple CLI command. Of course, with such a low auto-termination you lose the benefit of CACHE, but for some ad-hoc queries it is the perfect setup when combined with serve...
Hi @Hubert-Dudek, hope you are doing well!
Could you please clarify your ask here?
However, from the above details, the SQL warehouse mentioned is auto-terminating after 1 minute of inactivity because Auto Stop is set to 1 minute. Howe...
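The thread used the CLI; the equivalent with the Python SDK looks roughly like this (the warehouse id is a placeholder):

from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

# Per the thread, the UI minimum is 5 minutes, but the API accepts 1.
wh = w.warehouses.get("1234567890abcdef")
w.warehouses.edit(
    id=wh.id,
    name=wh.name,
    cluster_size=wh.cluster_size,
    auto_stop_mins=1,
)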
Hi all! The Databricks Looker Studio connector has now been available for a few weeks. I tested the connector but am running into several issues: I am used to working with dynamic queries, so I am able to use date parameters (similar to the BigQuery Looker St...
Hi @Retired_mod, hope you're doing well! I am very curious about the following: However, there might be workarounds or alternative approaches to achieve similar functionality. You could explore using Looker’s native features for dynamic filterin...