Get Started Discussions
Start your journey with Databricks by joining discussions on getting started guides, tutorials, and introductory topics. Connect with beginners and experts alike to kickstart your Databricks experience.

Forum Posts

viniciuscini
by New Contributor
  • 5463 Views
  • 2 replies
  • 0 kudos

Improve query performance of direct query with Databricks

I’m building a dashboard in Power BI’s Pro Workspace, connecting data via Direct Query from Databricks (around 60 million rows from 15 combined tables), using a serverless SQL warehouse (small size and 4 clusters). The problem is that the dashboard is taking to...

Latest Reply
ArekKemp
New Contributor II
  • 0 kudos

@viniciuscini have you managed to get it working well for you?

1 More Replies
Rezakorehi
by New Contributor II
  • 813 Views
  • 7 replies
  • 15 kudos

Unity catalogues - What would you do

If you were creating Unity Catalogs again, what would you do differently based on your past experience?

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 15 kudos

@nayan_wylde no, don't do that, hehe. It was an example of an extreme approach. Usually, use catalogs to separate environments and, in enterprises, to separate divisions such as customer tower, marketing tower, finance tower, etc.

6 More Replies
YuriS
by New Contributor II
  • 562 Views
  • 3 replies
  • 2 kudos

Resolved! How to reduce data loss for Delta Lake on Azure when failing from primary to secondary regions?

Let’s say we have a big data application where data loss is not an option. With GZRS (geo-zone-redundant storage) redundancy, we would achieve zero data loss if the primary region is alive – the writer waits for acks from two or more Azure availability zo...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 2 kudos

Databricks is working on improvements and new functionality related to that. For now, the only solution is a DEEP CLONE. You can run it more frequently or implement your own replication based on a change data feed. You could use delta sharing for tha...

2 More Replies
VamsiDatabricks
by New Contributor II
  • 332 Views
  • 2 replies
  • 1 kudos

Delta comparison architecture using flatMapGroupsWithState in Structured Streaming

I am designing a Structured Streaming job in Azure Databricks (using Scala) that will consume messages from two event hubs; let's call them source and target. I would like your feedback on the flow below, whether it will survive the production load and ...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 1 kudos

It is hard to understand what the source is and what the target is. Some charts would be useful, as would information on how long the state is kept. My usual solution is: - Use Lakeflow Declarative Pipelines if possible (DLT) - if not, consider handlin...

1 More Replies
Mailendiran
by New Contributor III
  • 804 Views
  • 6 replies
  • 4 kudos

Resolved! Databricks partner Tech Summit FY26 access

I'm trying to access the recordings of the Partner Tech Summit FY26, which happened a month ago. It says the lobby is closed. Is there any other way I can access the recordings? I have yet to watch the day 2 sessions.

Latest Reply
Mailendiran
New Contributor III
  • 4 kudos

Hi @saurabh18cs, check the link shared by @Advika. Make sure you are logged in using your partner account. Link - https://partner-academy.databricks.com/learn/catalog/view/168SS:

5 More Replies
egor
by New Contributor II
  • 535 Views
  • 4 replies
  • 5 kudos

Resolved! serialized_dashboard

I have a dashboard.json file, for example: {select * from ${{var.table_name}}}. I have a job.yml; can the serialized_dashboard section go there, because my job runs in parallel with the dashboard? Can I use variables in databricks.yml if I define the table_variable variab...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 5 kudos

I currently use the parameter inside IDENTIFIER(:schema || 'my_table') and the 'bundle scripts' feature to perform substitutions, but I hope for better support soon.

3 More Replies
BenBricks
by New Contributor III
  • 446 Views
  • 4 replies
  • 5 kudos

Resolved! Need help understanding Databricks

Hi, I come from a traditional ETL background and am having trouble understanding some of the cloud hyperscaler features and use cases. I understand Databricks is hosted on cloud providers. I see the cloud providers have their own tools for ETL, ML/A...

Latest Reply
BenBricks
New Contributor III
  • 5 kudos

Thanks a lot, Gema, for the detailed and meticulous answers. I guess I have to unlearn and relearn everything starting today.

3 More Replies
rcostanza
by New Contributor III
  • 616 Views
  • 5 replies
  • 3 kudos

Resolved! Stateless streaming with aggregations on a DLT/Lakeflow pipeline

In a DLT pipeline I have a bronze table that ingests files using Auto Loader, and a derived silver table that, for this example, just stores the number of rows for each file ingested into bronze. The basic code example: import dlt from pyspark.sql impo...

Latest Reply
mark_ott
Databricks Employee
  • 3 kudos

For scenarios in Databricks where lower latency is needed for Silver tables but continuous streaming pipelines are not feasible, using jobs or notebooks with foreachBatch running in Structured Streaming mode is a common and recommended approach. This...

4 More Replies
masterelaichi
by New Contributor II
  • 471 Views
  • 4 replies
  • 0 kudos

Data analyst learning plan lab files

Hi all, I am very new to Databricks and to this community. I recently signed up for the data analyst learning plan and the data engineering one. The learning platform page seems like a confusing maze to navigate! In the course material for the data analy...

Latest Reply
masterelaichi
New Contributor II
  • 0 kudos

Hi, I managed to find the lab. It wasn't straightforward at all. It was part of another link and not in the learning path I had signed up for. The lab series I am trying to work on is this: https://partner-academy.databricks.com/learn/courses/3701/aibi-for-da...

3 More Replies
jimoskar
by New Contributor III
  • 481 Views
  • 6 replies
  • 6 kudos

Resolved! Cluster cannot find init script stored in Volume

I have created an init script stored in a Volume which I want to execute on a cluster with runtime 16.4 LTS. The cluster has policy = Unrestricted and Access mode = Standard. I have additionally added the init script to the allowlist. This should be ...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 6 kudos

Hi @jimoskar, since you're using standard access mode, you need to add the init script to the allowlist. Did you add your init script to the allowlist? If not, do the following: In your Databricks workspace, click Catalog. Click the gear icon. Click the metastore ...

5 More Replies
cbhoga
by New Contributor II
  • 254 Views
  • 2 replies
  • 3 kudos

Resolved! Delta sharing with Celonis

Is there any way, or are there any plans, for Databricks to use Delta Sharing to provide data access to Celonis?

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 3 kudos

Hi @cbhoga, Delta Sharing is an open protocol for secure data sharing. Databricks already supports it natively, so you can publish data using Delta Sharing. However, whether Celonis can directly consume that shared data depends on whether Celonis sup...

1 More Replies
ChristianRRL
by Valued Contributor III
  • 374 Views
  • 3 replies
  • 4 kudos

Performance Comparison: spark.read vs. Autoloader

Hi there, I would appreciate some help to compare the runtime performance of two approaches to performing ELT in Databricks: spark.read vs. Autoloader. We already have a process in place to extract highly nested json data into a landing path, and fro...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 4 kudos

Hi @ChristianRRL, for that kind of ingestion scenario, Auto Loader is the winner. It will scale much better than the batch approach, especially if we are talking about a large number of files. If you configure Auto Loader with file notification mode it can sca...

2 More Replies
ChristianRRL
by Valued Contributor III
  • 305 Views
  • 1 reply
  • 2 kudos

Resolved! AutoLoader Ingestion Best Practice

Hi there, I would appreciate some input on AutoLoader best practice. I've read that some people recommend that the latest data should be loaded in its rawest form into a raw delta table (i.e. highly nested json-like schema) and from that data the app...

Latest Reply
BS_THE_ANALYST
Esteemed Contributor III
  • 2 kudos

I think the key thing with holding the raw data in a table, and not transforming that table, is that you have more flexibility at your disposal. There's a great resource available via Databricks Docs for best practices in the Lakehouse. I'd highly re...

ChristianRRL
by Valued Contributor III
  • 580 Views
  • 2 replies
  • 4 kudos

Resolved! What is `read_files`?

Bit of a silly question, but I'm wondering if someone can help me better understand what `read_files` is. read_files table-valued function | Databricks on AWS. There are at least 3 ways to pull raw JSON data into a Spark DataFrame: df = spark.read... df = spark...

Latest Reply
BS_THE_ANALYST
Esteemed Contributor III
  • 4 kudos

Also, @ChristianRRL, with a slight adjustment to the syntax, it does indeed behave like Auto Loader: https://docs.databricks.com/aws/en/ingestion/cloud-object-storage/auto-loader/patterns?language=SQL I'd also advise looking at the different options th...

1 More Replies
