Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

GaneshI
by New Contributor
  • 132 Views
  • 1 reply
  • 0 kudos

What is the recommended approach to enforce row-level security in Unity Catalog for external BI tool

We connect Tableau and Power BI to our Databricks SQL warehouse via OAuth tokens. Do Unity Catalog row filters apply at the SQL layer regardless of the BI tool, or do we need additional enforcement at the warehouse level?

Latest Reply
Lu_Wang_ENB_DBX
Databricks Employee
  • 0 kudos

Unity Catalog row filters apply at the SQL/query layer, so if Tableau or Power BI is querying a Databricks SQL warehouse, the filters are enforced there — you do not need a separate warehouse-level row-filter feature. Row filters and column masks are...

DazzaiDe
by New Contributor III
  • 165 Views
  • 2 replies
  • 1 kudos

Best Practices: 1 job per 1 target table

We’re currently designing our Medallion Architecture pipelines using Lakeflow Jobs, and I wanted to get some opinions on orchestration best practices. Right now, our approach is essentially 1 job per target table (for example, each Bronze/Silver/Gold ...

Latest Reply
LBoydston
New Contributor II
  • 1 kudos

We typically organize our workloads with one job per catalog, and then use one or more pipelines to load tables into the appropriate schemas. As our data engineers ingest raw data, this structure is primarily applied in the Silver and Gold layers of ...

1 More Replies
Garybary
by New Contributor III
  • 1674 Views
  • 3 replies
  • 2 kudos

Resolved! Scheduling jobs with table update triggers

Hi all, lately I've been experimenting with the newish feature of scheduling jobs on a table update trigger. However, one thing is blocking me from implementing it, and I was hoping someone had found a solution. We occasionally perform a vac...

Latest Reply
SteveOstrowski
Databricks Employee
  • 2 kudos

Hi @Garybary, Quick clarification on how table update triggers actually behave, because this changes the answer significantly. Table update triggers fire on data-changing operations only (writes, merges, updates, deletes). A standalone VACUUM does NO...

2 More Replies
TalessRocha
by New Contributor II
  • 6163 Views
  • 11 replies
  • 8 kudos

Resolved! Connect to Azure Data Lake Storage using Databricks Free Edition

Hello guys, I'm using Databricks Free Edition (serverless) and I am trying to connect to Azure Data Lake Storage. The problem I'm having is that in the Free Edition we can't configure the cluster, so I tried to make the connection via notebook using ...

Latest Reply
pjvi
New Contributor II
  • 8 kudos

If you want to read from your Azure storage account using Databricks Free Edition, you can add a specific option when reading: spark.read.option("fs.azure.account.key.<storage-account-name>.dfs.core.windows.net", "your_storage_account...

10 More Replies
maikel
by Contributor II
  • 516 Views
  • 4 replies
  • 1 kudos

Resolved! Uploading a file to a volume and starting an ingestion job

Hello Community! I am writing to share my idea for a data ingestion job that we have to implement in our project. Our data is in CSV format, and it differs a little depending on the case. Before uploading, we are pivoting the CSV file...

Latest Reply
maikel
Contributor II
  • 1 kudos

Yeah, understood. Thank you very much once again! 

3 More Replies
Danish11052000
by Contributor
  • 1265 Views
  • 7 replies
  • 1 kudos

Resolved! How should I correctly extract the full table name from request_params in audit logs?

I’m trying to build a UC usage/refresh tracking table for every workspace. For each workspace, I want to know how many times a UC table was refreshed or accessed each month. To do this, I’m reading the Databricks audit logs and I need to extract only ...

Latest Reply
SteveOstrowski
Databricks Employee
  • 1 kudos

Hi @Danish11052000, You are on the right track with the COALESCE approach. The reason for the inconsistency is that different Unity Catalog action types populate different keys in request_params. Here is a breakdown of the key fields and which action...
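
The COALESCE idea from this reply can be sketched in plain Python as "take the first candidate key that is populated". This is only an illustration: the key names in `CANDIDATE_KEYS` are assumptions chosen for the example, not the breakdown from the (truncated) reply, and `extract_table_name` is a hypothetical helper.

```python
from typing import Mapping, Optional, Sequence

# Assumed candidate keys, listed in priority order. The real set depends on
# which action types you track; adjust it to match your own audit logs.
CANDIDATE_KEYS: Sequence[str] = ("full_name_arg", "table_full_name", "name")


def extract_table_name(
    request_params: Mapping[str, Optional[str]],
    keys: Sequence[str] = CANDIDATE_KEYS,
) -> Optional[str]:
    """Return the first populated value among the candidate keys.

    Mirrors SQL COALESCE, except that empty strings are also skipped
    (SQL COALESCE only skips NULLs).
    """
    for key in keys:
        value = request_params.get(key)
        if value:  # skips missing keys, None, and ""
            return value
    return None
```

Keeping the key list in one place makes it easy to extend when a new action type turns out to use a different field.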

6 More Replies
mnissen1337
by New Contributor II
  • 170 Views
  • 1 reply
  • 0 kudos

Resolved! Managing Unity Catalog Permissions for Databricks Apps via DABs

I’m currently developing a Databricks App, and the app’s service principal needs access to Unity Catalog tables. From what I can tell, it doesn’t seem possible to grant Unity Catalog permissions through DABs yet — only through the UI, based on the cu...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @mnissen1337, there is a way to do this in DABs. Look at the following section in the documentation: Manage Databricks apps using Declarative Automation Bundles | Databricks on AWS. If my answer was helpful, please consider marking it as accepted solutio...

sminamioka
by New Contributor III
  • 337 Views
  • 5 replies
  • 1 kudos

Compute tab doesn't show and doesn't give the option to create a cluster

I've just created an Azure Databricks workspace (Premium tier). When I click Compute to create a cluster, the UI automatically opens the SQL Warehouse menu instead; I'm not sure if it's a glitch, as shown below. Someone said "Ask the admin to ...

[Screenshot: sminamioka_0-1778276402869.png]
Labels: Data Engineering, cluster, clusters
Latest Reply
gcj0310
Databricks Partner
  • 1 kudos

Hi @sminamioka, this does not look like a UI glitch. In newer Azure Databricks workspaces, access to classic compute / clusters depends on workspace entitlements and compute policy permissions. If clicking Compute takes you directly to SQL Warehouses, ...

4 More Replies
Guillermo-HR
by New Contributor
  • 120 Views
  • 1 reply
  • 0 kudos

Streaming read and writing with aggregation

Hi, I have the following problem: in a medallion architecture, I receive files on a bronze volume every month containing each sensor's readings for the period from the 1st of the month at 00:00 to the last day at 23:00. I have a manual job that calls the Python files ...

Latest Reply
Saritha_S
Databricks Employee
  • 0 kudos

Hi @Guillermo-HR, yes, batch is usually the right fix here. What’s happening is that your query is using event-time window aggregation in Structured Streaming with append output mode. In that mode, Spark only emits a window after it is sure the wind...
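
To make the append-mode behaviour concrete, here is a small pure-Python simulation of the general watermark rule. It is a simplified model and not Spark's implementation: each event is treated as its own micro-batch, the watermark is taken as the maximum event time seen minus an allowed delay, and a window is emitted once the watermark reaches its end. The function name, the daily window and the one-hour delay are assumptions made for the example.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def simulate_append_mode(event_times, window, delay):
    """Return (emitted_window_starts, still_open_window_starts).

    Simplified model: watermark = max event time seen so far - delay;
    a window [start, start + window) is emitted once watermark >= its end.
    """
    max_event_time = None
    open_windows = set()
    emitted = []
    for ts in event_times:
        max_event_time = ts if max_event_time is None else max(max_event_time, ts)
        watermark = max_event_time - delay
        # Tumbling window that contains this event.
        start = EPOCH + ((ts - EPOCH) // window) * window
        if start + window > watermark:
            open_windows.add(start)  # otherwise the event is too late and is dropped
        # Close every window whose end the watermark has now passed.
        for w in sorted(open_windows):
            if w + window <= watermark:
                emitted.append(w)
                open_windows.remove(w)
    return emitted, sorted(open_windows)


readings = [
    datetime(2024, 1, 1, 10),
    datetime(2024, 1, 1, 23),
    datetime(2024, 1, 2, 0, 30),
    datetime(2024, 1, 2, 23),
]
emitted, pending = simulate_append_mode(
    readings, window=timedelta(days=1), delay=timedelta(hours=1)
)
```

With these inputs, only the 1 January window is emitted; the 2 January window stays pending because no later event arrives to push the watermark past its end. That is the same effect as the last window of a monthly file never appearing in the output until the next file lands.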

Radeesh
by New Contributor
  • 154 Views
  • 2 replies
  • 0 kudos

Unable to download data ingestion with Lakeflow notebook

I have registered for the Data Engineer Learning Plan, but I am unable to set up the lab shown in the video. Additionally, I cannot find where to download the notebook ZIP file. Could you please help me with this?

Latest Reply
Ashwin_DSA
Databricks Employee
  • 0 kudos

Hi @Radeesh, Can you clarify which particular module you are referring to? Unfortunately, notebooks are not available for download in the current self-paced course. The narration is inherited from an earlier/instructor-led version of the material whe...

1 More Replies
theanhdo
by New Contributor III
  • 4532 Views
  • 5 replies
  • 1 kudos

Run continuous job for a period of time

Hi there, I have a job where the Trigger type is configured as Continuous. I want to run the Continuous job only for a period of time each day, e.g. 8AM - 5PM. I understand that we can achieve it by manually starting and cancelling the job on the UI, o...

Latest Reply
KrisJohannesen
Contributor
  • 1 kudos

The "not-so-pretty-but-it-works" solution I have come across is exactly what you are hinting at yourself.
  • Create the Continuous job - have it be paused
  • Create a secondary "start job"-job - which is basically just that API call in a notebook or python f...
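
As a rough sketch of what such a scheduled helper job might do, the code below decides whether the continuous job should be running at a given time of day and builds the request body for a Jobs API update. It assumes the API call in question is `POST /api/2.1/jobs/update` setting `continuous.pause_status`; the reply is truncated, so this may not match what the author actually uses. The function names and the 8AM - 5PM default window are illustrative, and no request is sent.

```python
from datetime import time

VALID_STATUSES = ("PAUSED", "UNPAUSED")


def desired_pause_status(now: time, start: time = time(8, 0), end: time = time(17, 0)) -> str:
    """Return "UNPAUSED" inside the [start, end) window, otherwise "PAUSED"."""
    return "UNPAUSED" if start <= now < end else "PAUSED"


def build_update_payload(job_id: int, pause_status: str) -> dict:
    """Build the JSON body for a jobs/update call (assumed endpoint, see lead-in)."""
    if pause_status not in VALID_STATUSES:
        raise ValueError(f"pause_status must be one of {VALID_STATUSES}")
    return {
        "job_id": job_id,
        "new_settings": {"continuous": {"pause_status": pause_status}},
    }
```

One scheduled run at the start of the window and one at the end, each sending the payload for the current time, would be enough to implement the start/stop pattern described above.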

4 More Replies
Areqio
by New Contributor II
  • 226 Views
  • 2 replies
  • 1 kudos

Trying to send data from a stream table to an Azure Event Hub in a serverless cluster

Is there a way to stream data from Databricks to Azure Event Hubs in a serverless pipeline environment without using the azure-eventhub library, since it isn’t compatible with serverless pipelines, and instead rely solely on the Kafka-compatible inte...

Latest Reply
amirabedhiafi
New Contributor III
  • 1 kudos

Hello @Areqio! Yes, you can use Azure Event Hubs through its Kafka-compatible endpoint instead of the azure-eventhubs-spark / azure-eventhub connector. JVM libraries are not allowed in LSDP, and Event Hubs should be accessed through the built-in Spark Ka...
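
For illustration, the Kafka-style options for an Event Hubs namespace might be assembled as below. This is a hedged sketch, not the configuration from the reply: `eventhubs_kafka_options` is a hypothetical helper, the `kafkashaded.` class prefix is an assumption about the Databricks runtime's shaded Kafka client, and in practice the connection string should come from a secret rather than a literal.

```python
def eventhubs_kafka_options(namespace: str, event_hub: str, connection_string: str) -> dict:
    """Build Kafka sink options that point at an Event Hubs Kafka endpoint."""
    # Event Hubs expects the literal user name "$ConnectionString" and the
    # namespace connection string as the password (SASL PLAIN over TLS).
    jaas_config = (
        # Assumption: shaded class name used on Databricks runtimes.
        "kafkashaded.org.apache.kafka.common.security.plain.PlainLoginModule required "
        f'username="$ConnectionString" password="{connection_string}";'
    )
    return {
        "kafka.bootstrap.servers": f"{namespace}.servicebus.windows.net:9093",
        "kafka.security.protocol": "SASL_SSL",
        "kafka.sasl.mechanism": "PLAIN",
        "kafka.sasl.jaas.config": jaas_config,
        "topic": event_hub,  # the event hub name plays the role of the Kafka topic
    }
```

The resulting dictionary could then be passed to a streaming write with `format("kafka")` and `.options(**opts)`, with the payload serialized into a `value` column; the exact wiring inside a serverless pipeline is left out here because the original reply is truncated.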

1 More Replies
HTD360
by New Contributor III
  • 278 Views
  • 3 replies
  • 4 kudos

Autoscaling with Auto Loader without SDP

Hi there, I have a question regarding Auto Loader without SDP and autoscaling of clusters. I'm reading the following in the docs (Production considerations for Structured Streaming | Databricks on AWS): Do not enable autoscaling for compute for Struc...

Latest Reply
HTD360
New Contributor III
  • 4 kudos

Hi, thank you for your answer. Could you elaborate a bit on this? "for non SDP available now auto loader jobs autoscaling can be reasonable" How do you decide on whether it is reasonable or not? Especially you said it is not recommended to enable compute...

2 More Replies
Abhishek_sinha
by New Contributor II
  • 228 Views
  • 2 replies
  • 3 kudos

Connecting DBeaver to Databricks Lakebase — Setup & Troubleshooting

I recently connected DBeaver to Databricks Lakebase and wanted to share the setup steps along with a couple of troubleshooting issues I encountered. Since Lakebase is PostgreSQL-compatible, the standard PostgreSQL driver works directly without requiri...

Latest Reply
amirabedhiafi
New Contributor III
  • 3 kudos

Hello @Abhishek_sinha! Thanks for sharing this, very useful. A few things I can add (from my personal experience): it is better to use the PostgreSQL driver and not the DBKS JDBC driver, because Lakebase is PostgreSQL-compatible, so DBeaver should be configur...

1 More Replies