Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

pdiamond
by Contributor
  • 75 Views
  • 1 reply
  • 0 kudos

Lakebase error logs

Anyone know where to see any logs related to Lakebase/Postgres? I have a Tableau Prep flow that is failing but the error is not clear and I'm trying to find out what the database is capturing.

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @pdiamond, you can try to use the Lakebase monitoring tools to capture the query generated by Tableau Prep: Monitor | Databricks on AWS. Alternatively, it seems that you can also use external monitoring tools, so you can connect to your Lakebase instance usi...

dvd_lg_bricks
by New Contributor
  • 300 Views
  • 10 replies
  • 3 kudos

Questions About Workers and Executors Configuration in Databricks

Hi everyone, sorry, I’m new here. I’m considering migrating to Databricks, but I need to clarify a few things first. When I define and launch an application, I see that I can specify the number of workers, and then later configure the number of execut...
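One common rule of thumb (an assumption here, not something stated in this thread): on Databricks, each worker node typically runs a single Spark executor, so the worker count largely determines the executor count. A minimal sketch of how cluster parallelism falls out of that assumption, with a hypothetical function name:

```python
# Rough sizing sketch. Assumption: one executor per worker node, so the
# cluster's upper bound on simultaneous tasks is workers x cores-per-worker.
def max_parallel_tasks(num_workers: int, cores_per_worker: int) -> int:
    """Upper bound on simultaneously running tasks for a classic cluster."""
    executors = num_workers  # one executor per worker node (assumed)
    return executors * cores_per_worker

print(max_parallel_tasks(4, 8))  # 32
```

This is only a back-of-the-envelope model; actual throughput also depends on partitioning, memory pressure, and shuffle behaviour.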

Latest Reply
Abeshek
New Contributor
  • 3 kudos

Regarding your Databricks question about workers versus executors: many teams encounter the same sizing and configuration issues when evaluating a migration. At Kanerika, we help companies plan cluster architecture, optimize Spark workloads, and avoid overspen...

9 More Replies
michal1228
by New Contributor
  • 153 Views
  • 4 replies
  • 0 kudos

Import Python Modules with Git Folder Error

Dear Databricks Community, we encountered a bug in the behaviour of the import method explained in the documentation: https://learn.microsoft.com/en-us/azure/databricks/files/workspace-modules#autoreload-for-python-modules. A couple of months ago we migrated our pipelin...
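The autoreload mechanism the documentation describes can be sketched in plain Python: re-importing a module after its source file changes, instead of restarting the interpreter. This is an analogy, not Databricks' implementation, and `workspace_utils` is a hypothetical module name:

```python
# Plain-Python sketch of what %autoreload automates: picking up changes to
# a module's source without restarting the session.
import importlib
import pathlib
import sys
import tempfile

sys.dont_write_bytecode = True      # avoid stale .pyc caching in this demo
tmp = tempfile.mkdtemp()
sys.path.insert(0, tmp)             # like putting a Git folder on sys.path

module_file = pathlib.Path(tmp, "workspace_utils.py")
module_file.write_text("VALUE = 1\n")

import workspace_utils              # first import caches the module object
module_file.write_text("VALUE = 2\n")  # the source changes on disk

importlib.reload(workspace_utils)   # what autoreload does behind the scenes
print(workspace_utils.VALUE)
```

Without the explicit `reload` (or autoreload), the cached module would keep serving the old definitions, which is the kind of surprise that often shows up after a Git-folder migration.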

Latest Reply
michal1228
New Contributor
  • 0 kudos

We're using DBR version 16.4

3 More Replies
Fatimah-Tariq
by New Contributor III
  • 196 Views
  • 7 replies
  • 4 kudos

Resolved! Writing to Foreign catalog

I have a running notebook job where I am doing some processing and writing the tables to a foreign catalog. It has been running successfully for about a year. The job is scheduled and runs on a job cluster with DBR 16.2. Recently, I had to add new noteb...

Latest Reply
Fatimah-Tariq
New Contributor III
  • 4 kudos

Thank you @Louis_Frolio! Your suggestions really helped me understand the scenario.

6 More Replies
skuvisk
by New Contributor
  • 104 Views
  • 2 replies
  • 1 kudos

CLS function with lookup fails on dates

Hello, I'm conducting research on utilizing CLS in a project. We are implementing a lookup table to determine what tags a user can see. The CLS function looks like this: CREATE OR REPLACE FUNCTION {catalog}.{schema}.mask_column(value VARIANT, tag STRIN...

Latest Reply
skuvisk
New Contributor
  • 1 kudos

Thank you for an insightful answer @Poorva21. I conclude from your reasoning that this is the result of an optimization/engine error. It seems like I will need to resort to a workaround for the date columns then...

1 More Replies
Jarno
by New Contributor
  • 143 Views
  • 4 replies
  • 0 kudos

Dangerous implicit type conversions on 17.3 LTS.

Starting with DBR 17 running Spark 4.0, spark.sql.ansi.enabled is set to true by default. With the flag enabled, strings are implicitly converted to numbers in a very dangerous manner. Consider: SELECT 123='123'; SELECT 123='123X'; The first one is succe...

Latest Reply
Jarno
New Contributor
  • 0 kudos

FYI, it seems I was mistaken about the behaviour of '::' on Spark 4.0.1. It does indeed work like CAST on both DBR 17.3 and Spark 4.0.1 and raises an exception on '123X'::int. The '?::' operator seems to be a Databricks only extension at the moment (...
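The fail-fast casting behaviour discussed in this thread can be mimicked in plain Python, where `int()` likewise rejects malformed numeric strings. This is only an analogy for ANSI-mode CAST, not Spark itself, and the function name is hypothetical:

```python
def ansi_like_cast_to_int(s: str) -> int:
    """Mimic ANSI-mode CAST(s AS INT): succeed on '123', fail loudly on '123X'."""
    try:
        return int(s)
    except ValueError:
        raise ValueError(f"[CAST_INVALID_INPUT] '{s}' cannot be cast to INT") from None

print(ansi_like_cast_to_int("123"))   # 123
# ansi_like_cast_to_int("123X")       # raises, like '123X'::int under ANSI mode
```

With `spark.sql.ansi.enabled` set to false, the legacy behaviour would instead return NULL for the malformed string, which is the silent-failure mode the original post warns about.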

3 More Replies
prashant151
by New Contributor II
  • 164 Views
  • 2 replies
  • 2 kudos

Resolved! Using Init Script to execute Python notebook at all-purpose cluster level

Hi, we have setup.py in my Databricks workspace. This script is executed in other transformation scripts using %run /Workspace/Common/setup.py, which consumes a lot of time. This setup.py internally calls other utility notebooks using %run: %run /Workspace/Co...

Latest Reply
iyashk-DB
Databricks Employee
  • 2 kudos

You can’t “%run a notebook” from a cluster init script—init scripts are shell-only and meant for environment setup (install libs, set env vars), not for executing notebooks or sharing Python state across sessions. +1 to what @Raman_Unifeye has told. ...
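To make the distinction concrete, a minimal sketch of what a cluster init script is for: it is a shell script run on each node before Spark starts, limited to environment setup. The package name and variable below are hypothetical, not from the original thread:

```shell
#!/bin/bash
# Cluster init scripts are shell-only and run on every node at startup.
# They can install libraries and set environment variables, but they
# cannot execute notebooks or share Python state with your sessions.
pip install --quiet some-internal-utils   # hypothetical package name
echo "export COMMON_CONFIG_PATH=/Workspace/Common" >> /etc/environment
```

Anything that needs notebook state, such as the setup.py pattern above, has to stay inside the notebook session (e.g. via %run or a proper Python package import).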

1 More Replies
nick_heybuddy
by New Contributor
  • 99 Views
  • 1 reply
  • 2 kudos

Notebooks suddenly fail to retrieve Databricks secrets

At around 5:30 am (UTC+11) this morning, a number of our scheduled serverless notebook jobs started failing when attempting to retrieve Databricks secrets. We are able to retrieve the secrets using the Databricks CLI and the jobs are run as a user tha...

Latest Reply
liu
Contributor
  • 2 kudos

Me too. But it looks like there hasn't been any official reply regarding this matter yet.

demo-user
by New Contributor
  • 90 Views
  • 3 replies
  • 0 kudos

Connecting to an S3 compatible bucket

Hi everyone, I’m trying to connect Databricks to an S3-compatible bucket using a custom endpoint URL and access keys. I’m using an Express account with Serverless SQL Warehouses, but the only external storage options I see are AWS IAM roles or Cloudfla...

Latest Reply
Raman_Unifeye
Contributor III
  • 0 kudos

Serverless compute does not support setting most Apache Spark configuration properties, irrespective of enterprise tier, as Databricks fully manages the underlying infrastructure.

2 More Replies
mai_luca
by Contributor
  • 104 Views
  • 3 replies
  • 4 kudos

Resolved! What's the difference between dbmanagedidentity and a storage credential based on managed identity?

I’m looking for guidance on the differences between: dbmanagedidentity (the workspace-managed identity), and Unity Catalog storage credentials based on Azure Managed Identity. Specifically, I’d like to understand: what are the key differences between thes...

Latest Reply
Raman_Unifeye
Contributor III
  • 4 kudos

Use dbmanagedidentity for non-storage Azure services, such as Cosmos DB, Azure SQL, Event Hubs, and Key Vault.

2 More Replies
Malthe
by Contributor III
  • 648 Views
  • 5 replies
  • 6 kudos

Self-referential foreign key constraint for streaming tables

When defining a streaming table using DLT (declarative pipelines), we can provide a schema which lets us define primary and foreign key constraints. However, references to self, i.e. the defining table, are not currently allowed (you get a "table not...

Latest Reply
Malthe
Contributor III
  • 6 kudos

Each of these workarounds gives up the optimizations that are enabled by the use of key constraints.

4 More Replies
Hari_P
by New Contributor II
  • 970 Views
  • 4 replies
  • 0 kudos

IBM DataStage to Databricks Migration

Hi All, we are currently exploring a use case involving migration from IBM DataStage to Databricks. I noticed that LakeBridge supports automated code conversion for this process. If anyone has experience using LakeBridge, could you please share any be...

Latest Reply
Kevin8
New Contributor II
  • 0 kudos

Hi @Echoes @Hari_P @SebastianRowan, you can use the Travinto Technologies tool; their conversion ratio is 95-100%.

3 More Replies
RobFer1985
by New Contributor
  • 92 Views
  • 1 reply
  • 0 kudos

Databricks pipeline fails expectation on executing Python script, throws error: Update FAILED

Hi Community, I'm new to Databricks and am trying to implement pipeline expectations. The pipelines work without errors and my job works. I've tried multiple ways to implement expectations, SQL and Python. I keep resolving the errors but end ...

Latest Reply
emma_s
Databricks Employee
  • 0 kudos

Hey, I think it may be the row_count condition causing the issue. The expectation runs on each row and checks whether the record meets the criteria, so you're effectively asking for count(*) on each record, which will always evaluate to 1 and...
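The per-row semantics described above can be sketched in plain Python. This is a stand-in for how a row-level expectation is evaluated, not DLT's actual implementation, and the function name is hypothetical:

```python
# A row-level expectation applies its predicate to one record at a time,
# so an aggregate like COUNT(*) degenerates to "this one row", i.e. 1,
# rather than the table's total row count.
def evaluate_expectation(rows, predicate):
    """Return per-row pass/fail, mirroring row-level expectation semantics."""
    return [predicate(row) for row in rows]

rows = [{"id": 1}, {"id": 2}, {"id": 3}]
# A "row_count >= 2"-style condition, seen from a single row's perspective:
results = evaluate_expectation(rows, lambda row: 1 >= 2)
print(results)  # every row fails, although the table holds 3 rows overall
```

Table-level checks like row counts are better expressed outside row-level expectations, for example as a separate validation query after the update.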

SRJDB
by New Contributor II
  • 56 Views
  • 1 reply
  • 1 kudos

Resolved! How to stop Databricks retaining widget selection between runs?

I have a Python notebook in Databricks. Within it I have a multiselect widget, which is defined like this: widget_values = spark.sql(f''' SELECT my_column FROM my_table GROUP BY my_column ORDER BY my_column ''') widget_values = widget_values.collect(...

Latest Reply
Louis_Frolio
Databricks Employee
  • 1 kudos

Hello @SRJDB , What you’re running into isn’t your Python variable misbehaving—it’s the widget hanging onto its own internal state. A Databricks widget will happily keep whatever value you gave it, per user and per notebook, until you explicitly clea...
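The "widget keeps its own state" behaviour can be illustrated with a small stand-in store. This is not the real dbutils API, just a sketch of why a re-defined widget keeps its last selection until it is removed, e.g. via dbutils.widgets.removeAll():

```python
# Stand-in for Databricks widget state, which persists per user/notebook.
class WidgetStore:
    def __init__(self):
        self._state = {}

    def multiselect(self, name, default, choices):
        # Re-defining an existing widget keeps the current selection;
        # the default applies only when the widget is first created.
        self._state.setdefault(name, default)

    def get(self, name):
        return self._state[name]

    def remove_all(self):
        # Analogue of dbutils.widgets.removeAll(): drop all retained state.
        self._state.clear()

w = WidgetStore()
w.multiselect("my_column", "a", ["a", "b"])
w._state["my_column"] = "b"                  # the user picks "b" in the UI
w.multiselect("my_column", "a", ["a", "b"])  # notebook re-run: no reset
print(w.get("my_column"))                    # still "b"
w.remove_all()
w.multiselect("my_column", "a", ["a", "b"])
print(w.get("my_column"))                    # back to the default "a"
```

So removing the widget (or all widgets) before re-creating it is the usual way to force it back to the default between runs.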

