Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

ndw
by New Contributor
  • 1 View
  • 0 replies
  • 0 kudos

Extract Snowflake data based on environment

Hi all, In the development workspace, I need to extract data from a table/view in the Snowflake development environment. An example table is VD_DWH.SALES.SALES_DETAIL. When we deploy the code into production, it needs to extract data from a table/vi...
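A common way to handle this is to resolve the Snowflake database name from an environment setting at runtime. A minimal PySpark sketch, assuming the Snowflake Spark connector; the environment variable, secret scope, and production database name are hypothetical:

```python
import os

# "DEPLOY_ENV" and the production database name are illustrative assumptions;
# the dev table from the post is VD_DWH.SALES.SALES_DETAIL.
env = os.environ.get("DEPLOY_ENV", "dev")
sf_database = "VD_DWH" if env == "dev" else "<PROD_DWH>"  # hypothetical prod name

df = (spark.read.format("snowflake")
      .option("sfUrl", "<account>.snowflakecomputing.com")        # placeholder
      .option("sfUser", dbutils.secrets.get("sf_scope", "user"))  # hypothetical scope
      .option("sfPassword", dbutils.secrets.get("sf_scope", "password"))
      .option("sfDatabase", sf_database)
      .option("sfSchema", "SALES")
      .option("dbtable", "SALES_DETAIL")
      .load())
```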

j_unspeakable
by New Contributor III
  • 2103 Views
  • 4 replies
  • 5 kudos

Resolved! Permission Denied when Creating External Tables Using Workspace Default Credential

I’m building out schemas, volumes, and external Delta tables in Unity Catalog via Terraform. The schemas and volumes are created successfully, but all external tables are failing. The error message from Terraform doesn't highlight what the issue is bu...

Latest Reply
artopihlaja
New Contributor II
  • 5 kudos

Feature or bug, I discovered the same: I couldn't create tables with the default credential. To test, I assigned the default credential and a custom credential the same access rights to the storage container that is the target of the external locatio...

3 More Replies
Galih
by New Contributor
  • 105 Views
  • 3 replies
  • 3 kudos

Spark Structured Streaming - calculate signal, help required! 🙏

Hello everyone! I’m very, very new to Spark Structured Streaming, and not a data engineer, so I would appreciate guidance on how to efficiently process streaming data and emit only changed aggregate results over multiple time windows. Input stream: Source: A...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 3 kudos

I would implement stateful streaming by using transformWithStateInPandas to keep the state and implement the logic there. I would avoid doing stream-stream JOINs.
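A minimal sketch of that suggestion, assuming a streaming DataFrame `events` with a string `key` and a numeric `value` column (both names are placeholders); it keeps a running total per key and emits a row only when the total changes:

```python
import pandas as pd
from pyspark.sql.streaming import StatefulProcessor, StatefulProcessorHandle
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

output_schema = StructType([StructField("key", StringType()),
                            StructField("total", DoubleType())])

class EmitOnChange(StatefulProcessor):
    def init(self, handle: StatefulProcessorHandle) -> None:
        # One value-state slot per key holding the last emitted total.
        self.last = handle.getValueState(
            "last", StructType([StructField("total", DoubleType())]))

    def handleInputRows(self, key, rows, timerValues):
        batch_sum = sum(float(pdf["value"].sum()) for pdf in rows)
        prev = self.last.get()[0] if self.last.exists() else None
        total = (prev or 0.0) + batch_sum
        if prev is None or total != prev:   # emit only on change
            self.last.update((total,))
            yield pd.DataFrame({"key": [key[0]], "total": [total]})

    def close(self) -> None:
        pass

changed = (events.groupBy("key")
           .transformWithStateInPandas(statefulProcessor=EmitOnChange(),
                                       outputStructType=output_schema,
                                       outputMode="Update",
                                       timeMode="None"))
```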

2 More Replies
chirag_nagar
by New Contributor
  • 2562 Views
  • 12 replies
  • 2 kudos

Seeking Guidance on Migrating Informatica PowerCenter Workflows to Databricks using Lakebridge

Hi everyone, I hope you're doing well. I'm currently exploring options to migrate a significant number of Informatica PowerCenter workflows and mappings to Databricks. During my research, I came across Lakebridge, especially its integration with BladeB...

Latest Reply
AnnaKing
New Contributor II
  • 2 kudos

Hi Chirag. At Kanerika Inc., we've built a migration accelerator that automates 80% of the Informatica to Databricks migration process, saving you significant time, effort, and resources. You can check out the demo video here - https://ww...

11 More Replies
Shimon
by New Contributor
  • 75 Views
  • 1 reply
  • 0 kudos

Jackson version conflict

Hi, I am trying to implement the Spark TableProvider API and I am experiencing a JAR conflict (I am using the 17.3 runtime). com.fasterxml.jackson.databind.JsonMappingException: Scala module 2.15.2 requires Jackson Databind version >= 2.15.0 and < 2.1...

Latest Reply
Abeshek
New Contributor
  • 0 kudos

This is a well-documented issue, and it highlights how dependency conflicts in managed runtimes can quickly become a blocker for teams extending Spark beyond standard use cases. We see similar challenges when organizations build custom providers or in...

bercaakbayir
by Visitor
  • 27 Views
  • 1 reply
  • 0 kudos

Data Ingestion - Missing Permission

Hi, I would like to use Data Ingestion through Fivetran connectors to get data from an external data source into Databricks, but I am getting a missing-permission error. I already have admin permission. I kindly ask for your help with this situation. Look...

Latest Reply
Raman_Unifeye
Contributor III
  • 0 kudos

@bercaakbayir - two areas to look at for permissions: Unity Catalog permissions and destination-level permissions. Please check:
  • UC is enabled for your workspace (Metastore Admin, not workspace Admin)
  • CREATE permissions on the target catalog - the user or SP should hav...
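For reference, the catalog-level grants described above can be issued in SQL; a sketch with placeholder catalog and principal names:

```python
# Grant a service principal the rights it needs to land tables in the target
# catalog; "my_catalog" and "sp-ingest" are placeholders.
spark.sql("GRANT USE CATALOG ON CATALOG my_catalog TO `sp-ingest`")
spark.sql("GRANT USE SCHEMA, CREATE TABLE ON CATALOG my_catalog TO `sp-ingest`")
```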

Phani1
by Databricks MVP
  • 2833 Views
  • 7 replies
  • 0 kudos

Triggering DLT Pipelines with Dynamic Parameters

Hi Team, We have a scenario where we need to pass a dynamic parameter to a Spark job that will trigger a DLT pipeline in append mode. Can you please suggest an approach for this? Regards, Phani

Latest Reply
sas30
Databricks Employee
  • 0 kudos

Found a working example:
databricks pipelines update <pipeline_id> --json @new_config.json
databricks pipelines start-update <pipeline_id>
The JSON file carries the parameters; on every run, update the parameters with a new JSON file.
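Strung together from Python, under the assumption that the dynamic values ride in the pipeline spec's configuration map (the key names below are hypothetical):

```python
import json
import subprocess

# Hypothetical parameters; DLT code can read them via spark.conf.get(...).
spec = {"configuration": {"run_date": "2025-01-01", "load_mode": "append"}}
with open("new_config.json", "w") as f:
    json.dump(spec, f)

pipeline_id = "<pipeline_id>"  # placeholder
subprocess.run(["databricks", "pipelines", "update", pipeline_id,
                "--json", "@new_config.json"], check=True)
subprocess.run(["databricks", "pipelines", "start-update", pipeline_id],
               check=True)
```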

6 More Replies
der
by Contributor III
  • 812 Views
  • 7 replies
  • 4 kudos

Resolved! EXCEL_DATA_SOURCE_NOT_ENABLED Excel data source is not enabled in this cluster

I want to read an Excel xlsx file on DBR 17.3. On the cluster, the library dev.mauch:spark-excel_2.13:4.0.0_0.31.2 is installed. The V1 implementation works fine: df = spark.read.format("dev.mauch.spark.excel").schema(schema).load(excel_file); display(df). V2...

Latest Reply
der
Contributor III
  • 4 kudos

I reached out to Databricks support and they fixed it with the December 2025 maintenance update. Now both the open-source Excel reader and the new built-in one should work. https://learn.microsoft.com/en-gb/azure/databricks/query/formats/excel
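For anyone landing here, the working V1 read from the post looks like this once cleaned up (schema and path are placeholders); per the linked docs, a built-in reader should also be available after the maintenance update:

```python
from pyspark.sql.types import StructType, StructField, StringType

schema = StructType([StructField("col_a", StringType())])  # placeholder schema
excel_file = "/Volumes/main/default/files/report.xlsx"     # placeholder path

# Open-source reader (dev.mauch spark-excel, installed as a cluster library).
df = (spark.read.format("dev.mauch.spark.excel")
      .schema(schema)
      .load(excel_file))
display(df)
```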

6 More Replies
pdiamond
by Contributor
  • 115 Views
  • 1 reply
  • 0 kudos

Lakebase error logs

Does anyone know where to see logs related to Lakebase/Postgres? I have a Tableau Prep flow that is failing, but the error is not clear and I'm trying to find out what the database is capturing.

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @pdiamond, You can try to use the Lakebase monitoring tools to capture the queries generated by Tableau Prep: Monitor | Databricks on AWS. Alternatively, it seems that you can also use external monitoring tools, so you can connect to your Lakebase instance usi...
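One concrete option: since Lakebase speaks the Postgres protocol, you can connect with any Postgres client and inspect the built-in activity view to see what Tableau Prep is actually sending. A sketch with psycopg2; all connection details are placeholders:

```python
import psycopg2

conn = psycopg2.connect(host="<instance-host>", port=5432,
                        dbname="<database>", user="<user>",
                        password="<token>", sslmode="require")
with conn.cursor() as cur:
    # pg_stat_activity lists sessions and their current/last statements.
    cur.execute("SELECT pid, state, query FROM pg_stat_activity "
                "WHERE state <> 'idle'")
    for pid, state, query in cur.fetchall():
        print(pid, state, query)
conn.close()
```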

dvd_lg_bricks
by New Contributor II
  • 408 Views
  • 10 replies
  • 3 kudos

Questions About Workers and Executors Configuration in Databricks

Hi everyone, sorry, I’m new here. I’m considering migrating to Databricks, but I need to clarify a few things first. When I define and launch an application, I see that I can specify the number of workers, and then later configure the number of execut...

Latest Reply
Abeshek
New Contributor
  • 3 kudos

Regarding your Databricks question about workers versus executors: many teams encounter the same sizing and configuration issues when evaluating a migration. At Kanerika, we help companies plan cluster architecture, optimize Spark workloads, and avoid overspen...
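For what it's worth on the underlying question: on Databricks, each worker node runs a single executor by default, so the worker count effectively is the executor count. A sketch of pinning both with the databricks-sdk; version, node type, and conf values are placeholders:

```python
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()
# Two workers => two executors by default; all values are illustrative.
cluster = w.clusters.create(
    cluster_name="sizing-demo",
    spark_version="<dbr-version>",          # e.g. a current LTS runtime
    node_type_id="<node-type>",             # cloud-specific instance type
    num_workers=2,
    spark_conf={"spark.executor.memory": "8g"},  # optional executor tuning
).result()
print(cluster.cluster_id)
```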

9 More Replies
michal1228
by New Contributor
  • 199 Views
  • 4 replies
  • 0 kudos

Import Python Modules with Git Folder Error

Dear Databricks Community, We encountered a bug in the behaviour of the import method explained in the documentation https://learn.microsoft.com/en-us/azure/databricks/files/workspace-modules#autoreload-for-python-modules. A couple of months ago we migrated our pipelin...
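For context, the documented pattern the post refers to looks roughly like this in a notebook cell (the module name is a placeholder):

```python
# Enable autoreload so edits to modules in the Git folder are picked up
# without reattaching the notebook (per the linked docs).
%load_ext autoreload
%autoreload 2

# Import a module that lives alongside the notebook in the Git folder;
# "utils" is a placeholder name.
from utils import helper_function
```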

Latest Reply
michal1228
New Contributor
  • 0 kudos

We're using DBR version 16.4

3 More Replies
Fatimah-Tariq
by New Contributor III
  • 264 Views
  • 7 replies
  • 4 kudos

Resolved! Writing to Foreign catalog

I have a running notebook job where I do some processing and write the tables to a foreign catalog. It has been running successfully for about a year. The job is scheduled and runs on a job cluster with DBR 16.2. Recently, I had to add new noteb...

Latest Reply
Fatimah-Tariq
New Contributor III
  • 4 kudos

Thank you @Louis_Frolio! Your suggestions really helped me understand the scenario.

6 More Replies
skuvisk
by New Contributor
  • 136 Views
  • 2 replies
  • 1 kudos

CLS function with lookup fails on dates

Hello, I'm conducting research on utilizing CLS in a project. We are implementing a lookup table to determine which tags a user can see. The CLS function looks like this: CREATE OR REPLACE FUNCTION {catalog}.{schema}.mask_column(value VARIANT, tag STRIN...
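As a generic illustration of the pattern (not the poster's exact function), a column mask that branches on group membership, bound to a date column; catalog, schema, table, and group names are placeholders:

```python
# Simplified column-mask UDF: members of the group see the real value,
# everyone else sees a sentinel date.
spark.sql("""
CREATE OR REPLACE FUNCTION main.sec.mask_date(value DATE)
RETURN CASE WHEN is_account_group_member('pii_readers')
            THEN value ELSE DATE'1900-01-01' END
""")
spark.sql("ALTER TABLE main.sec.orders "
          "ALTER COLUMN order_date SET MASK main.sec.mask_date")
```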

Latest Reply
skuvisk
New Contributor
  • 1 kudos

Thank you for an insightful answer @Poorva21. I conclude from your reasoning that this is the result of an optimization/engine error. It seems like I will need to resort to a workaround for the date columns then...

1 More Replies
Jarno
by New Contributor
  • 187 Views
  • 4 replies
  • 0 kudos

Dangerous implicit type conversions on 17.3 LTS.

Starting with DBR 17 running Spark 4.0, spark.sql.ansi.enabled is set to true by default. With the flag enabled, strings are implicitly converted to numbers in a very dangerous manner. Consider: SELECT 123='123'; SELECT 123='123X'; The first one is succe...
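The difference is easy to reproduce from a notebook; try_cast gives the explicit, non-throwing alternative:

```python
# Under ANSI mode (default on DBR 17 / Spark 4.0), '123' is coerced to a
# number for the comparison; '123X' fails the whole query when executed.
spark.sql("SELECT 123 = '123' AS eq").show()           # eq = true
# spark.sql("SELECT 123 = '123X'").show()              # raises CAST_INVALID_INPUT

# Explicit alternative: try_cast yields NULL instead of failing.
spark.sql("SELECT try_cast('123X' AS INT) AS v").show()  # v = NULL
```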

Latest Reply
Jarno
New Contributor
  • 0 kudos

FYI, it seems I was mistaken about the behaviour of '::' on Spark 4.0.1. It does indeed work like CAST on both DBR 17.3 and Spark 4.0.1 and raises an exception on '123X'::int. The '?::' operator seems to be a Databricks-only extension at the moment (...

3 More Replies
prashant151
by New Contributor II
  • 199 Views
  • 2 replies
  • 3 kudos

Resolved! Using Init Script to execute Python notebook at all-purpose cluster level

Hi, We have setup.py in my Databricks workspace. This script is executed in other transformation scripts using %run /Workspace/Common/setup.py, which consumes a lot of time. This setup.py internally calls other utility notebooks using %run: %run /Workspace/Co...

  • 199 Views
  • 2 replies
  • 3 kudos
Latest Reply
iyashk-DB
Databricks Employee
  • 3 kudos

You can’t “%run a notebook” from a cluster init script; init scripts are shell-only and meant for environment setup (installing libs, setting env vars), not for executing notebooks or sharing Python state across sessions. +1 to what @Raman_Unifeye said. ...
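One way to cut the %run overhead in that spirit: move the shared setup into a plain .py workspace module and import it, so it is evaluated once per session and cached like any Python module. A sketch with placeholder paths and names:

```python
# In the consuming notebook. Workspace files are importable on recent DBRs;
# /Workspace/Common/setup_lib.py is a hypothetical module holding the shared code.
import sys

if "/Workspace/Common" not in sys.path:
    sys.path.append("/Workspace/Common")

import setup_lib                      # runs once, then cached
confs = setup_lib.get_spark_confs()   # hypothetical helper function
```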

1 More Replies
