Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Forum Posts

fix_databricks
by New Contributor II
  • 4360 Views
  • 3 replies
  • 0 kudos

Cannot run another notebook from same directory

Hello, I am having a similar problem from this thread which was never resolved: https://community.databricks.com/t5/data-engineering/unexpected-error-while-calling-notebook-string-matching-regex-w/td-p/18691 I renamed a notebook (utility_data_wrangli...

Latest Reply
ddundovic
New Contributor III
  • 0 kudos

I am running into the same issue. It seems like the `%run` magic command is trying to parse the entire cell content as its arguments. So if you have `%run "my_notebook"` and `print("hello")` in the same cell, you will get the following error: `Failed to parse...
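A minimal sketch of the workaround (the notebook name is illustrative, borrowed from the reply above): keep `%run` alone in its own cell, since everything else in that cell gets parsed as its arguments.

```
# Cell 1: contains only the magic command, nothing else
%run "./my_notebook"

# Cell 2: any other code goes in a separate cell
print("hello")
```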

2 More Replies
Raj_DB
by Contributor
  • 2677 Views
  • 9 replies
  • 12 kudos

Resolved! Pass Notebook parameters dynamically in Job task.

Hi Everyone, I'm working on scheduling a job and would like to pass parameters that I've defined in my notebook. Ideally, I'd like these parameters to be dynamic, meaning that if I update their values in the notebook, the scheduled job should automati...

Latest Reply
ck7007
Contributor II
  • 12 kudos

I see you're using `dbutils.widgets.text` and dropdown, perfect! You're already on the right track. Quick solution: your widgets are already dynamic! Just pass parameters in your job configuration. In your notebook (slight refactor of your code): # Define w...
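As a hedged sketch of what that job configuration can look like (the path and parameter names are illustrative, not from the thread): in the job's task settings, values in `base_parameters` override the defaults defined by the notebook's widgets.

```json
{
  "task_key": "run_notebook",
  "notebook_task": {
    "notebook_path": "/Workspace/Users/someone@example.com/my_notebook",
    "base_parameters": {
      "env": "prod",
      "run_date": "{{job.start_time.iso_date}}"
    }
  }
}
```

Inside the notebook, `dbutils.widgets.get("env")` then picks up whatever the job passed in, falling back to the widget default for interactive runs.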

8 More Replies
Erik
by Valued Contributor III
  • 19774 Views
  • 13 replies
  • 8 kudos

Grafana + databricks = True?

We have some timeseries in databricks, and we are reading them into powerbi through sql compute endpoints. For timeseries powerbi is ... not optimal. Earlier I have used grafana with various backends, and quite like it, but I can't find any way to con...

Latest Reply
frugson
New Contributor II
  • 8 kudos

@Erik wrote: We have some timeseries in databricks, and we are reading them into powerbi through sql compute endpoints. For timeseries powerbi is ... not optimal. Earlier I have used grafana with various backends, and quite like it, but I cant find an...

12 More Replies
ChristianRRL
by Honored Contributor
  • 2252 Views
  • 6 replies
  • 5 kudos

Resolved! Can schemaHints dynamically handle nested json structures?

Hi there, as I'm learning more about schemaHints, it seems like an incredibly useful way to unpack some of my json data. However, I've hit what is either a limitation of schemaHints or of my understanding of how to use it properly. Below I have an exa...

Latest Reply
boitumelodikoko
Valued Contributor
  • 5 kudos

Hi @ChristianRRL, would you be able to share a small sample of your JSON file (with any sensitive data removed)? That way, I can try to replicate your use case and see if we can get schemaHints working across multiple nested fields without losing data...

5 More Replies
ebyhr
by New Contributor II
  • 12474 Views
  • 8 replies
  • 3 kudos

How to fix intermittent 503 errors in 10.4 LTS

I sometimes get the below error in version 10.4 LTS. Is there any solution to fix the intermittent failure? I added retry logic in our code, but the Databricks query succeeded (even though it threw an exception), which leads to an unexpected table statu...

Latest Reply
niteesh
New Contributor II
  • 3 kudos

Facing the same issue now. Were you able to find a fix?

7 More Replies
prakashhinduja2
by New Contributor
  • 722 Views
  • 2 replies
  • 1 kudos

Prakash Hinduja ~ How do I create an empty DataFrame in Databricks—are there multiple ways?

Hello, I'm Prakash Hinduja, an Indian-born financial advisor and consultant based in Geneva, Switzerland (Swiss). My career is focused on guiding high-net-worth individuals and business leaders through the intricate world of global investment and wea...

Latest Reply
ManojkMohan
Honored Contributor II
  • 1 kudos

Best practices from experience: use a predefined schema if you know your column types upfront, as it prevents errors when appending new data. For ad-hoc exploration, `toDF` or `createDataFrame([], None)` works fine. Always check `printSchema()`; it helps avoid silent t...

1 More Replies
tonylax6
by New Contributor
  • 5174 Views
  • 1 reply
  • 0 kudos

Azure Databricks to Adobe Experience Platform

I'm using Azure databricks and am attempting to stream near real-time data from databricks into the Adobe Experience Platform to ingest into the AEP schema for profile enrichment. We are running into an issue with the API and streaming, so we are curr...

Data Engineering
Adobe
Adobe Experience Platform
CDP integration
Latest Reply
tsilverstrim
New Contributor II
  • 0 kudos

Hi Tony... there are several ways to accomplish this based on the non-functional requirements of your use case. What does near real-time mean from a standpoint of signal-to-activation time, from when the data is present in Databricks to when the acti...

slimbnsalah
by New Contributor II
  • 2122 Views
  • 2 replies
  • 0 kudos

Use Salesforce Lakeflow Connector with a Salesforce Connected App

Hello, I'm trying to use the new Salesforce Lakeflow connector to ingest data into my Databricks account. However, I see only the option to connect using a normal user, whereas I want to use a Salesforce App, just like how it is described here: Run fede...

Latest Reply
Ajay-Pandey
Databricks MVP
  • 0 kudos

@slimbnsalah Please select the connection type as Salesforce Data Cloud; you will then be asked for the details.

1 More Replies
ManojkMohan
by Honored Contributor II
  • 879 Views
  • 4 replies
  • 2 kudos

Resolved! Silver to Gold Layer | Running ML - Debug Help Needed

Problem I am solving: read the raw sports data (IPL CSV) → bronze layer; clean and aggregate → silver layer; summarize team stats → gold layer; prepare ML-ready features and train a Random Forest classifier to predict match winners. Getting error: [PARS...

Latest Reply
BS_THE_ANALYST
Esteemed Contributor III
  • 2 kudos

@ManojkMohan thanks for sharing this, I'm looking at starting an ML project in the coming weeks, I might have to bring this forward. Feeling motivated with that confusion matrix in your output. Congrats on getting it working! All the best, BS

3 More Replies
Srinivas5
by New Contributor II
  • 836 Views
  • 6 replies
  • 3 kudos

Jar File Upload To Workspace

#dbfs: I am unable to upload a jar file to DBFS for a job cluster, as it's deprecated now. I need to upload it to the workspace and install it on the cluster; however, my jar size is 70 MB and I can't upload it through the API or CLI, as the max size is 50 MB. Is there an alternati...
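One workaround worth trying (a sketch, assuming Unity Catalog is enabled and the new Databricks CLI is configured; the catalog/schema/volume names are hypothetical): copy the jar into a Unity Catalog volume, which is not subject to the 50 MB workspace-import limit, then install it on the cluster as a volume-based library.

```
# Hypothetical volume path; create the volume first in Catalog Explorer
databricks fs cp ./my-lib-1.0.jar dbfs:/Volumes/main/default/libs/my-lib-1.0.jar
```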

Latest Reply
Advika
Community Manager
  • 3 kudos

Hi @Srinivas5! Were you able to find a solution or approach that worked? If so, please mark the helpful reply as the Accepted Solution, or share your approach so others can benefit as well.

5 More Replies
ShankarM
by Contributor
  • 430 Views
  • 2 replies
  • 0 kudos

Notebook exposure

I have created a notebook as per the client's requirement. I have to migrate the notebook to the client env for testing with live data, but I do not want to expose the Databricks notebook code to the testers in the client env. Is there a way to package the not...

Latest Reply
WiliamRosa
Contributor III
  • 0 kudos

Hi @ShankarM, I've had to do something similar: packaging a Python class as a wheel. This documentation might help: https://docs.databricks.com/aws/en/dev-tools/bundles/python-wheel
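For reference, a hedged sketch of the bundle side of that (the names and paths are illustrative): declaring the wheel as an artifact in `databricks.yml` makes the bundle build and upload it on deploy, so only the compiled wheel lands in the client environment.

```yaml
# databricks.yml (fragment): build a wheel from ./my_package on deploy
artifacts:
  my_package:
    type: whl
    path: ./my_package   # folder containing setup.py or pyproject.toml
```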

1 More Replies
DatabricksEngi1
by Contributor
  • 1155 Views
  • 2 replies
  • 1 kudos

Resolved! databricks assets bundles issue

Hi all, I'm working with Databricks Asset Bundles (DAB) and trying to move from a single repository-level bundle to a structure where each workflow (folder under resources/jobs) has its own bundle. My repository contains: shared src/variables.yml a...

Latest Reply
DatabricksEngi1
Contributor
  • 1 kudos

I solved it. For some reason, the Terraform folder created under the bundles wasn't set up correctly. I copied it from a working bundle, and everything completed successfully.

1 More Replies
JPNP
by New Contributor
  • 936 Views
  • 3 replies
  • 1 kudos

Not able to create Secret scope in Azure databricks

Hello, I am trying to create an Azure Key Vault-backed secret scope, but it is failing with the below error. I have tried clearing the cache and logging out, and used an incognito browser as well, but I am not able to create a scope. Can you please help here?

Latest Reply
Yogesh_Verma_
Contributor II
  • 1 kudos

If the UI keeps failing with that vague error, the CLI approach suggested above is the best next step, since it usually gives a clearer error message. Also make sure that the service principal you're using to create the scope has Key Vault Administra...
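For anyone landing here, the CLI route referred to above looks roughly like this with the legacy Databricks CLI (the resource ID and DNS name are placeholders for your own Key Vault's values):

```
databricks secrets create-scope --scope my-kv-scope \
  --scope-backend-type AZURE_KEYVAULT \
  --resource-id "/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.KeyVault/vaults/<vault-name>" \
  --dns-name "https://<vault-name>.vault.azure.net/"
```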

2 More Replies
jar
by Contributor
  • 381 Views
  • 1 reply
  • 0 kudos

Excluding job update from DAB .yml deployment

Hi. We have a range of scheduled jobs and _one_ continuous job, all defined in .yml and deployed with DAB. The continuous job is paused per default, and we use a scheduled job of a notebook to pause and unpause it so that it only runs during business ho...

Latest Reply
Yogesh_Verma_
Contributor II
  • 0 kudos

You’re running into this because DAB treats the YAML definition as the source of truth — so every time you redeploy, it will reset the job state (including the paused/running status) back to what’s defined in the file. Unfortunately, there isn’t curr...

karthik_p
by Esteemed Contributor
  • 15953 Views
  • 5 replies
  • 1 kudos

Does Delta Live Tables support identity columns?

We are able to test identity columns using SQL/Python, but when we try the same using DLT, we are not seeing values under the identity column. It is always empty for the column we created: "id BIGINT GENERATED ALWAYS AS IDENTITY".

Latest Reply
Gowrish
New Contributor II
  • 1 kudos

Hi, I see from the following Databricks documentation - https://docs.databricks.com/aws/en/dlt/limitations - that it states the following, which kind of gives the impression that you can define an identity column on a streaming table: Identity columns might be recom...

4 More Replies