Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Ipshi
by New Contributor
  • 701 Views
  • 1 reply
  • 0 kudos

Databricks Data Engineer Associate

Hi everyone, can anyone point me to practice papers or other study materials for the Databricks Data Engineer Associate exam?

Latest Reply
Advika
Databricks Employee
  • 0 kudos

Hello @Ipshi! You can find resources for the Databricks Certified Data Engineer Associate exam in the Getting Ready for the Exam section of the exam-specific webpage on the website. This section includes a detailed list of topics covered and sample q...

lawrence009
by Contributor
  • 2861 Views
  • 4 replies
  • 0 kudos

Blank Page after Logging In

On Feb 8 Singapore time, our Singapore workspace displayed a blank page (no interface or content) after login, while our workspace in Tokyo worked normally. This lasted the whole day and none of our troubleshooting yielded any clues. Then ever...

Latest Reply
ciro
New Contributor II
  • 0 kudos

After logging in, I’m getting a white screen, and it won’t load. I’ve tried clearing my cache and switching browsers, but nothing seems to work. This feels like something that really needs to be looked into. Has anyone figured out a way to fix it?

3 More Replies
pargit2
by New Contributor II
  • 795 Views
  • 1 reply
  • 0 kudos

feature store delta sharing

Hi, I have two workspaces, one for the data engineering team and one for the data science team, and I need to create the bronze and silver layers in the data engineering workspace. I want to build them a feature store; should I do it from the data science workspace or the data engineering ...

Latest Reply
ciro
New Contributor II
  • 0 kudos

I like the idea of using Feature Store with Delta Sharing, but I’m a bit worried about its limits, such as no partition filtering and no streaming support. These could cause performance and scaling problems in real situations.

thisisadarshsin
by New Contributor II
  • 6320 Views
  • 12 replies
  • 0 kudos

Permission issue in Fundamentals of the Databricks Lakehouse Platform Quiz

Hi, I am getting this error when I try to take the Fundamentals of the Databricks Lakehouse Platform exam: 403 FORBIDDEN - You don't have permission to access this page. 2023-05-20 12:37:41 | Error 403 | https://customer-academy.databricks.com/ I al...

Latest Reply
Advika_
Databricks Employee
  • 0 kudos

Hello, everyone! We are sorry to hear you're having trouble accessing the quiz. Please note that the Lakehouse Fundamentals course has been replaced by Databricks Fundamentals, with updated content. Try logging into your account directly by u...

11 More Replies
NathanC0926
by New Contributor
  • 1121 Views
  • 1 reply
  • 0 kudos

Delta Live Tables (Streaming Tables) for Excel (.xlsx, .xls)

What's the native way to ingest Excel files using a streaming table? I'd like it so that when the Excel files land in Unity Catalog, they get picked up and loaded into the streaming table. The data is small, so we can afford some kind of UDF, but we really n...

Latest Reply
lingareddy_Alva
Honored Contributor III
  • 0 kudos

Hi @NathanC0926 Ingesting Excel files with streaming tables requires a combination of Databricks Auto Loader (for file discovery and exactly-once processing) and a custom UDF for Excel parsing. Here's the native approach. Key features of this solution: 1. E...
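A minimal sketch of that Auto Loader + parsing-function pattern, assuming a hypothetical volume path and target table and that pandas/openpyxl are installed on the cluster (an illustration of the approach, not the exact code from the reply):

```python
import io
import pandas as pd
from pyspark.sql import functions as F

# Hypothetical locations -- adjust to your catalog/schema/volume.
SOURCE_PATH = "/Volumes/main/raw/excel_drop"            # where .xlsx files land
TARGET_TABLE = "main.bronze.excel_ingest"               # Delta target table
CHECKPOINT = "/Volumes/main/raw/_checkpoints/excel_ingest"

def parse_excel_batch(batch_df, batch_id):
    """Parse each newly discovered Excel file with pandas and append its rows."""
    rows = batch_df.select("path", "content").collect()  # data is small, so collect is fine
    for row in rows:
        pdf = pd.read_excel(io.BytesIO(bytes(row["content"])))  # needs openpyxl on the cluster
        sdf = spark.createDataFrame(pdf).withColumn("_source_file", F.lit(row["path"]))
        sdf.write.mode("append").saveAsTable(TARGET_TABLE)

(spark.readStream
      .format("cloudFiles")                        # Auto Loader: file discovery + exactly-once
      .option("cloudFiles.format", "binaryFile")   # stream the raw bytes of each workbook
      .option("cloudFiles.schemaLocation", CHECKPOINT)
      .option("pathGlobFilter", "*.xlsx")
      .load(SOURCE_PATH)
      .writeStream
      .option("checkpointLocation", CHECKPOINT)
      .foreachBatch(parse_excel_batch)             # custom Excel parsing per micro-batch
      .trigger(availableNow=True)                  # run as an incremental batch job
      .start())
```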

FranPérez
by New Contributor III
  • 15564 Views
  • 8 replies
  • 4 kudos

set PYTHONPATH when executing workflows

I set up a workflow using 2 tasks. Just for demo purposes, I'm using an interactive cluster for running the workflow. { "task_key": "prepare", "spark_python_task": { "python_file": "file...
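One commonly used workaround, shown only as a hedged sketch: have the script referenced by spark_python_task extend sys.path itself, so local modules resolve without relying on the cluster's PYTHONPATH (the module name below is hypothetical):

```python
# Top of the file referenced by spark_python_task (hypothetical project layout).
import os
import sys

# Make modules that live next to this script importable when the job runs,
# regardless of what PYTHONPATH the cluster was started with.
PROJECT_ROOT = os.path.dirname(os.path.abspath(__file__))
if PROJECT_ROOT not in sys.path:
    sys.path.insert(0, PROJECT_ROOT)

from mylib import prepare  # hypothetical local module next to this script

if __name__ == "__main__":
    prepare.run()
```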

Latest Reply
jose_gonzalez
Databricks Employee
  • 4 kudos

Hi @Fran Pérez, just a friendly follow-up. Did any of the responses help you to resolve your question? If so, please mark it as best. Otherwise, please let us know if you still need help.

7 More Replies
lumen
by New Contributor II
  • 1219 Views
  • 3 replies
  • 1 kudos

Notebook ID in DESCRIBE HISTORY not showing

We've recently installed Databricks Runtime 14.3 LTS with Unity Catalog and, for some reason that is escaping me, the notebook ID is not showing up when I execute the DESCRIBE HISTORY SQL command. Example below for table test_catalog.lineagedemo.lm_lineage_tes...

Latest Reply
lumen
New Contributor II
  • 1 kudos

Hi @RameshRetnasamy, first off, thank you so much for taking the time to reply to my question. In my case they were indeed created via notebooks, but I'll re-evaluate on my end as I might've missed something. If the issue persists, I'll re-assert the q...

2 More Replies
Vinoth_nirmal
by New Contributor II
  • 1126 Views
  • 4 replies
  • 0 kudos

Not able to create and start a cluster

Hi team, I am trying to use Community Edition for learning; below are my URL details: https://community.cloud.databricks.com/compute/interactive?o=2059657917292434. For some reason my clusters are taking nearly 45 to 60 minutes for creation, and after if ...

Latest Reply
Vinoth_nirmal
New Contributor II
  • 0 kudos

Hi @nikhilj0421, still the same issue. I can see it works up to Spark 15.4 LTS; above 15.4 LTS, whatever version I use, it's not working and I am not able to create any cluster.

3 More Replies
hao-uit
by New Contributor
  • 2841 Views
  • 1 reply
  • 0 kudos

Spark Streaming job gets stuck in the "Stream Initializing" stage

Hello all, I am having an issue with my Spark Streaming job: it is stuck at the "Stream Initializing" stage. I need your help to understand what is happening inside the "Stream Initializing" stage of a Spark Streaming job that is taking so long. Here are...

Latest Reply
nikhilj0421
Databricks Employee
  • 0 kudos

Hi @hao-uit, do you see any kind of load on the driver or in the event logs? Also, which libraries have you installed on your cluster?

dev_puli
by New Contributor III
  • 52612 Views
  • 7 replies
  • 8 kudos

How to read a CSV file from a user's workspace

Hi! I have been carrying out a POC, so I created the CSV file in my workspace and tried to read its content using the techniques below in a Python notebook, but it did not work. Option 1: repo_file = "/Workspace/Users/u1@org.com/csv files/f1.csv" tmp_file_na...

Latest Reply
MujtabaNoori
New Contributor III
  • 8 kudos

Hi @Dev, generally what happens is that the Spark reader APIs point to DBFS by default. To read a file from the user workspace, we need to prepend 'file:/' to the path. Thanks
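A quick illustration of that suggestion, reusing the path from the question (whether this works can depend on the cluster's access mode and runtime version):

```python
# Workspace files live on the driver's local filesystem, so the path needs the
# file:/ scheme to stop Spark from resolving it against DBFS.
workspace_path = "file:/Workspace/Users/u1@org.com/csv files/f1.csv"

df = (spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv(workspace_path))

df.show(5)
```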

6 More Replies
YoyoHappy
by New Contributor
  • 444 Views
  • 0 replies
  • 0 kudos

An Equivalent Implementation of XIRR in Excel within PySpark

Introduction: IRR is a common method used by financial personnel to evaluate the financial benefits of a project investment, and its formula is as follows. In Excel, you can use the XIRR function to calculate the IRR of non-periodic cash flows easily. But ...
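For context, a minimal plain-Python sketch of what an XIRR-style calculation does (Newton's method on the non-periodic NPV); the post's actual PySpark implementation may look different:

```python
from datetime import date

def xirr(cashflows, guess=0.1, tol=1e-7, max_iter=100):
    """Approximate Excel's XIRR: find the rate r at which the NPV of the
    non-periodic cashflows [(date, amount), ...] is zero, via Newton's method."""
    t0 = cashflows[0][0]

    def npv(rate):
        return sum(cf / (1.0 + rate) ** ((d - t0).days / 365.0) for d, cf in cashflows)

    def d_npv(rate):
        # Derivative of the NPV with respect to the rate.
        return sum(-((d - t0).days / 365.0) * cf / (1.0 + rate) ** ((d - t0).days / 365.0 + 1)
                   for d, cf in cashflows)

    rate = guess
    for _ in range(max_iter):
        step = npv(rate) / d_npv(rate)
        rate -= step
        if abs(step) < tol:
            return rate
    raise ValueError("XIRR did not converge")

# Example: one outflow followed by two inflows on irregular dates.
flows = [(date(2024, 1, 1), -1000.0), (date(2024, 7, 1), 300.0), (date(2025, 1, 1), 800.0)]
print(xirr(flows))
```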

JameDavi_51481
by Contributor
  • 2154 Views
  • 7 replies
  • 1 kudos

parameterized ALTER TABLE SET TAGS

I would like to use parameterized SQL queries to run SET TAGS commands on tables, but cannot figure out how to parameterize the query to prevent SQL injection. Both the `?` and `:key` parameter syntaxes throw a syntax error. Basically, I'd like to do th...
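Since parameter markers are rejected here (per the question), one hedged fallback is to validate the identifier and escape the tag strings yourself before building the statement; a sketch, not a substitute for proper parameter support:

```python
import re

def set_table_tag(spark, table_name, tag_key, tag_value):
    # Allow only catalog.schema.table-style identifiers for the table name.
    if not re.fullmatch(r"[A-Za-z0-9_]+(\.[A-Za-z0-9_]+){0,2}", table_name):
        raise ValueError(f"Suspicious table name: {table_name!r}")
    # Double single quotes so the key/value stay inside their string literals.
    key = tag_key.replace("'", "''")
    value = tag_value.replace("'", "''")
    spark.sql(f"ALTER TABLE {table_name} SET TAGS ('{key}' = '{value}')")

set_table_tag(spark, "main.sales.orders", "owner", "data-eng")  # hypothetical table/tag
```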

Latest Reply
JameDavi_51481
Contributor
  • 1 kudos

I already said I was running 16.1 for my tests.

6 More Replies
Harshul
by New Contributor
  • 797 Views
  • 1 replies
  • 0 kudos

Merge Performance Issues

The issue we are currently facing in our project is slow MERGE performance. The production database has 50 billion records (historical data); the data is partitioned by date but not indexed. The incremental data has close to 250 to 500 million records. Incremental ...

Labels: Data Engineering, Insert Overwrite, MERGE, performance
Latest Reply
Renu_
Valued Contributor II
  • 0 kudos

Hi @Harshul,
1. Yes, if configured properly, INSERT OVERWRITE helps maintain data consistency. You can deduplicate incremental data using ROW_NUMBER() and use INSERT OVERWRITE with replaceWhere for efficient daily bulk updates.
2. Usually lower, since...
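A rough sketch of the dedupe-then-overwrite pattern described above, with hypothetical table and column names; replaceWhere restricts the overwrite to the date partitions present in the incremental batch:

```python
from pyspark.sql import functions as F
from pyspark.sql.window import Window

# Hypothetical names: adjust to your tables, key column, and partition column.
incremental = spark.table("main.staging.orders_incremental")

# Keep only the latest version of each business key within the batch.
w = Window.partitionBy("order_id").orderBy(F.col("updated_at").desc())
deduped = (incremental
           .withColumn("rn", F.row_number().over(w))
           .filter("rn = 1")
           .drop("rn"))

# Overwrite only the date partitions touched by this batch.
dates = [r["event_date"] for r in deduped.select("event_date").distinct().collect()]
date_list = ", ".join(f"'{d}'" for d in dates)

(deduped.write
        .format("delta")
        .mode("overwrite")
        .option("replaceWhere", f"event_date IN ({date_list})")
        .saveAsTable("main.prod.orders"))
```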

DanielW
by New Contributor III
  • 2269 Views
  • 12 replies
  • 3 kudos

Resolved! Databricks REST API swagger definition not handling bigint or integer

I want to test creating a custom connector in a Power App that connects to a table in Databricks. The issue is with columns typed int or bigint: no matter what I define in the response in my swagger definition (see below), it is not the correct type...

Latest Reply
DanielW
New Contributor III
  • 3 kudos

Hi @lingareddy_Alva This might warrant another post to keep the conversation focused, but I found a couple of things with the custom connector that make it a bit cumbersome to use. 1) I don't seem to be able to have two POST operations under /statmen...

11 More Replies
chexa_Wee
by New Contributor III
  • 2543 Views
  • 2 replies
  • 2 kudos

How to Implement Incremental Loading in Azure Databricks for ETL

Hi everyone, I'm currently working on an ETL process using Azure Databricks (Standard Tier) where I load data from Azure SQL Database into Databricks. I run a notebook daily to extract, transform, and load the data for Power BI reports. Right now, the ...

Latest Reply
-werners-
Esteemed Contributor III
  • 2 kudos

In case you do not want to use DLT (and there are reasons not to), you can also check the docs for Auto Loader and MERGE notebooks. These two do basically the same as DLT but without the extra cost and with more control. You have to write more code though. For...
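To make that concrete, a minimal sketch of the Auto Loader + MERGE pattern the reply points to; paths, table names, and the join key are placeholders, and the target Delta table is assumed to already exist:

```python
from delta.tables import DeltaTable

SOURCE_PATH = "abfss://raw@mystorage.dfs.core.windows.net/sqlexports/orders/"   # placeholder
CHECKPOINT = "abfss://raw@mystorage.dfs.core.windows.net/_checkpoints/orders/"  # placeholder
TARGET = "main.silver.orders"                                                   # placeholder

def upsert_batch(batch_df, batch_id):
    """MERGE each micro-batch into the target so re-runs stay idempotent."""
    (DeltaTable.forName(spark, TARGET).alias("t")
        .merge(batch_df.alias("s"), "t.order_id = s.order_id")
        .whenMatchedUpdateAll()
        .whenNotMatchedInsertAll()
        .execute())

(spark.readStream
      .format("cloudFiles")                       # Auto Loader picks up only new files
      .option("cloudFiles.format", "parquet")
      .option("cloudFiles.schemaLocation", CHECKPOINT)
      .load(SOURCE_PATH)
      .writeStream
      .option("checkpointLocation", CHECKPOINT)
      .foreachBatch(upsert_batch)                 # incremental MERGE instead of full reload
      .trigger(availableNow=True)                 # behaves like a daily incremental batch
      .start())
```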

1 More Reply
