🚀 Monthly Databricks Get Started Days – Accelerate Your Learning Journey! 🚀

We’re excited to invite you to our MONTHLY Databricks Get Started Days, a half-day virtual learning experience designed to jumpstart your journey with Databricks. This global event is tailored to equip you with essential data engineering and analysi...

  • 102 Views
  • 0 replies
  • 0 kudos
10 hours ago
Virtual Learning Festival: 9 April - 30 April

Join us for the return of the Virtual Learning Festival! Mark your calendars from 9 April - 30 April 2025! Upskill across data engineering, data analysis, machine learning, and generative AI. Join the thousands who have elevated their career with...

  • 17305 Views
  • 50 replies
  • 16 kudos
03-03-2025
Intelligent Data Warehousing: AI/BI for Self-service Analytics

 Date & Time: 8 April (Tuesday), 11 AM – 3 PM SGT. Registration: https://events.databricks.com/FY260408-SD-AIBIforSelf-ServiceAnalytics. What You'll Learn: In this training session, you will learn how to self-serve business insights ...

  • 2098 Views
  • 1 reply
  • 2 kudos
2 weeks ago
Get Started With Lakehouse Architecture | Pass a quiz to earn your certificate of completion.

Learn the foundational principles of the lakehouse architecture with three short self-paced videos. The architecture of your data platform directly affects the reliability, performance, and scalability of your data and AI initiatives. The lakehouse arc...

  • 4909 Views
  • 0 replies
  • 2 kudos
a month ago
Data + AI Summit 2025 — registration now open!

Be part of a global movement! Connect with 22,000 data enthusiasts at 700+ sessions, keynotes, and hands-on training at this year’s Data + AI Summit. Whether you’re into data intelligence, governance, AI, or data warehousing, this is your chance to l...

  • 5816 Views
  • 1 reply
  • 3 kudos
02-19-2025
Databricks Community Champion - March 2025 - Takuya Omi

Meet Takuya Omi (尾美拓哉), a passionate Data Engineer at NS Solutions and a valuable member of the Databricks Community. Takuya's career journey began in the marketing division of a financial institution, where his early work with SQL and machine learni...

  • 2707 Views
  • 4 replies
  • 6 kudos
Monday

Community Activity

eimis_pacheco
by Contributor
  • 323 Views
  • 2 replies
  • 1 kudos

Databricks AI + Data Summit discount coupon

Hi Community, I hope you're doing well. I wanted to ask you the following: I want to go to the Databricks AI + Data Summit this year, but it's super expensive for me. And hotels in San Francisco, as you know, are super expensive. So, I wanted to know how I ...

Latest Reply
eimis_pacheco
Contributor
  • 1 kudos

Thank you for your answer.

1 More Replies
Katalin555
by New Contributor II
  • 133 Views
  • 1 reply
  • 0 kudos

Found a potential bug in Job Details/Schedule and Trigger section

One of our jobs is scheduled to run at 4:30 AM in the GMT+1 timezone, which is visible if we click on Edit trigger (Picture 1), but under job details it is shown as if it were scheduled to run at 4:30 AM UTC (Picture 2). Based on previous runs ...

[Screenshots attached: Picture 1, Picture 2]
Latest Reply
Isi
Contributor
  • 0 kudos

Hey @Katalin555 Even though in the “Edit Trigger” panel (Picture 2) the time is shown in local timezone (e.g. GMT+1), once the schedule is saved and viewed under job details (Picture 1), Databricks always displays it as UTC — without making it visual...

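A quick way to convince yourself the two panels describe the same instant (a plain-Python sketch, not a Databricks API call):

```python
from datetime import datetime, timezone, timedelta

# Sanity check: a 4:30 AM trigger defined in GMT+1 is the same instant
# as 3:30 AM UTC, which is what the job-details panel displays.
gmt_plus_1 = timezone(timedelta(hours=1), name="GMT+1")
local_trigger = datetime(2025, 4, 7, 4, 30, tzinfo=gmt_plus_1)

utc_trigger = local_trigger.astimezone(timezone.utc)
print(utc_trigger.strftime("%H:%M %Z"))  # 03:30 UTC
```

If the job-details panel shows 4:30 AM UTC rather than 3:30 AM UTC for a GMT+1 schedule, the display would indeed be misleading.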
amitkamthane
by New Contributor
  • 164 Views
  • 2 replies
  • 0 kudos

Delete files from databricks Volumes based on trigger

Hi, I noticed there's a file arrival trigger option in the workflow, but I can't see a delete trigger option. However, let's say I want to delete files from the Databricks volume based on this trigger, and also remove the corresponding records from the bron...

Latest Reply
BigRoux
Databricks Employee
  • 0 kudos

Currently, Databricks doesn’t offer a built-in file deletion trigger mechanism similar to the file arrival trigger. The file arrival trigger only monitors for new files being added to a location, not for files being deleted.

1 More Replies
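Since there is no built-in delete trigger, one workaround is a scheduled job that diffs directory listings between runs. A minimal sketch, with hypothetical paths and a stand-in for whatever listing call you use (e.g. dbutils.fs.ls on a Volume path):

```python
def diff_snapshots(previous: set, current: set):
    """Return (added, deleted) file paths between two directory listings."""
    return current - previous, previous - current

# Hypothetical snapshots taken on two consecutive scheduled runs
previous = {"/Volumes/main/raw/files/a.csv", "/Volumes/main/raw/files/b.csv"}
current = {"/Volumes/main/raw/files/a.csv", "/Volumes/main/raw/files/c.csv"}

added, deleted = diff_snapshots(previous, current)
print(deleted)  # {'/Volumes/main/raw/files/b.csv'}

# `deleted` can then drive cleanup of the matching bronze records,
# e.g. a DELETE ... WHERE source_file IN (...) statement.
```

The previous snapshot would need to be persisted somewhere (a Delta table or a small state file) between runs.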
zmsoft
by Contributor
  • 141 Views
  • 1 reply
  • 0 kudos

How to set DLT pipeline warning alert?

Hi there, the example description of custom event hooks in the documentation (event-hooks) is not clear enough, and I do not know how to implement it inside Python functions. My code: %python # Read the insertion of data raw_user_delta_streaming = spark.rea...

Latest Reply
Priyanka_Biswas
Databricks Employee
  • 0 kudos

Hi @zmsoft  The event hook provided, user_event_hook, must be a Python callable that accepts exactly one parameter - a dictionary representation of the event that triggered the execution of this event hook. The return value of the event hook has no s...

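Per the reply, the hook itself is just a callable taking one dict. A minimal sketch (the event fields shown are assumptions; wiring the callable into the pipeline follows the event-hooks docs):

```python
warnings = []

def user_event_hook(event: dict) -> None:
    # Called once per pipeline event; `event` is a dict representation
    # of the event. The return value is ignored. Here we collect
    # WARN-level events so they can be forwarded to an alerting channel.
    if event.get("level") == "WARN":
        warnings.append(event.get("message"))

user_event_hook({"level": "WARN", "message": "late data detected"})
user_event_hook({"level": "INFO", "message": "flow progress"})
print(warnings)  # ['late data detected']
```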
Guigui
by New Contributor II
  • 60 Views
  • 2 replies
  • 0 kudos

Package installation for multi-tasks job

I have a job with the same task to be executed twice with two sets of parameters. Each task runs after cloning a git repo, then installing it locally and running a notebook from this repo. However, as each task clones the same repo, I was wonderi...

Latest Reply
Guigui
New Contributor II
  • 0 kudos

That's what I've done, but I find it less elegant than setting up an environment and sharing it across multiple tasks. It seems to be impossible (unless I build a wheel file, which I don't want to) as tasks do not share environments, but anyway, as they run in p...

1 More Replies
Eric_Kieft
by New Contributor III
  • 281 Views
  • 5 replies
  • 4 kudos

Centralized Location of Table History/Timestamps in Unity Catalog

Is there a centralized location in Unity Catalog that retains the table history, specifically the last timestamp, for managed Delta tables? DESCRIBE HISTORY will provide it for a specific table, but I would like to get it for a number of tables. inform...

Latest Reply
Priyanka_Biswas
Databricks Employee
  • 4 kudos

Hi @Eric_Kieft @noorbasha534  system.access.table_lineage includes a record for each read or write event on a Unity Catalog table or path. This includes but is not limited to job runs, notebook runs, and dashboards updated with the read or write even...

4 More Replies
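Building on that reply, a query sketch for the latest write-side event per table; the column names are assumptions based on the system-tables documentation, so adjust them to what your workspace actually exposes:

```python
# Sketch: most recent event per target table in the lineage system table.
query = """
SELECT target_table_full_name,
       MAX(event_time) AS last_event_time
FROM system.access.table_lineage
WHERE target_table_full_name IS NOT NULL
GROUP BY target_table_full_name
"""

# In a notebook you would run: display(spark.sql(query))
print(query.strip())
```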
cmathieu
by New Contributor III
  • 31 Views
  • 0 replies
  • 0 kudos

OPTIMIZE command on heavily nested table OOM error

I'm trying to run the OPTIMIZE command on a table with less than 2000 rows, but it is causing an out of memory issue. The problem seems to come from the fact that it is a heavily nested table in staging between a json file and flattened table. The ta...

noorbasha534
by Contributor II
  • 32 Views
  • 0 replies
  • 0 kudos

Events subscription in Databricks delta tables

Dear all, we are maintaining a global/enterprise data platform for a customer. We'd like to capture events based on data streaming happening on Databricks-based Delta tables (data streams run for at least 15 hrs a day, so events should be generated b...

BigAlThePal
by Visitor
  • 34 Views
  • 0 replies
  • 0 kudos

.py file running stuck on waiting

Hello, hope you are doing well. We are facing an issue when running .py files. This is fairly recent, and we were not experiencing this issue last week. As shown in the screenshots below, the .py file hangs on "waiting" after we press "run all". No matt...

[Screenshot attached]
ChristianRRL
by Valued Contributor
  • 80 Views
  • 1 reply
  • 0 kudos

DBX Community Pending Answers

Hi there, in the past I've posted questions in this community and I would consistently get responses back in a very reasonable time frame. Typically I think most of my posts have an initial response back within 1-2 days, or just a few days (I don't t...

Latest Reply
Sujitha
Databricks Employee
  • 0 kudos

Hi @ChristianRRL Thanks for reaching out and for being an active member of the Databricks Community! Your approach to posting is correct, and there haven’t been any major changes. However, for highly technical questions, we sometimes need to consult ...

Carl_B
by New Contributor II
  • 96 Views
  • 1 reply
  • 0 kudos

import openai | challenges

Hello World, I am facing some challenges running OpenAI on Databricks using Azure. I can easily pip install openai. However, problems start once I run the Python line: import openai. After that, I get a series of errors and never-ending error messages....

Latest Reply
User16611530679
Databricks Employee
  • 0 kudos

Hi @Carl_B, good day! I understand that you are trying to import the OpenAI client but that fails. However, can you please try installing as specified below and let us know how it goes?
!pip install --upgrade openai
dbutils.library.restartPython(...

Mado
by Valued Contributor II
  • 4934 Views
  • 1 reply
  • 4 kudos

Error when reading Excel file: "org.apache.poi.ooxml.POIXMLException: Strict OOXML isn't currently supported, please see bug #57699"

Hi, I want to read an Excel "xlsx" file. The Excel file has several sheets and a multi-row header. The original file format was "xlsm" and I changed the extension to "xlsx". I try the following code: filepath_xlsx = "dbfs:/FileStore/Sample_Excel/data.xl...

Latest Reply
Eag_le
Visitor
  • 4 kudos

Copying the data into a new file solved my issue. Likely an issue related to the file's metadata.

dbernabeuplx
by New Contributor
  • 103 Views
  • 2 replies
  • 0 kudos

How to delete/empty notebook output

I need to clear cell output in Databricks notebooks using dbutils or the API. As for my requirements, I need to clear it for data security reasons. That is, given a notebook's PATH, I would like to be able to clear all its outputs, as is done through...

Data Engineering
API
Data
issue
Notebooks
Latest Reply
srinum89
New Contributor
  • 0 kudos

For a programmatic approach, you can also clear each cell's output individually using the IPython package. Unfortunately, you need to do this in each and every cell.
from IPython.display import clear_output
clear_output(wait=True)

1 More Replies
William_Scardua
by Valued Contributor
  • 129 Views
  • 1 reply
  • 0 kudos

Upsert from Databricks to CosmosDB

Hi guys, I'm adjusting a data upsert process from Databricks to CosmosDB using the .jar connector. As the load is very large, do you know if it's possible to change only the fields that have been modified? Best regards

Latest Reply
BigRoux
Databricks Employee
  • 0 kudos

Yes, you can update only the modified fields in your Cosmos DB documents from Databricks using the Partial Document Update feature (also known as Patch API). This is particularly useful for large documents where sending the entire document for update...

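A sketch of what a partial update looks like with the azure-cosmos Python SDK's patch support; the item, container, and field names here are made up for illustration:

```python
def make_replace_ops(changed: dict) -> list:
    """Build Cosmos DB patch operations that replace only changed fields."""
    return [{"op": "replace", "path": f"/{field}", "value": value}
            for field, value in changed.items()]

ops = make_replace_ops({"status": "shipped", "qty": 3})
print(ops)

# With a real azure-cosmos container client you would then run:
# container.patch_item(item="order-1", partition_key="order-1",
#                      patch_operations=ops)
```

This sends only the listed operations over the wire instead of the whole document, which is the point of the Patch API for large documents.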
397973
by New Contributor III
  • 165 Views
  • 1 reply
  • 1 kudos

What's the best way to get from Python dict > JSON > PySpark and apply as a mapping to a dataframe?

I'm migrating code from Python on Linux to Databricks PySpark. I have many mappings like this: {"main": {"honda": 1.0, "toyota": 2.9, "BMW": 5.77, "Fiat": 4.5}}. I exported using json.dump, saved to S3 and was able to import with sp...

[Screenshot attached]
Latest Reply
BigRoux
Databricks Employee
  • 1 kudos

For migrating your Python dictionary mappings to PySpark, you have several good options. Let's examine the approaches and identify the best solution. Using F.create_map (Your Current Approach) Your current approach using `F.create_map` is actually qu...

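The create_map approach the reply endorses can be sketched like this; the Spark part is shown in comments since it needs a running session, and the column names are hypothetical:

```python
from itertools import chain

# The nested dict from the question, flattened into the alternating
# key/value sequence that F.create_map expects as literal arguments.
mapping = {"main": {"honda": 1.0, "toyota": 2.9, "BMW": 5.77, "Fiat": 4.5}}

flat = list(chain.from_iterable(mapping["main"].items()))
print(flat)  # ['honda', 1.0, 'toyota', 2.9, 'BMW', 5.77, 'Fiat', 4.5]

# With Spark available:
# from pyspark.sql import functions as F
# price_map = F.create_map(*[F.lit(x) for x in flat])
# df = df.withColumn("price", price_map[F.col("make")])
```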
Welcome to the Databricks Community!

Once you are logged in, you will be ready to post content, ask questions, participate in discussions, earn badges and more.

Spend a few minutes exploring Get Started Resources, Learning Paths, Certifications, and Platform Discussions.

Connect with peers through User Groups and stay updated by subscribing to Events. We are excited to see you engage!

Join Us as a Local Community Builder!

Passionate about hosting events and connecting people? Help us grow a vibrant local community—sign up today to get started!


Latest from our Blog

Migrating ML Workload to Databricks

As machine learning (ML) workloads continue to grow in complexity and scale, organizations are looking for efficient and scalable solutions to manage their ML lifecycle. Databricks offers a powerful p...

  • 62 Views
  • 0 kudos