Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

siva_pusarla
by Visitor
  • 27 Views
  • 3 replies
  • 0 kudos

workspace notebook path not recognized by dbutils.notebook.run() when running from a workflow/job

result = dbutils.notebook.run("/Workspace/YourFolder/NotebookA", timeout_seconds=600, arguments={"param1": "value1"}); print(result) I was able to execute the above code manually from a notebook, but when I run the same notebook as a job, it fails stat...

Latest Reply
Poorva21
New Contributor II
  • 0 kudos

@siva_pusarla , try converting env_setup into repo-based code and controlling behavior via the environment. Instead of a workspace notebook, use a Python module in the repo and drive environment differences using:
  • Job parameters (see the sketch below)
  • Branches (dev / test / prod)
  • Secre...
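A minimal sketch of the job-parameter approach, assuming a hypothetical parameter named "env" that the job passes in (all names here are illustrative, not from the original thread):

    # Hypothetical: an "env" widget/job parameter drives environment-specific behavior.
    dbutils.widgets.text("env", "dev")   # default for interactive runs
    env = dbutils.widgets.get("env")     # overridden by the job parameter when run as a job

    settings = {
        "dev":  {"catalog": "dev_catalog"},
        "test": {"catalog": "test_catalog"},
        "prod": {"catalog": "prod_catalog"},
    }[env]
    print(f"environment={env}, catalog={settings['catalog']}")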

2 More Replies
Joost1024
by New Contributor
  • 340 Views
  • 6 replies
  • 3 kudos

Read Array of Arrays of Objects JSON file using Spark

Hi Databricks Community! This is my first post in this forum, so I hope you can forgive me if it's not according to the forum best practices. After lots of searching, I decided to share the peculiar issue I'm running into with this community. I try to lo...

Latest Reply
Joost1024
New Contributor
  • 3 kudos

I guess I was a bit overenthusiastic in accepting the answer. When I run the following on the single-object array of arrays (as shown in the original post), I get a single row with column "value" and value null: from pyspark.sql import functions as F,...
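For readers with a similarly shaped file, a minimal sketch of one way to flatten an array-of-arrays-of-objects JSON with explode; the schema, field names, and path below are assumptions, not the original poster's data:

    from pyspark.sql import functions as F, types as T

    # Assumed shape: the whole file is one JSON value like [[{"id": 1}, {"id": 2}], [{"id": 3}]]
    schema = T.ArrayType(T.ArrayType(T.StructType([T.StructField("id", T.LongType())])))

    raw = spark.read.text("/path/to/file.json", wholetext=True)   # read the whole document as a single string
    flat = (raw
            .select(F.from_json("value", schema).alias("outer"))
            .select(F.explode("outer").alias("inner"))            # one row per inner array
            .select(F.explode("inner").alias("obj"))              # one row per object
            .select("obj.*"))
    flat.show()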

5 More Replies
ndw
by New Contributor II
  • 31 Views
  • 1 replies
  • 0 kudos

Azure Databricks Streamlit app Unity Catalog access

Hi all, I am developing a Databricks app and will use Databricks Asset Bundles for deployment. How can I connect a Databricks Streamlit app to Databricks Unity Catalog? Where should I define the credentials? (Databricks host for dev, QA and prod environme...

Latest Reply
emma_s
Databricks Employee
  • 0 kudos

Hi, as a starting point you may want to try deploying the Streamlit starter app from the Apps UI; this will show you the pattern for connecting and pulling data into your Streamlit app. The following gives some best-practice guidelines on your questions: 1. U...
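A minimal sketch of one common connection pattern from a Streamlit app to Unity Catalog via a SQL warehouse using the databricks-sql-connector; the environment variable names and the table are placeholders, and in a deployed app the host/HTTP path would typically come from app resources or bundle configuration rather than hard-coded credentials:

    import os
    import streamlit as st
    from databricks import sql  # databricks-sql-connector

    # Placeholder environment variables; do not hard-code secrets.
    conn = sql.connect(
        server_hostname=os.environ["DATABRICKS_HOST"],
        http_path=os.environ["WAREHOUSE_HTTP_PATH"],
        access_token=os.environ["DATABRICKS_TOKEN"],
    )
    with conn.cursor() as cur:
        cur.execute("SELECT * FROM main.default.some_table LIMIT 10")  # hypothetical UC table
        rows = cur.fetchall()
    st.dataframe(rows)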

liquibricks
by New Contributor III
  • 31 Views
  • 3 replies
  • 1 kudos

Resolved! Comments not updating on an SDP streaming table

We have a pipeline in a job which dynamically creates a set of streaming tables based on a list of Kafka topics, like this: # inside a loop @DP.table(name=table_name, comment=markdown_info) def topic_flow(topic_name=topic_name): ...

Latest Reply
liquibricks
New Contributor III
  • 1 kudos

Ah, my code is correct. There was just a mistake further up when producing the comments that led me down the wrong path. Comments (and metadata) are correctly updated as expected!
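For reference, a minimal sketch of the dynamic streaming-table pattern discussed in this thread, written against the dlt module (the topic list, comments, and Kafka options are placeholders; the post's DP alias is assumed to refer to the same pipelines API):

    import dlt
    from pyspark.sql import functions as F

    topics = ["orders", "payments"]  # placeholder topic list

    for topic in topics:
        @dlt.table(name=f"bronze_{topic}", comment=f"Streaming table for Kafka topic {topic}")
        def topic_flow(topic_name=topic):   # bind the loop variable via a default argument
            return (spark.readStream
                    .format("kafka")
                    .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
                    .option("subscribe", topic_name)
                    .load()
                    .select(F.col("value").cast("string").alias("payload")))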

2 More Replies
Neeraj_432
by Visitor
  • 61 Views
  • 3 replies
  • 1 kudos

Resolved! Loading data from a DataFrame into a Spark SQL table using the .saveAsTable() option is not working.

Hi, I am loading DataFrame data into a Spark SQL table using the .saveAsTable() option. The schema is matching, but the column names are different in the SQL table. Is it necessary to maintain the same column names in source and target? How to handle it in real time...

Latest Reply
iyashk-DB
Databricks Employee
  • 1 kudos

If your pipeline is mostly PySpark/Scala, rename columns in the DataFrame to match the target and use df.write.saveAsTable. If your pipeline is mostly SQL (e.g., on SQL Warehouses), use INSERT … BY NAME from a temp view (or table). Performance is broa...
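A minimal sketch of both options described above; the table, view, and column names are placeholders:

    # Toy DataFrame whose column names differ from the target table.
    df = spark.createDataFrame([(1, "a")], ["src_id", "src_name"])

    # Option 1 (PySpark): rename DataFrame columns to match the target, then save.
    renamed = df.withColumnsRenamed({"src_id": "id", "src_name": "name"})
    renamed.write.mode("append").saveAsTable("main.default.target_table")  # assumes the table exists

    # Option 2 (SQL): let the engine match columns by name instead of position.
    df.createOrReplaceTempView("staging_v")
    spark.sql("INSERT INTO main.default.target_table BY NAME SELECT * FROM staging_v")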

2 More Replies
ScottH
by New Contributor III
  • 145 Views
  • 1 replies
  • 0 kudos

Can I create a serverless budget policy via Python SDK on Azure Databricks?

Hi, I am trying to use the Databricks Python SDK (v0.74.0) to automate the creation of budget policies in our Databricks account. See the Python code below where I am trying to create a serverless budget policy. Note the error. When I click the "Diagn...

[Attached screenshot: ScottH_0-1766168891911.png]
Latest Reply
emma_s
Databricks Employee
  • 0 kudos

Hi, from the documentation I've found internally, as this feature is still in public preview, budget policy creation via the SDK is not currently supported. You can try using the REST API instead; however, this also may not yet be rolled out to ...

vinaykumar
by New Contributor III
  • 11246 Views
  • 9 replies
  • 0 kudos

Log files are not getting deleted automatically after logRetentionDuration interval

Hi team, log files are not getting deleted automatically after the logRetentionDuration interval from the Delta log folder, and after analysis I see checkpoint files are not getting created after 10 commits. Below are the table properties, set using spark.sql( f""" ...

No checkpoint.parquet
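For readers in a similar situation, a minimal sketch of the relevant Delta table properties; the table name and durations are placeholders. Note that old log entries are only cleaned up when a checkpoint is written, which typically happens every 10 commits (delta.checkpointInterval):

    spark.sql("""
        ALTER TABLE main.default.my_table SET TBLPROPERTIES (
            'delta.logRetentionDuration'         = 'interval 7 days',
            'delta.deletedFileRetentionDuration' = 'interval 7 days',
            'delta.checkpointInterval'           = '10'
        )
    """)
    # Inspect the table properties to confirm they took effect.
    spark.sql("SHOW TBLPROPERTIES main.default.my_table").show(truncate=False)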
Latest Reply
alex307
New Contributor II
  • 0 kudos

Did anybody get a solution?

8 More Replies
s_agarwal
by New Contributor
  • 176 Views
  • 1 replies
  • 0 kudos

Queries from Serverless compute referring to older/deleted/vacuumed versions of the Delta tables.

Hi Team, I have a Unity Catalog managed Delta table which I am able to successfully query using the regular compute/cluster options. But when I try to query the same table using a Serverless/SQL Warehouse, it refers to an older version /...

Latest Reply
Saritha_S
Databricks Employee
  • 0 kudos

Hi @s_agarwal, please find below my findings for your query:
  • Serverless uses cached Unity Catalog metadata.
  • Your UC metadata points to an old Delta version.
  • Regular clusters bypass this cache.
  • Fix by refreshing or forcing a UC metadata rewrite (see the sketch below).
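A minimal sketch of that refresh step; the table name is a placeholder:

    # Drop any cached metadata/data for the table before querying it again.
    spark.sql("REFRESH TABLE main.default.my_table")
    spark.sql("SELECT COUNT(*) AS row_count FROM main.default.my_table").show()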

seefoods
by Valued Contributor
  • 164 Views
  • 1 replies
  • 0 kudos

Spark conf for serverless jobs

Hello guys, I use serverless on Databricks Azure, and I have built a decorator which instantiates a SparkSession. My job uses Auto Loader / Kafka with availableNow mode. Does someone know which Spark conf is required, because I want to add it? Thanks. import...

Latest Reply
Saritha_S
Databricks Employee
  • 0 kudos

Hi @seefoods, please find below my findings for your case. You don’t need (and can’t meaningfully add) any Spark conf to enable availableNow on Databricks Serverless. Let me explain clearly, and then show what is safe to do in your decorator. availa...
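A minimal sketch of an Auto Loader stream with availableNow that runs on serverless compute without any extra Spark conf; the paths, source format, and target table are placeholders:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()   # on serverless this returns the managed session

    (spark.readStream
        .format("cloudFiles")
        .option("cloudFiles.format", "json")                                        # placeholder source format
        .option("cloudFiles.schemaLocation", "/Volumes/main/default/chk/schema")    # placeholder schema location
        .load("/Volumes/main/default/landing/")                                     # placeholder input path
        .writeStream
        .option("checkpointLocation", "/Volumes/main/default/chk/demo")             # placeholder checkpoint
        .trigger(availableNow=True)                                                 # process available data, then stop
        .toTable("main.default.bronze_events"))                                     # placeholder target table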

Maxrb
by New Contributor
  • 255 Views
  • 7 replies
  • 2 kudos

pkgutil walk_packages stopped working in DBR 17.2

Hi, after moving from Databricks Runtime 17.1 to 17.2, suddenly my pkgutil.walk_packages doesn't identify any packages within my repository anymore. This is my example code: import pkgutil import os packages = pkgutil.walk_packages([os.getcwd()]) print...
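For context, a runnable version of the snippet above; walk_packages yields ModuleInfo tuples, so the generator is materialized before printing:

    import os
    import pkgutil

    # List packages/modules discoverable from the current working directory.
    packages = list(pkgutil.walk_packages([os.getcwd()]))
    print([(p.name, p.ispkg) for p in packages])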

Latest Reply
Louis_Frolio
Databricks Employee
  • 2 kudos

Hey @Maxrb , Just thinking out loud here, but this might be worth experimenting with. You could try using a Unity Catalog Volume as a lightweight package repository. Volumes can act as a secure, governed home for Python wheels (and JARs), and Databri...
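A minimal sketch of the Volume-based approach; the catalog, schema, volume, and wheel file name are placeholders, and a Python restart may be needed after installing:

    %pip install /Volumes/main/default/libs/my_package-0.1.0-py3-none-any.whl

After the install, the package can be imported like any other library in the notebook session.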

6 More Replies
jpassaro
by New Contributor
  • 139 Views
  • 1 replies
  • 1 kudos

Does Databricks respect the parallel vacuum setting?

I am trying to run VACUUM on a Delta table that I know has millions of obsolete files. Out of the box, VACUUM runs the deletes in sequence on the driver; that is bad news for me! According to the OSS Delta docs, the setting spark.databricks.delta.vacuum.pa...

Latest Reply
Louis_Frolio
Databricks Employee
  • 1 kudos

Greetings @jpassaro, thanks for laying out the context and the links. Let me clarify what’s actually happening here and how I’d recommend moving forward. Short answer: No. On Databricks Runtime, the spark.databricks.delta.vacuum.parallelDelete.enabl...

ismaelhenzel
by Contributor III
  • 65 Views
  • 0 replies
  • 0 kudos

Declarative Pipelines - Dynamic Overwrite

Regarding the limitations of declarative pipelines—specifically the inability to use replaceWhere—I discovered through testing that materialized views actually support dynamic overwrites. This handles several scenarios where replaceWhere would typica...

oye
by New Contributor II
  • 146 Views
  • 3 replies
  • 0 kudos

Unavailable GPU compute

Hello, I would like to create an ML compute with a GPU. I am on GCP europe-west1, and the only available options for me are the G2 family and one instance of the A3 family (a3-highgpu-8g [H100]). I have been trying multiple times at different times but I ...

Latest Reply
SP_6721
Honored Contributor II
  • 0 kudos

Hi @oye, you’re hitting a cloud capacity issue, not a Databricks configuration problem. The Databricks GCP GPU docs list A2 and G2 as the supported GPU instance families. A3/H100 is not in the supported list: https://docs.databricks.com/gcp/en/comput...

2 More Replies
Sunil_Patidar
by New Contributor
  • 133 Views
  • 1 replies
  • 1 kudos

Unable to read from or write to Snowflake Open Catalog via Databricks

I have Snowflake Iceberg tables whose metadata is stored in Snowflake Open Catalog. I am trying to read these tables from the Open Catalog and write back to the Open Catalog using Databricks. I have explored the available documentation but haven’t bee...

Latest Reply
Louis_Frolio
Databricks Employee
  • 1 kudos

Greetings @Sunil_Patidar ,  Databricks and Snowflake can interoperate cleanly around Iceberg today — but how you do it matters. At a high level, interoperability works because both platforms meet at Apache Iceberg and the Iceberg REST Catalog API. Wh...

969091
by New Contributor II
  • 37685 Views
  • 11 replies
  • 10 kudos

Send custom emails from a Databricks notebook without using a third-party SMTP server. Would like to utilize Databricks' existing SMTP or the Databricks API.

We want to use an existing Databricks SMTP server, or see if the Databricks API can be used to send custom emails. Databricks Workflows sends email notifications on success, failure, etc. of jobs but cannot send custom emails. So we want to send custom emails to di...

Latest Reply
Shivaprasad
Contributor
  • 10 kudos

Were you able to get custom email working from a Databricks notebook? I was trying but was not successful. Let me know.

10 More Replies
