Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Ved88
by Databricks Partner
  • 653 Views
  • 2 replies
  • 0 kudos

Power BI VNet data gateway to Databricks using import mode

We are using the Power BI VNet data gateway with a Databricks data source connection in import mode. Databricks is behind a VNet. Refreshing the model works fine for 400 records, but larger volumes throw errors. I tried a different way, kind of increm...

Latest Reply
Ved88
Databricks Partner
  • 0 kudos

Hi @szymon_dybczak, thanks, but that is what we set when we build the Power BI Desktop model. I used this query only, built the semantic model in Power BI Desktop, then published it to the Power BI service and ran the refresh in the web UI; there it is f...

1 More Reply
fkseki
by Contributor
  • 1246 Views
  • 7 replies
  • 7 kudos

Resolved! List budget policies applying filter_by

I'm trying to list budget policies using the "filter_by" parameter to filter policies that start with "aaaa", but I'm getting a "400 Bad Request" error: {'error_code': 'MALFORMED_REQUEST', 'message': "Could not parse request object: Expected 'START_OB...

Latest Reply
fkseki
Contributor
  • 7 kudos

Thanks for the reply, @szymon_dybczak and @lingareddy_Alva. I tried both approaches but neither was successful. url = f'{account_url}/api/2.1/accounts/{account_id}/budget-policies' filter_by_json = json.dumps({"policy_name": "aaaa"}) params = {"filter_by": ...
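A minimal sketch of the request-building pattern from the snippet above, standard library only. The assumption (carried over from the attempt, not confirmed by the thread) is that filter_by travels as a JSON-encoded query parameter; whether the endpoint accepts that shape is exactly what this thread is debugging, so treat this as illustration, not a fix. The account URL and ID are placeholders:

```python
import json
from urllib.parse import urlencode

# Hypothetical values; the account URL and ID below are placeholders.
account_url = "https://accounts.azuredatabricks.net"
account_id = "00000000-0000-0000-0000-000000000000"

url = f"{account_url}/api/2.1/accounts/{account_id}/budget-policies"

# Assumption from the attempt above: filter_by is sent as a
# JSON-encoded query parameter.
filter_by_json = json.dumps({"policy_name": "aaaa"})
full_url = f"{url}?{urlencode({'filter_by': filter_by_json})}"
```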

6 More Replies
ss_data_eng
by New Contributor
  • 1772 Views
  • 4 replies
  • 0 kudos

Using Lakehouse Federation for SQL Server with Serverless Compute

Hi, my team was able to create a Foreign Catalog that connects to a SQL Server instance hosted on an Azure VM. However, when trying to query the catalog, we cannot access it using serverless compute (or a serverless SQL warehouse). We have tried lookin...

Latest Reply
Ralf
New Contributor II
  • 0 kudos

I'm trying to get something similar to work: Lakehouse Federation for Oracle with SQL warehouse serverless. We are using Azure Databricks and our Oracle DB runs on-prem. I've been able to use classic compute to query the database, but now I'd like to...

3 More Replies
orcation
by New Contributor III
  • 2313 Views
  • 3 replies
  • 5 kudos

Resolved! Why Does Azure Databricks Consume So Much Memory When Running in the Background?

I had two Azure Databricks pages open in my browser without performing any computations. When I returned from lunch, I noticed that they were occupying about 80% of the memory in the task manager. What happened? This issue never occurred in the past,...

Snipaste_2025-09-09_13-50-09.png
Latest Reply
dj4
New Contributor II
  • 5 kudos

@szymon_dybczak This issue still exists and is getting worse. Even a 32GB memory & ultra 7 processor laptop cannot seem to handle this issue if there are many cells in the notebook. Do you know when it'll be fixed?

2 More Replies
Joost1024
by New Contributor III
  • 2358 Views
  • 6 replies
  • 4 kudos

Resolved! Read Array of Arrays of Objects JSON file using Spark

Hi Databricks Community! This is my first post in this forum, so I hope you can forgive me if it's not according to the forum best practices. After lots of searching, I decided to share the peculiar issue I'm running into with this community. I try to lo...

Latest Reply
Joost1024
New Contributor III
  • 4 kudos

I guess I was a bit overenthusiastic in accepting the answer. When I run the following on the single-object array of arrays (as shown in the original post), I get a single row with column "value" and value null. from pyspark.sql import functions as F,...

5 More Replies
ndw
by New Contributor III
  • 862 Views
  • 1 reply
  • 1 kudos

Resolved! Azure Databricks Streamlit app Unity Catalog access

Hi all, I am developing a Databricks app. I will use Databricks Asset Bundles for deployment. How can I connect a Databricks Streamlit app to Databricks Unity Catalog? Where should I define the credentials? (Databricks host for dev, QA and prod environme...

Latest Reply
emma_s
Databricks Employee
  • 1 kudos

Hi, as a starting point you may want to try deploying the Streamlit starter app from the app UI; this will show you the pattern to connect and pull data into your Streamlit app. The following then gives some best-practice guidelines on your questions: 1. U...

liquibricks
by Databricks Partner
  • 468 Views
  • 3 replies
  • 3 kudos

Resolved! Comments not updating on a SDP streaming table

We have a pipeline in a job which dynamically creates a set of streaming tables based on a list of Kafka topics, like this:       # inside a loop      @DP.table(name=table_name, comment=markdown_info)      def topic_flow(topic_name=topic_name):       ...
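The topic_name=topic_name default argument in the snippet above is what makes the loop work: it freezes each iteration's topic for its table function. A minimal pure-Python illustration of the late-binding pitfall it avoids (no DLT or Kafka; names are made up):

```python
def make_flows(topics):
    """Create one flow function per topic, mirroring the loop above."""
    flows = []
    for topic in topics:
        # The default argument captures the current value of `topic`;
        # a plain closure over `topic` would late-bind to the last item.
        def topic_flow(topic=topic):
            return f"reading {topic}"
        flows.append(topic_flow)
    return flows

results = [flow() for flow in make_flows(["orders", "payments"])]
# results == ["reading orders", "reading payments"]
```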

Latest Reply
liquibricks
Databricks Partner
  • 3 kudos

Ah, my code is correct. There was just a mistake further up when producing the comments that led me down the wrong path. Comments (and metadata) are correctly updated as expected!

2 More Replies
Neeraj_432
by New Contributor II
  • 631 Views
  • 3 replies
  • 1 kudos

Resolved! Loading data from a DataFrame to a Spark SQL table using .saveAsTable() is not working

Hi, I am loading DataFrame data into a Spark SQL table using the .saveAsTable() option. The schema is matching, but the column names are different in the SQL table. Is it necessary to maintain the same column names in source and target? How to handle it in real time...

Latest Reply
iyashk-DB
Databricks Employee
  • 1 kudos

If your pipeline is mostly PySpark/Scala, rename columns in the DataFrame to match the target and use df.write.saveAsTable. If your pipeline is mostly SQL (e.g., on SQL Warehouses), use INSERT … BY NAME from a temp view (or table). Performance is broa...
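The first suggestion can be sketched as below. The positional-pairing helper is plain Python; the write function assumes a live SparkSession and a target table whose schema already matches (only the names differ), per the question. All names are illustrative:

```python
def build_rename_map(source_cols, target_cols):
    """Pair source column names with target names by position.

    Assumes the schemas already line up and only the names differ.
    """
    if len(source_cols) != len(target_cols):
        raise ValueError("source and target column counts differ")
    return dict(zip(source_cols, target_cols))


def write_aligned(spark, df, target_table):
    """Rename the DataFrame's columns to the target's, then append."""
    target_cols = [f.name for f in spark.table(target_table).schema.fields]
    rename_map = build_rename_map(df.columns, target_cols)
    df.withColumnsRenamed(rename_map).write.mode("append").saveAsTable(target_table)
```

Note that withColumnsRenamed takes the whole mapping at once (Spark 3.4+); on older runtimes, chain withColumnRenamed per column.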

2 More Replies
ScottH
by New Contributor III
  • 660 Views
  • 1 reply
  • 1 kudos

Resolved! Can I create a serverless budget policy via Python SDK on Azure Databricks?

Hi, I am trying to use the Databricks Python SDK (v0.74.0) to automate the creation of budget policies in our Databricks account. See the Python code below, where I am trying to create a serverless budget policy. Note the error. When I click the "Diagn...

ScottH_0-1766168891911.png
Latest Reply
emma_s
Databricks Employee
  • 1 kudos

Hi, from the documentation I've found internally: as this feature is still in public preview, budget policy creation via the SDK is not currently supported. You can try using the REST API instead; however, this also may not yet be rolled out to ...

vinaykumar
by Databricks Partner
  • 12366 Views
  • 9 replies
  • 0 kudos

Log files are not getting deleted automatically after the logRetentionDuration interval

Hi team, log files are not getting deleted automatically after the logRetentionDuration interval from the Delta log folder, and after analysis I see that checkpoint files are not getting created after 10 commits. Below are the table properties, set using spark.sql(    f"""  ...
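For context, the two documented Delta table properties involved here can be set like this (a sketch; the helper and table name are made up, and the interval values are just examples). By default Delta writes a checkpoint every 10 commits, and log files older than delta.logRetentionDuration only become eligible for cleanup once a checkpoint exists:

```python
def set_retention_props(spark, table_name):
    """Sketch: set the two Delta table properties relevant to this thread.

    delta.checkpointInterval controls how often a checkpoint is written;
    delta.logRetentionDuration controls how long log files are kept.
    """
    stmt = (
        f"ALTER TABLE {table_name} SET TBLPROPERTIES ("
        "'delta.logRetentionDuration' = 'interval 7 days', "
        "'delta.checkpointInterval' = '10')"
    )
    spark.sql(stmt)
    return stmt
```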

No checkpoint.parquet
Latest Reply
alex307
New Contributor II
  • 0 kudos

Has anybody found a solution?

8 More Replies
s_agarwal
by New Contributor
  • 476 Views
  • 1 reply
  • 0 kudos

Queries from serverless compute referring to an older/deleted/vacuumed version of the Delta tables

Hi team, I have a Unity Catalog managed Delta table which I am able to successfully query using the regular compute/cluster options. But when I try to query the same table using serverless/SQL Warehouse, they refer to an older version /...

Latest Reply
Saritha_S
Databricks Employee
  • 0 kudos

Hi @s_agarwal, please find my findings for your query below:
  • Serverless uses cached Unity Catalog metadata.
  • Your UC metadata points to an old Delta version.
  • Regular clusters bypass this cache.
  • Fix: refresh or force a UC metadata rewrite.
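The refresh step can be as simple as a REFRESH TABLE statement, which invalidates cached entries so the next query re-resolves the current version (the helper and the three-part table name below are hypothetical):

```python
def refresh_table(spark, fq_table):
    """Invalidate cached metadata/data for a table, e.g. 'catalog.schema.table'."""
    stmt = f"REFRESH TABLE {fq_table}"
    spark.sql(stmt)
    return stmt
```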

seefoods
by Valued Contributor
  • 338 Views
  • 1 reply
  • 0 kudos

Spark conf for serverless jobs

Hello guys, I use serverless on Azure Databricks, and I have built a decorator which instantiates a SparkSession. My job uses Auto Loader / Kafka with availableNow mode. Does someone know which Spark conf is required, because I want to add it? Thanks. import...

Latest Reply
Saritha_S
Databricks Employee
  • 0 kudos

Hi @seefoods  Please find below my findings for your case. You don’t need (and can’t meaningfully add) any Spark conf to enable availableNow on Databricks Serverless. Let me explain clearly, and then show what is safe to do in your decorator. availa...

Maxrb
by New Contributor III
  • 900 Views
  • 7 replies
  • 2 kudos

pkgutil.walk_packages stopped working in DBR 17.2

Hi, after moving from Databricks Runtime 17.1 to 17.2, pkgutil.walk_packages suddenly doesn't identify any packages within my repository anymore. This is my example code: import pkgutil import os packages = pkgutil.walk_packages([os.getcwd()]) print...
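For anyone trying to reproduce this outside Databricks: pkgutil.walk_packages only reports directories that look like importable packages (i.e., containing an __init__.py). A self-contained sanity check with a throwaway package (the package name is made up):

```python
import os
import pkgutil
import tempfile

# Build a throwaway package layout: <tmp>/demo_pkg/__init__.py
with tempfile.TemporaryDirectory() as tmp:
    pkg_dir = os.path.join(tmp, "demo_pkg")
    os.makedirs(pkg_dir)
    open(os.path.join(pkg_dir, "__init__.py"), "w").close()

    # walk_packages scans the given search path, not sys.path
    found = [m.name for m in pkgutil.walk_packages([tmp])]
    # found == ["demo_pkg"]
```

If the equivalent check against os.getcwd() returns an empty list on 17.2 but not 17.1, the working directory (or what the runtime reports as the working directory) is the first thing to inspect.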

Latest Reply
Louis_Frolio
Databricks Employee
  • 2 kudos

Hey @Maxrb , Just thinking out loud here, but this might be worth experimenting with. You could try using a Unity Catalog Volume as a lightweight package repository. Volumes can act as a secure, governed home for Python wheels (and JARs), and Databri...

6 More Replies
jpassaro
by New Contributor
  • 615 Views
  • 1 reply
  • 1 kudos

Does Databricks respect the parallel vacuum setting?

I am trying to run VACUUM on a Delta table that I know has millions of obsolete files. Out of the box, VACUUM runs the deletes in sequence on the driver. That is bad news for me! According to the OSS Delta docs, the setting spark.databricks.delta.vacuum.pa...

Latest Reply
Louis_Frolio
Databricks Employee
  • 1 kudos

Greetings @jpassaro, thanks for laying out the context and the links. Let me clarify what's actually happening here and how I'd recommend moving forward. Short answer: no. On Databricks Runtime, the spark.databricks.delta.vacuum.parallelDelete.enabl...

oye
by New Contributor II
  • 654 Views
  • 3 replies
  • 0 kudos

Unavailable GPU compute

Hello, I would like to create ML compute with a GPU. I am on GCP europe-west1, and the only available options for me are the G2 family and one instance of the A3 family (a3-highgpu-8g [H100]). I have been trying multiple times at different times, but I ...

Latest Reply
SP_6721
Honored Contributor II
  • 0 kudos

Hi @oye ,You’re hitting a cloud capacity issue, not a Databricks configuration problem. The Databricks GCP GPU docs list A2 and G2 as the supported GPU instance families. A3/H100 is not in the supported list: https://docs.databricks.com/gcp/en/comput...

2 More Replies