Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Kirankumarbs
by Contributor III
  • 816 Views
  • 4 replies
  • 2 kudos

Resolved! Serverless notebook idle timeout — is it configurable? What exactly am I paying for? Really Ambiguous

Been running notebooks on serverless compute and watching the indicator in the UI. After my last cell finishes, it goes from dark green to this fading green, sits there for maybe 5-10 minutes, then finally goes grey. Pretty sure I'm paying for that e...

Latest Reply
hali
Visitor
  • 2 kudos

I have the same concern and feedback as the OP. I wish there were a way to set auto-terminate after the serverless cluster has been idle for X minutes, and to not be billed if our users left their notebooks attached to serverless compute and forgot to hit "term...

3 More Replies
MrJava
by New Contributor III
  • 20934 Views
  • 18 replies
  • 13 kudos

How to know who started a job run?

Hi there! We have different jobs/workflows configured in our Databricks workspace running on AWS and would like to know who actually started a job run. Are they started by a user or by a service principal using curl? Currently one can only see who is t...

Latest Reply
saibabu
Visitor
  • 13 kudos

Any update on this feature?

17 More Replies
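One way to answer this today is to fetch the run from the Jobs 2.1 REST API (`/api/2.1/jobs/runs/get`) and inspect its creator and trigger fields. A minimal sketch of reading those fields from a run payload; the sample run objects are hypothetical, and the exact field set should be verified against the API docs for your workspace:

```python
def describe_run_starter(run):
    """Summarize who or what started a job run, given a Jobs API 2.1
    runs/get response dict. Field names here (creator_user_name, trigger)
    are assumptions to check against your workspace's API version."""
    who = run.get("creator_user_name", "unknown")
    how = run.get("trigger", "UNKNOWN")  # e.g. ONE_TIME, PERIODIC, RETRY
    return f"{who} via {how}"

# Hypothetical run payloads, for illustration only:
manual_run = {"creator_user_name": "alice@example.com", "trigger": "ONE_TIME"}
scheduled_run = {"creator_user_name": "svc-etl", "trigger": "PERIODIC"}

print(describe_run_starter(manual_run))     # -> alice@example.com via ONE_TIME
print(describe_run_starter(scheduled_run))  # -> svc-etl via PERIODIC
```

A run started by a service principal typically shows the principal's identity rather than a human user name, which is one way to tell the two cases apart.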
GJ2
by New Contributor II
  • 20448 Views
  • 15 replies
  • 2 kudos

Install the ODBC Driver 17 for SQL Server

Hi, I am not a Data Engineer. I want to connect to SSAS. It looks like it can be connected through pyodbc; however, it looks like I need to install "ODBC Driver 17 for SQL Server" using the following command. How do I install the driver on the cluster an...

GJ2_1-1739798450883.png
Latest Reply
Hubert-Dudek
Databricks MVP
  • 2 kudos

The SQL Server driver is included with Lakehouse Federation, so it is built into Databricks. Install your own only if you need a different version or the built-in one is not working.

14 More Replies
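If you do end up installing your own driver (typically via a cluster-scoped init script), `pyodbc.drivers()` returns the driver names available at runtime. A small sketch of picking the newest SQL Server driver from such a list; the sample driver names below are hypothetical:

```python
import re

def pick_sql_server_driver(installed):
    """Pick the highest-versioned 'ODBC Driver NN for SQL Server' from a
    list of installed ODBC driver names (as pyodbc.drivers() would return).
    Returns None if no matching driver is installed."""
    best, best_ver = None, -1
    for name in installed:
        m = re.match(r"ODBC Driver (\d+) for SQL Server", name)
        if m and int(m.group(1)) > best_ver:
            best, best_ver = name, int(m.group(1))
    return best

# Hypothetical list, standing in for pyodbc.drivers():
drivers = ["SQLite3", "ODBC Driver 17 for SQL Server",
           "ODBC Driver 18 for SQL Server"]
print(pick_sql_server_driver(drivers))  # -> ODBC Driver 18 for SQL Server
```

The selected name would then go into the pyodbc connection string, e.g. `DRIVER={ODBC Driver 18 for SQL Server}`.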
maze2498
by Visitor
  • 57 Views
  • 1 replies
  • 0 kudos

Issue Genie Benchmark: Different responses in UI and Benchmark

Hello, I am trying to add a benchmark dataset for my Genie space. When I ask a question in the Genie space UI directly, I get the right output. However, when I add the same question to the Genie benchmark, the result is quite bad and the sql it use...

Latest Reply
emma_s
Databricks Employee
  • 0 kudos

Hi, when you say the SQL it generates is quite bad and missing, do you mean when you run the benchmark? The benchmark purposefully doesn't have any conversation history, unlike the Genie Space, so sometimes the results can vary. I.e., if you've asked a l...

Subhas1729
by New Contributor
  • 61 Views
  • 1 replies
  • 1 kudos

How to access the catalog and schema from my program

Hi, I am using the SDP editor. I have set the catalog and schema in settings. How do I access those variable values in my program? I am doing as follows: catalog = spark.conf.get("catalog"), and it is similar for schema. When I try to use those vari...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @Subhas1729, the "Default location for data assets" section of the pipeline configuration UI sets the default catalog and schema for a pipeline. This default catalog and schema are used for all dataset definitions and table reads, unless overridden with...

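Since there are no built-in `catalog`/`schema` Spark conf keys to read, one workaround is to define your own keys under the pipeline's `configuration` block and read them with a fallback. A sketch of that pattern; the key names and values are hypothetical, and inside a real pipeline you would call `spark.conf.get` directly instead of passing a dict:

```python
def get_conf(conf, key, default=None):
    """Read a configuration value with a fallback, mimicking
    spark.conf.get(key, default). `conf` is any mapping so the pattern
    can be shown outside a Databricks workspace."""
    return conf.get(key, default)

# Hypothetical pipeline configuration, as set in the pipeline settings:
pipeline_conf = {"my.catalog": "edw_dev", "my.schema": "silver"}

catalog = get_conf(pipeline_conf, "my.catalog", "main")
schema = get_conf(pipeline_conf, "my.schema", "default")
print(f"{catalog}.{schema}")  # -> edw_dev.silver
```

Supplying a default avoids the error you get when `spark.conf.get` is called with a key that was never set.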
Dhruv-22
by Contributor III
  • 71 Views
  • 1 replies
  • 0 kudos

ProfilingError: SPARK_ERROR. Spark encountered an error while refreshing metrics.

I have a table with the following profiling settings: { "status": "MONITOR_STATUS_ACTIVE", "profile_metrics_table_name": "edw_prd_aen.silver.fct_retail_permit_profile_metrics", "drift_metrics_table_name": "edw_prd_aen.silver.fct_retail_permit_drift_me...

Dhruv22_0-1777435414693.png
Latest Reply
stbjelcevic
Databricks Employee
  • 0 kudos

Hi @Dhruv-22, this is a known limitation. Data Profiling monitors don't auto-adapt when columns are added to the source table; the fix is to delete and recreate the monitor. When the monitor is created, the profiling job captures the source schema a...

Lewis
by New Contributor
  • 109 Views
  • 2 replies
  • 2 kudos

Resolved! Server Error: Invalid Request URL

Hello, I've had this pop up a few times this week when trying to run notebooks from within another notebook. It has been quite inconsistent, as some of the referenced notebooks will work, but then for one this will pop up (and the one that doesn't work vari...

Lewis_0-1777454480192.png
Latest Reply
Lewis
New Contributor
  • 2 kudos

Thank you

1 More Replies
MikeGo
by Valued Contributor
  • 113 Views
  • 1 replies
  • 1 kudos

Resolved! Genie space model selection

Hi team, is it possible to specify models for genie space? Thanks.

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @MikeGo, short answer: no, you cannot select the LLM for a native Genie Space; the model is managed entirely by Databricks. Genie uses a compound AI system to interpret business questions and generate answers. Instead of using a single large lang...

nakaxa
by New Contributor
  • 46153 Views
  • 5 replies
  • 1 kudos

Fastest way to write a Spark Dataframe to a delta table

I read a huge array with several columns into memory, then I convert it into a Spark dataframe. When I want to write it to a Delta table using the following command, it takes forever (I have a driver with large memory and 32 workers): df_exp.write.m...

Latest Reply
ShawnRR
New Contributor II
  • 1 kudos

Out of interest, did you try seeing what happens if you break the steps down into something like df.write().format("parquet").mode(SaveMode.Overwrite).save(parquetPath); followed by spark.sql("CREATE TABLE my_delta_table USING DELTA LOCATION '...

4 More Replies
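Slow Delta writes from a single in-memory array are often a partitioning problem: one huge partition writes on one core, while thousands of tiny ones drown the job in overhead. A rough sizing sketch, assuming a common rule of thumb of roughly 128 MB per output file (a tuning assumption, not a Databricks-documented default):

```python
import math

def target_partitions(total_bytes, target_bytes=128 * 1024 * 1024):
    """Rough heuristic: number of partitions to repartition to before a
    write, aiming for ~target_bytes per output file. total_bytes is your
    own estimate of the DataFrame's size."""
    return max(1, math.ceil(total_bytes / target_bytes))

# e.g. a ~10 GiB DataFrame:
n = target_partitions(10 * 1024**3)
print(n)  # -> 80
# Then, in Spark (sketch):
# df_exp.repartition(n).write.format("delta").mode("overwrite").saveAsTable("t")
```

With 32 workers, repartitioning to a multiple of the total core count also helps keep every core busy during the write.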
JUMAN4422
by Databricks Partner
  • 108 Views
  • 1 replies
  • 0 kudos

Resolved! ABAC Policies Not Working on Metric Views

I wanted to check if ABAC (Attribute-Based Access Control) policies can be applied to metric views in Databricks.I have successfully applied ABAC policies on a fact table, and they are working as expected. However, when I query a metric view that use...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @JUMAN4422, yes, this is a limitation. You cannot apply ABAC policies directly to views. Metric views are a special type of view (CREATE VIEW ... WITH METRICS), so this limitation applies to them as well. ABAC requirements, quotas, and limita...

kcyugesh
by New Contributor II
  • 101 Views
  • 2 replies
  • 0 kudos

Unity Catalog storage credential fails although same Access Connector works in another credential

In Azure Databricks Unity Catalog, I have two storage credentials that use the same connector_id / Azure Databricks Access Connector. One credential works and can access ADLS Gen2 successfully, but the other fails with: Failed to access cloud storag...

Latest Reply
zoe_unifeye
Databricks Partner
  • 0 kudos

Hi @kcyugesh, how are you getting on so far? It might also be worth checking the privileges associated with each credential to see if they differ. And secondly, check the credential type on the credential, as a managed identity in comparison to a service...

1 More Replies
HariharaSam
by Databricks Partner
  • 28060 Views
  • 5 replies
  • 2 kudos

Parallel Processing of Databricks Notebook

I have a scenario where I need to run the same Databricks notebook multiple times in parallel. What is the best approach to do this?

Latest Reply
Akshay_Petkar
Valued Contributor
  • 2 kudos

Hi, you can use Databricks Jobs to run the same notebook multiple times in parallel. For this, you can create a Databricks Job for each activity, which allows you to execute the notebook concurrently with different parameters as needed. You can refer t...

4 More Replies
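Besides parallel Jobs, a common in-notebook pattern is a thread pool around `dbutils.notebook.run`, one call per parameter set. A sketch of that pattern; the stub below stands in for `dbutils.notebook.run` (which is real, but only available inside a workspace), and the notebook path and parameters are hypothetical:

```python
from concurrent.futures import ThreadPoolExecutor

def run_notebook(params):
    """Stand-in for dbutils.notebook.run("/Repos/etl/process", 3600, params).
    Here it just echoes its input so the pattern runs anywhere."""
    return f"done:{params['region']}"

# One parameter set per parallel run of the same notebook:
param_sets = [{"region": r} for r in ["us", "eu", "apac"]]

# Run up to 3 copies concurrently; map preserves input order.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(run_notebook, param_sets))

print(results)  # -> ['done:us', 'done:eu', 'done:apac']
```

Threads are enough here because each `dbutils.notebook.run` call mostly waits on the child run; cap `max_workers` so the concurrent child runs don't exhaust cluster resources.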
faruko
by New Contributor II
  • 219 Views
  • 5 replies
  • 5 kudos

Resolved! Best practices for initial large-scale ingestion from on‑premises Oracle to Databricks

Hello everyone, I am responsible for designing and implementing a Lakehouse architecture in an industrial company. I am currently facing some challenges regarding the initial ingestion of data from our on‑premises Oracle database into Databricks. The dat...

Latest Reply
amirabedhiafi
New Contributor II
  • 5 kudos

Hi @faruko! My idea is to treat the initial load as a controlled batch backfill, then start the CDC pipeline afterwards from a clear cutoff point. You define a fixed cutoff timestamp or Oracle SCN for the initial snapshot and later load history in sma...

4 More Replies
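The "load history in smaller batches up to a fixed cutoff" idea from the reply can be sketched as a window generator: split the historical range into bounded slices, load each slice as its own batch, then switch on CDC from the cutoff. The 30-day window size is an assumption to tune against source load limits:

```python
from datetime import date, timedelta

def backfill_windows(start, end, days=30):
    """Split [start, end) into consecutive half-open windows of at most
    `days` days, for loading history in small batches before CDC starts
    at the `end` cutoff."""
    windows = []
    cur = start
    while cur < end:
        nxt = min(cur + timedelta(days=days), end)
        windows.append((cur, nxt))
        cur = nxt
    return windows

# Hypothetical history range with a cutoff of 2024-03-01:
wins = backfill_windows(date(2024, 1, 1), date(2024, 3, 1), days=30)
for lo, hi in wins:
    print(lo, "->", hi)  # each pair becomes one bounded batch extract
```

Each window would then drive one bounded Oracle extract (e.g. a JDBC read filtered on the window's timestamp or SCN range), keeping pressure on the source predictable.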
AlexSantiago
by New Contributor II
  • 17340 Views
  • 24 replies
  • 4 kudos

spotify API get token - raw_input was called, but this frontend does not support input requests.

Hello everyone, I'm trying to use Spotify's API to analyse my music data, but I'm receiving an error during authentication, specifically when I try to get the token. My code is below. Is it a Databricks bug? pip install spotipy / from spotipy.oauth2 import SpotifyO...

Latest Reply
armorycrate
New Contributor
  • 4 kudos

Working with APIs like Spotify can sometimes lead to unexpected errors, especially when authentication or token requests are not handled correctly. Issues like this often relate to how the frontend interacts with backend servi...

23 More Replies
rohit8491
by New Contributor III
  • 8056 Views
  • 4 replies
  • 8 kudos

Azure Databricks Connectivity with Power BI Cloud - Firewall Whitelisting

Hi Support Team, we want to connect to tables in Azure Databricks via Power BI. We are able to connect via Power BI Desktop, but when we try to publish the same, we can see the associated dataset does not refresh and throws an error from Powerbi.com. It...

Latest Reply
LokeshChikuru
Databricks Partner
  • 8 kudos

What is the fix for this? Is this resolved for you?

3 More Replies