Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Kayla
by Valued Contributor II
  • 31 Views
  • 1 reply
  • 0 kudos

JSON Medallion Best Practices

I'm looking at ingesting JSON files from an API, pulling a list of orders. Each JSON file has header information and then a nested array of items. I want to flatten this into a table with one row per item and the header repeated for every item. What is the...

Latest Reply
Coffee77
Contributor III
  • 0 kudos

I would need to know a little more about your scenario, but it reminds me of a similar case I faced. My approach was to use the silver layer to create a Delta table with an enforced schema, standard field names and types, etc. to perform typical actions ...
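
As an illustration of the flattening asked about in the question (one row per item, with the header repeated on every row), here is a minimal plain-Python sketch. The field names `order_id`, `customer`, `items`, `sku`, and `qty` are hypothetical, since the post does not show the actual JSON layout, and this is not the approach from the reply, only a small model of the target shape.

```python
import json


def flatten_order(order: dict, items_key: str = "items") -> list[dict]:
    """Return one row per item, repeating the header fields on each row.

    `items_key` names the nested array; every other top-level key is
    treated as header information. Item keys win on a name collision.
    """
    header = {k: v for k, v in order.items() if k != items_key}
    return [{**header, **item} for item in order.get(items_key, [])]


# Hypothetical payload: two header fields and a nested array of two items.
raw = (
    '{"order_id": 1001, "customer": "acme", '
    '"items": [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 1}]}'
)

rows = flatten_order(json.loads(raw))
for row in rows:
    print(row)
```

On a Spark DataFrame the same shape is usually produced by exploding the array column (for example with `pyspark.sql.functions.explode`) and then selecting the header columns alongside the exploded item fields. Which medallion layer that step belongs in depends on the scenario.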

leenack
by New Contributor
  • 270 Views
  • 8 replies
  • 5 kudos

Resolved! No rows returned when calling Databricks procedure via .NET API and Simba ODBC driver

I created a simple Databricks procedure that should return a single value. "SELECT 1 AS result;" When I call this procedure from my .NET API using ExecuteReader, ExecuteAdapter, or ExecuteScalar, the call completes without any errors, but no rows are r...

Latest Reply
leenack
New Contributor
  • 5 kudos

Thank you @mark_ott and @Coffee77 for your help. This has saved me a great deal of time. I now understand that I need to use procedures, functions, or direct SQL queries as a workaround to retrieve data in the .NET API. I will also keep an eye out...

7 More Replies
hectorfoster
by Visitor
  • 20 Views
  • 1 reply
  • 0 kudos

Passed the DBX Associate Engineer Exam. However, did not receive Digital Certificate

Department of DBX Certification, I passed the Databricks Data Engineer Associate Certified exam, but more than 48 hours have passed and I have not received the certificate. Could you please let me know when I can expect to obtain the certificate? #Certificatio...

Latest Reply
Sat_8
New Contributor III
  • 0 kudos

Congratulations @hectorfoster on achieving your Databricks Certified Associate certification! The official email from Databricks is generally sent within 48 hours. via credentials.databricks.com or raise a ticket with the Databricks Help Center Datab...

SparkPractice
by New Contributor
  • 28 Views
  • 1 reply
  • 1 kudos

Bucketing in DataBricks free edition please help me with the ERROR

Hello guys, I am trying to implement bucketing in Databricks Free Edition. This is the code and error: employee_df.write.format("csv")\ .option("header","true")\ .mode("overwrite")\ .bucketBy(3,"id")\ .option("pat...

Latest Reply
Coffee77
Contributor III
  • 1 kudos

There are a lot of ways to work with files in Databricks, so take a look here: https://docs.databricks.com/aws/en/files/ And then reference the path in the correct way. 

Garrus990
by New Contributor II
  • 1602 Views
  • 3 replies
  • 1 kudos

How to run a python task that uses click for CLI operations

Hey, in my application I am using click to facilitate CLI operations. It works locally, in notebooks, when scripts are run locally, but it fails in Databricks. I defined a task that, as an entrypoint, accepts the file where the click-decorated functio...

Latest Reply
robbe
Contributor
  • 1 kudos

Hi @Garrus990 @Rodra, have you found a solution for this issue? I'm having the same problem on Serverless compute v4. Interestingly, it seems to work on a job cluster with runtime 16.4 LTS.

2 More Replies
AlexSantiago
by New Contributor II
  • 14587 Views
  • 21 replies
  • 4 kudos

spotify API get token - raw_input was called, but this frontend does not support input requests.

Hello everyone, I'm trying to use Spotify's API to analyse my music data, but I'm receiving an error during authentication, specifically when I try to get the token. My code follows. Is it a Databricks bug? pip install spotipy from spotipy.oauth2 import SpotifyO...

Latest Reply
hectorfoster
  • 4 kudos

It appears that there is a problem with user input when using Spotify's API for authentication. The error "raw_input was called, but this frontend does not support input requests" frequently means that interactive input is not supported by the enviro...

20 More Replies
Sahil0007
by New Contributor III
  • 32 Views
  • 1 reply
  • 0 kudos

Issue while reading excel file in qatar region

I have installed the Excel library, version com.crealytics:spark-excel_2.12:3.5.1_0.20.4. When I try to read a file using the code below, I get the following error. Code: df = spark.read.format("com.crealytics.spark.excel") \ .option("header", "true")...

Latest Reply
Louis_Frolio
Databricks Employee
  • 0 kudos

Hello @Sahil0007, thanks for sharing the code and error. This specific error means Spark can’t find the Excel data source on your cluster. What the error means: The message “[DATA_SOURCE_NOT_FOUND] Failed to find the data source: com.crealytics.spark....

africke
by New Contributor
  • 48 Views
  • 1 reply
  • 0 kudos

Cannot view nested MLflow experiment runs without changing URL

Hello, I've recently been testing out Databricks experiments for a project of mine. I wanted to nest runs, and then see these runs grouped by their parent in the experiments UI. For the longest time, I couldn't figure out how to do this. I was seeing ...

Latest Reply
Louis_Frolio
Databricks Employee
  • 0 kudos

Greetings @africke, thanks for the detailed write-up; this is a common point of confusion when moving between local MLflow and the Databricks-managed UI. How to get back to the Runs view (grouped/nested runs): You can always return to the experiment’...

EAnthemNHC1
by New Contributor III
  • 38 Views
  • 2 replies
  • 0 kudos

Time Travel Error when selecting from materialized view (Azure Databricks)

Hey - running into an error this morning that was brought to my attention via failed refreshes from PowerBI. We have a materialized view that, when queried with the standard pattern of 'select col1 from {schema}.table_name', returns an error of 'Cann...

Latest Reply
nayan_wylde
Esteemed Contributor
  • 0 kudos

DESCRIBE HISTORY catalog.schema.table_name; Check the earliest available version. If the version mentioned in the error is older than what’s retained, that’s the issue. Also, inspect the materialized view’s backing pipeline in Catalog Explorer → Refres...

1 More Replies
shashankB
by New Contributor III
  • 48 Views
  • 2 replies
  • 0 kudos

Lakebridge analyzer not able to determine DDL.

The Databricks analyzer does not show any DDL statement count. I've also tested with just a simple SELECT * query (SELECT * FROM SCHEMA_NAME.TABLE_NAME;). Is there any solution for this? My target was to get a detailed analysis of SnowSQL code. Any h...

Latest Reply
saurabh18cs
Honored Contributor II
  • 0 kudos

Hi @shashankB, SELECT is considered DML, not DDL.
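
To make the distinction in this reply concrete, here is a rough sketch that buckets SQL statements by their leading keyword. It is not how the Lakebridge analyzer works internally. The keyword sets and the function names `classify_statement` and `count_by_category` are assumptions for illustration, and SELECT is placed under DML to follow the reply (some taxonomies file it under a separate query category instead).

```python
# Hypothetical keyword sets; real analyzers use a full SQL parser.
DDL_KEYWORDS = {"CREATE", "ALTER", "DROP", "TRUNCATE"}
DML_KEYWORDS = {"SELECT", "INSERT", "UPDATE", "DELETE", "MERGE"}


def classify_statement(sql: str) -> str:
    """Classify one statement as DDL, DML, or OTHER by its first keyword."""
    stripped = sql.strip()
    if not stripped:
        return "OTHER"
    first = stripped.split(None, 1)[0].upper()
    if first in DDL_KEYWORDS:
        return "DDL"
    if first in DML_KEYWORDS:
        return "DML"
    return "OTHER"


def count_by_category(script: str) -> dict:
    """Count statements per category.

    Splitting on ';' is naive: it ignores semicolons inside string literals.
    """
    counts = {"DDL": 0, "DML": 0, "OTHER": 0}
    for statement in script.split(";"):
        if statement.strip():
            counts[classify_statement(statement)] += 1
    return counts


script = "SELECT * FROM SCHEMA_NAME.TABLE_NAME; CREATE TABLE t (id INT);"
counts = count_by_category(script)
print(counts)
```

Under this classification, a script containing only the SELECT from the question has a DDL count of zero, which matches what the poster observed.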

1 More Replies
zoe_unifeye
by New Contributor II
  • 51 Views
  • 1 reply
  • 1 kudos

Building a Theoretical Solar Flare Intelligence System for the Databricks Free Edition Hackathon

I recently built a Theoretical Solar Flare Grid Impact Intelligence System for the Databricks Free Edition Hackathon 2025, and I wanted to share my journey building an end-to-end data engineering and ML solution on Databricks Free Edition. Finding the...

Latest Reply
Raman_Unifeye
New Contributor III
  • 1 kudos

Fabulous submission @zoe_unifeye, and good luck with the hackathon.

Nidhig
by Contributor
  • 34 Views
  • 2 replies
  • 1 kudos

Databricks One- Get option to see objects list

Hi, while working on Databricks One, I feel it would be very helpful to have an option that allows users to easily view the list of tables within a schema or database directly from the UI. This would improve navigation and make it easier to explore a...

Latest Reply
Raman_Unifeye
New Contributor III
  • 1 kudos

The short answer is no. I suppose that is not the purpose or intended usage of Databricks One, as it is meant to be the interface for business users rather than traditional data analysts. Obviously, through Genie you could ask 'Explain Data' to provide ...

1 More Replies
Y_WANG
by New Contributor II
  • 127 Views
  • 2 replies
  • 0 kudos

Resolved! Want to use DataFrame equality functions but also Numpy >= 2.0

In my team, we have a lot of data science workflows using Spark and Pandas. To ensure the stability of these workflows, we need to implement unit tests. Recently, I found out about the DataFrame equality test functions introduced in Spark 3.5, which se...

Latest Reply
ManojkMohan
Honored Contributor II
  • 0 kudos

@Y_WANG The AttributeError you face when importing assertDataFrameEqual from pyspark.testing in Spark 3.5 is caused by Spark's code using the deprecated np.NaN attribute, which was removed in NumPy 2.0 (replaced by np.nan). This break...

1 More Replies
pooja_bhumandla
by New Contributor III
  • 26 Views
  • 1 reply
  • 0 kudos

Best Practice for Updating Data Skipping Statistics for Additional Columns

Hi Community, I have a scenario where I’ve already calculated Delta statistics for the first 32 columns after enabling the data skipping property. Now, I need to include 10 more frequently used columns that were not part of the original 32. Goal: I want ...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @pooja_bhumandla, updating either of the two options below does not automatically recompute statistics for existing data. Rather, it affects how future statistics are collected when data is added to or updated in the table. - delta.dataSkippingNumInd...

Elm8r
by New Contributor
  • 31 Views
  • 1 reply
  • 0 kudos

Databricks Connect: use local virtual environment

I have a simple Python script in my local development environment and a related uv virtual env. I am trying to run the script on a Databricks cluster using my venv, but even if I select it in the Python Environment section, it is not actually using it,...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @Elm8r, are you using VS Code?
