Data Engineering

Forum Posts

by Pratikmsbsvm • Contributor

18m ago

4 Views
0 replies
0 kudos

Data Pipeline for Bringing Data from Oracle Fusion to Azure Databricks

I am trying to bring Oracle Fusion (SCM, HCM, Finance) data into ADLS Gen2. Databricks is used for data transformation and Power BI for report visualization. I have 3 options. Option 1: Option 2: Option 3. Could someone please help me with which is bes...

by Kayla • Valued Contributor II

yesterday

61 Views
1 replies
0 kudos

JSON Medallion Best Practices

I'm looking at ingesting JSON files from an API, pulling a list of orders. Each JSON file has header information and then a nested array of items. I want to flatten this into a table with 1 row per item and the header repeated for every item. What is the...
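The flattening described in this post (one row per item, with the header repeated) can be sketched in plain Python. This is only an illustration under assumptions: the field names `order_id`, `customer`, `items`, `sku`, and `qty`, and the helper `flatten_order`, are hypothetical, since the post does not show the actual JSON layout.

```python
from typing import Any


def flatten_order(order: dict[str, Any], items_key: str = "items") -> list[dict[str, Any]]:
    """Return one row per nested item, repeating the header fields on each row."""
    # Everything except the nested array is treated as header information.
    header = {key: value for key, value in order.items() if key != items_key}
    rows = []
    for item in order.get(items_key, []):
        # Prefix item fields so they cannot collide with header field names.
        item_fields = {f"item_{key}": value for key, value in item.items()}
        rows.append({**header, **item_fields})
    return rows


# Hypothetical sample payload; the real API response is not shown in the post.
order = {
    "order_id": 1001,
    "customer": "ACME",
    "items": [
        {"sku": "A1", "qty": 2},
        {"sku": "B7", "qty": 1},
    ],
}

rows = flatten_order(order)
for row in rows:
    print(row)
```

In PySpark the same shape is usually produced by applying `explode` to the array column and selecting the header columns alongside it; the plain-Python version above only shows the intended row layout.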

Latest Reply

Coffee77
Contributor III

yesterday

0 kudos

I would need to know a little more about your scenario, but it reminds me of a similar case I faced. My approach was to use the silver layer to create a Delta table with an enforced schema, standard field names and types, etc., to perform typical actions ...

by leenack • New Contributor

Tuesday

310 Views
8 replies
5 kudos

Resolved! No rows returned when calling Databricks procedure via .NET API and Simba ODBC driver

I created a simple Databricks procedure that should return a single value: "SELECT 1 AS result;" When I call this procedure from my .NET API using ExecuteReader, ExecuteAdapter, or ExecuteScalar, the call completes without any errors, but no rows are r...

Latest Reply

leenack
New Contributor

yesterday

5 kudos

Thank you @mark_ott and @Coffee77 for your help. This has saved me a great deal of time. I now understand that I need to use procedures, functions, or direct SQL queries as a workaround to retrieve data in the .NET API. I will also keep an eye out...

7 More Replies

by hectorfoster • New Contributor

yesterday

45 Views
1 replies
0 kudos

Passed the DBX Associate Engineer Exam. However, did not receive Digital Certificate

Department of DBX Certification, I passed the Databricks Certified Data Engineer Associate exam, but it has been more than 48 hours and I have not received the certificate. Could you please let me know when I can expect to obtain the certificate? #Certificatio...

Latest Reply

Sat_8
New Contributor III

yesterday

0 kudos

Congratulations @hectorfoster on achieving your Databricks Certified Associate certification! The official email from Databricks is generally sent within 48 hours. Check via credentials.databricks.com or raise a ticket with the Databricks Help Center Datab...

by SparkPractice • New Contributor

yesterday

57 Views
1 replies
1 kudos

Bucketing in DataBricks free edition please help me with the ERROR

Hello guys, I am trying to implement bucketing in Databricks Free Edition. This is the code and error: employee_df.write.format("csv")\ .option("header","true")\ .mode("overwrite")\ .bucketBy(3,"id")\ .option("pat...

Latest Reply

Coffee77
Contributor III

yesterday

1 kudos

There are a lot of ways to work with files in Databricks, so take a look here: https://docs.databricks.com/aws/en/files/ And then reference the path in the correct way.

by Garrus990 • New Contributor II

10-16-2024 5:44:18 AM

1614 Views
3 replies
1 kudos

How to run a python task that uses click for CLI operations

Hey, in my application I am using click to facilitate CLI operations. It works locally, in notebooks, when scripts are run locally, but it fails in Databricks. I defined a task that, as an entrypoint, accepts the file where the click-decorated functio...

Latest Reply

robbe
Contributor

yesterday

1 kudos

Hi @Garrus990 @Rodra, have you guys found a solution for this issue? I'm having the same problem on Serverless compute v4. Interestingly enough, it seems to work on a job cluster with runtime 16.4 LTS.

2 More Replies

by AlexSantiago • New Contributor II

09-10-2022 11:40:01 PM

14610 Views
21 replies
4 kudos

spotify API get token - raw_input was called, but this frontend does not support input requests.

Hello everyone, I'm trying to use Spotify's API to analyse my music data, but I'm receiving an error during authentication, specifically when I try to get the token. My code is below. Is it a Databricks bug? pip install spotipy from spotipy.oauth2 import SpotifyO...

Latest Reply

hectorfoster
New Contributor

yesterday

4 kudos

It appears that there is a problem with user input when using Spotify's API for authentication. The error "raw_input was called, but this frontend does not support input requests" frequently means that interactive input is not supported by the enviro...

20 More Replies

by Sahil0007 • New Contributor III

Thursday

44 Views
1 replies
0 kudos

Issue while reading excel file in qatar region

I have installed the Excel library, version com.crealytics:spark-excel_2.12:3.5.1_0.20.4. When I try to read a file using the code below, I get the following error. Code: df = spark.read.format("com.crealytics.spark.excel") \ .option("header", "true")...

Latest Reply

Louis_Frolio
Databricks Employee

Friday

0 kudos

Hello @Sahil0007, thanks for sharing the code and error. This specific error means Spark can’t find the Excel data source on your cluster. What the error means: The message “[DATA_SOURCE_NOT_FOUND] Failed to find the data source: com.crealytics.spark....

by africke • New Contributor

Thursday

60 Views
1 replies
0 kudos

Cannot view nested MLflow experiment runs without changing URL

Hello, I've recently been testing out Databricks experiments for a project of mine. I wanted to nest runs, and then see these runs grouped by their parent in the experiments UI. For the longest time, I couldn't figure out how to do this. I was seeing ...

Latest Reply

Louis_Frolio
Databricks Employee

Friday

0 kudos

Greetings @africke, thanks for the detailed write-up; this is a common point of confusion when moving between local MLflow and the Databricks-managed UI. How to get back to the Runs view (grouped/nested runs): You can always return to the experiment’...

by EAnthemNHC1 • New Contributor III

Friday

48 Views
2 replies
0 kudos

Time Travel Error when selecting from materialized view (Azure Databricks)

Hey - running into an error this morning that was brought to my attention via failed refreshes from PowerBI. We have a materialized view that, when queried with the standard pattern of 'select col1 from {schema}.table_name', returns an error of 'Cann...

Latest Reply

nayan_wylde
Esteemed Contributor

Friday

0 kudos

DESCRIBE HISTORY catalog.schema.table_name; Check the earliest available version. If the version mentioned in the error is older than what’s retained, that’s the issue. Also, inspect the materialized view’s backing pipeline in Catalog Explorer → Refres...

1 More Replies

by shashankB • New Contributor III

Friday

54 Views
2 replies
0 kudos

Lakebridge analyzer not able to determine DDL.

The Databricks analyzer does not show any DDL statement count. I've also tested with just a simple SELECT * query (SELECT * FROM SCHEMA_NAME.TABLE_NAME;). Is there any solution for this? My target was to get a detailed analysis of SnowSQL code. Any h...

Latest Reply

saurabh18cs
Honored Contributor II

Friday

0 kudos

Hi @shashankB, SELECT is considered DML, not DDL.
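To make the distinction in this reply concrete, here is a minimal sketch that sorts statements by their leading keyword. It follows the reply's convention of counting SELECT as DML. The keyword sets and the helper `classify_statement` are assumptions made for illustration; this is not how the Lakebridge analyzer works internally.

```python
# Illustrative keyword sets; real SQL dialects have more statement types.
DDL_KEYWORDS = {"CREATE", "ALTER", "DROP", "TRUNCATE"}
DML_KEYWORDS = {"SELECT", "INSERT", "UPDATE", "DELETE", "MERGE"}


def classify_statement(sql: str) -> str:
    """Classify a single SQL statement as 'DDL', 'DML', or 'OTHER' by its first word."""
    words = sql.strip().split(None, 1)
    if not words:
        return "OTHER"
    first_word = words[0].upper()
    if first_word in DDL_KEYWORDS:
        return "DDL"
    if first_word in DML_KEYWORDS:
        return "DML"
    return "OTHER"


statements = [
    "SELECT * FROM SCHEMA_NAME.TABLE_NAME;",
    "CREATE TABLE SCHEMA_NAME.TABLE_NAME (id INT);",
]
for statement in statements:
    print(classify_statement(statement), "<-", statement)
```

Under this convention, a file containing only SELECT statements would be expected to report a DDL count of zero.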

1 More Replies

by zoe_unifeye • New Contributor II

Thursday

63 Views
1 replies
1 kudos

Building a Theoretical Solar Flare Intelligence System for the Databricks Free Edition Hackathon

I recently built a Theoretical Solar Flare Grid Impact Intelligence System for the Databricks Free Edition Hackathon 2025, and I wanted to share my journey building an end-to-end data engineering and ML solution on Databricks Free Edition. Finding the...

Latest Reply

Raman_Unifeye
New Contributor III

Friday

1 kudos

Fabulous submission @zoe_unifeye, and good luck with the hackathon.

by Nidhig • Contributor

Friday

40 Views
2 replies
1 kudos

Databricks One- Get option to see objects list

Hi, While working on Databricks One, I feel it would be very helpful to have an option that allows users to easily view the list of tables within a schema or database directly from the UI. This would improve navigation and make it easier to explore a...

Latest Reply

Raman_Unifeye
New Contributor III

Friday

1 kudos

The short answer is no, and I suppose that is not the purpose or intended usage of Databricks One, as it is meant to be the interface for business users rather than traditional data analysts. Obviously, through Genie you could ask 'Explain Data' to provide ...

1 More Replies

by Y_WANG • New Contributor II

a week ago

136 Views
2 replies
0 kudos

Resolved! Want to use DataFrame equality functions but also Numpy >= 2.0

In my team, we have a lot of data science workflows using Spark and Pandas. To ensure the stability of these workflows, we need to implement unit tests. Recently, I found out about the DataFrame equality test functions introduced in Spark 3.5 which se...

Latest Reply

ManojkMohan
Honored Contributor II

a week ago

0 kudos

@Y_WANG The root cause of the AttributeError you face when importing assertDataFrameEqual from pyspark.testing in Spark 3.5 is Spark's code using the deprecated np.NaN attribute, which was removed in NumPy 2.0 (replaced by np.nan). This break...
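A quick way to see whether an environment is affected by the removal this reply describes is to check for the legacy alias directly. The sketch below is an assumption-laden diagnostic, not a fix: the helper name `numpy_nan_alias_status` is hypothetical, and it only reports whether `np.NaN` is still present in the installed NumPy.

```python
import importlib


def numpy_nan_alias_status() -> str:
    """Report whether the installed NumPy still exposes the legacy np.NaN alias."""
    try:
        np = importlib.import_module("numpy")
    except ImportError:
        return "numpy not installed"
    # hasattr returns False when NumPy raises AttributeError for the removed alias.
    if hasattr(np, "NaN"):
        return "np.NaN available"
    return "np.NaN removed; use np.nan"


status = numpy_nan_alias_status()
print(status)
```

If the alias is reported as removed, code that still references `np.NaN` will raise the AttributeError described in the reply.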

1 More Replies

by pooja_bhumandla • New Contributor III

Friday

39 Views
1 replies
0 kudos

Best Practice for Updating Data Skipping Statistics for Additional Columns

Hi Community, I have a scenario where I’ve already calculated Delta statistics for the first 32 columns after enabling the data skipping property. Now, I need to include 10 more frequently used columns that were not part of the original 32. Goal: I want ...

Latest Reply

szymon_dybczak
Esteemed Contributor III

Friday

0 kudos

Hi @pooja_bhumandla, Updating either of the two options below does not automatically recompute statistics for existing data. Rather, it affects the behavior of future statistics collection when adding or updating data in the table. - delta.dataSkippingNumInd...

Databricks Community
