Data Engineering

Forum Posts

by vijaykumarbotla • New Contributor III

05-29-2023 9:15:23 AM

5992 Views
5 replies
1 kudos

Resolved! Getting error: AnalysisException: Column Is There a PO#17748 are ambiguous. It's probably because you joined several Datasets together, and some of these Datasets are the same. This column points to one of the Datasets but Spark.

AnalysisException: Column Is There a PO#17748 are ambiguous. It's probably because you joined several Datasets together, and some of these Datasets are the same. This column points to one of the Datasets but Spark is unable to figure out which one. ...

Latest Reply

vijaykumarbotla
New Contributor III

05-31-2023 6:56:24 AM

1 kudos

Hi all, the solution for this problem is very strange. It was caused by the version of the Databricks Runtime. We are using Runtime version 7.0 with Apache Spark 3.0.0. In PRD we are using Runtime version 11.3 LTS with Apache Spark 3.3.0 ver...

by StephanieAlba • Databricks Employee

03-22-2023 7:30:08 AM

12662 Views
4 replies
2 kudos

Resolved! How do I download and unzip datasets from Kaggle into DBFS?

Latest Reply

Debayan
Databricks Employee

03-22-2023 10:43:00 PM

2 kudos

Hi, you can refer to https://docs.databricks.com/files/unzip-files.html. You can curl the file you want and then unzip it as described in the doc. Please let us know if this helps. Also, please tag @Debayan with your next update which will n...
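
The unzip step described in this reply could look roughly like the sketch below. It assumes the archive has already been downloaded (for example with curl) and only covers extraction, using Python's standard zipfile module. The helper name unzip_to_dir and all paths are hypothetical. On Databricks the target directory might be a path under /dbfs/, but that depends on how the cluster exposes DBFS and is an assumption here, not something the thread confirms.

```python
import tempfile
import zipfile
from pathlib import Path
from typing import List


def unzip_to_dir(archive_path: str, target_dir: str) -> List[str]:
    """Extract every member of a zip archive into target_dir.

    Returns the member names stored in the archive. Both arguments are
    plain filesystem paths; nothing here is Databricks-specific.
    """
    target = Path(target_dir)
    # Create the target directory (and parents) if it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target)
        return archive.namelist()


if __name__ == "__main__":
    # Self-contained demo: build a tiny archive in a temp dir, then extract it.
    # In practice archive_path would point at the downloaded dataset instead.
    with tempfile.TemporaryDirectory() as tmp:
        archive_path = Path(tmp) / "dataset.zip"  # hypothetical file name
        with zipfile.ZipFile(archive_path, "w") as zf:
            zf.writestr("train.csv", "id,label\n1,0\n")
        extracted = unzip_to_dir(str(archive_path), str(Path(tmp) / "unzipped"))
        print(extracted)
```

Extracting to a local directory first and copying the files afterwards is another option if writing directly to the final location is slow or not permitted.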

by quakenbush • Contributor

01-18-2023 7:16:21 AM

5646 Views
3 replies
4 kudos

Resolved! Does Databricks offer something like Oracle's dblink?

I am aware that I can load anything into a DataFrame using JDBC; that works well from Oracle sources. Is there an equivalent in Spark SQL, so I can combine datasets as well? Basically something like so - you get the idea... select lt.field1, rt.fie...

Latest Reply

quakenbush
Contributor

01-19-2023 12:35:48 AM

4 kudos

Thanks everyone for helping.

by Kavin • New Contributor II

11-08-2022 6:07:34 PM

2341 Views
1 reply
2 kudos

Issue converting the datasets into JSON

I'm a newbie to Databricks. I need to convert the data sets into JSON. I tried both FOR JSON AUTO and FOR JSON PATH; however, I'm getting an issue - [PARSE_SYNTAX_ERROR] Syntax error at or near 'json'line My Query works fine without FOR JSON AUTO AND FOR...

Latest Reply

Debayan
Databricks Employee

11-08-2022 11:03:28 PM

2 kudos

Hi @Kavin Natarajan, could you please go through https://www.tutorialkart.com/apache-spark/spark-write-dataset-to-json-file-example/ ? It looks like the steps are okay.

by Geeya • New Contributor II

09-22-2021 12:36:52 PM

2162 Views
1 reply
0 kudos

After several iterations of filter and union, the data is bigger than spark.driver.maxResultSize

The process for me to build the model is:
1. Filter the dataset and split it into two datasets
2. Fit the model based on the two datasets
3. Union the two datasets
4. Repeat steps 1-3
The problem is that after several iterations, the model fitting time becomes dramatically longer, and the...

Latest Reply

Ryan_Chynoweth
Esteemed Contributor

09-22-2021 1:11:44 PM

0 kudos

I assume that you are using PySpark to train a model? It sounds like you are collecting data on the driver and likely need to increase the size. Can you share any code?
