Data Engineering

Forum Posts

by KVNARK • Honored Contributor II

01-18-2023 10:55:19 PM

3305 Views
1 reply
5 kudos

Accessing a Power BI dataset with an MDX query works on Windows but not from Python on a Linux server

We are trying to access an SSAS Power BI dataset with an MDX query from Python on a Linux server, and we are hitting a roadblock. The existing setup works as expected on Windows because of adodb.dll, but we are unable to connect from Linux. Any help would be much appreciated...

Latest Reply

Anonymous
Not applicable

04-10-2023 8:09:56 AM

5 kudos

@KVNARK: One potential solution would be to use an open-source MDX library for Python that can connect to SSAS, such as OLAP-XMLA for Python. This library can be used to execute MDX queries against an SSAS server, including Power BI datasets. Here's...

by Vijay_Bhau • New Contributor II

03-12-2023 11:22:30 PM

4309 Views
4 replies
3 kudos

Hello Team, I am not able to find the bookstore dataset in Databricks. Please guide me on how to download this dataset.

Latest Reply

Anonymous
Not applicable

03-17-2023 11:24:27 PM

3 kudos

Hi @Vijay Gadhave, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Than...

3 More Replies

by arz • New Contributor

09-17-2022 12:59:39 PM

2793 Views
0 replies
0 kudos

PySpark job with joins & write parquet operation fails with FetchFailedException

I'm working on a task where I transform a dataset and re-save it to an S3 bucket. This involves joining the dataset to two others, dropping fields from the initial dataset which overlapped with fields from the other two, hashing certain fields with p...

by Niha1 • New Contributor III

08-23-2022 10:06:33 PM

1551 Views
0 replies
1 kudos

Unable to install the Airbnb dataset when running the "Scalable ML" notebook: "AnalysisException: Path does not exist"

file_path = f"{datasets_dir}/airbnb/sf-listings/sf-listings-2019-03-06-clean.parquet/"
airbnb_df = spark.read.format("parquet").load(file_path)

display(airbnb_df)

AnalysisException: Path does not exist: dbfs:/user/nniha9188@gmail.com/dbacademy/machi...
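A quick way to narrow down a "Path does not exist" error is to check whether the path is there before reading it. The sketch below is only an illustration, not the course's own setup code: `path_exists` is a hypothetical helper, and it assumes it is handed a listing function such as `dbutils.fs.ls` from a Databricks notebook, which raises an error mentioning `FileNotFoundException` when the path is missing.

```python
def path_exists(ls, path):
    """Return True if `ls(path)` succeeds, False if the path is missing.

    `ls` is any listing callable; in a Databricks notebook it would
    typically be `dbutils.fs.ls` (an assumption, not shown in the post).
    """
    try:
        ls(path)
        return True
    except Exception as exc:
        # Treat "not found" errors as a missing path. The string check
        # covers JVM-side errors that surface as a generic Python exception.
        if isinstance(exc, FileNotFoundError) or "FileNotFoundException" in str(exc):
            return False
        raise  # anything else (e.g. permissions) should still surface
```

In a notebook this could be called as `path_exists(dbutils.fs.ls, file_path)`. If it returns False, the dataset has probably not been copied to that location yet, so the read will fail whatever read options are used.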

by tanin • Contributor

02-06-2022 12:49:08 AM

1459 Views
0 replies
1 kudos

After converting from RDD to Dataset, unit tests run 3x slower (but prod is faster)

I converted a data job from RDD to Dataset and found that, in prod, the job runs faster, which is nice. But the unit tests run 3x slower than before. My best guess is that Dataset spends time on a lot of things like encoding, optimizing, query...
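One commonly cited reason for Dataset/DataFrame jobs being slow on tiny test inputs is that Spark SQL defaults to 200 shuffle partitions, which adds a lot of scheduling overhead for a handful of rows. The post does not confirm that this is the cause here, so treat the following as a sketch of something to try. It is written in PySpark, and `TEST_SPARK_CONF` and `build_test_session` are hypothetical names; the same settings can be applied from Scala.

```python
# Settings that tend to matter for small, local unit tests.
TEST_SPARK_CONF = {
    # Default is 200; one partition is usually enough for test-sized data.
    "spark.sql.shuffle.partitions": "1",
    # The web UI is rarely needed in unit tests and costs startup time.
    "spark.ui.enabled": "false",
}


def build_test_session(app_name="unit-tests"):
    """Create (or reuse) a small local SparkSession for tests."""
    # Imported lazily so this module can be loaded without PySpark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.master("local[1]").appName(app_name)
    for key, value in TEST_SPARK_CONF.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

Sharing one session across the whole test suite, rather than creating one per test, also avoids paying the startup cost repeatedly.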

by FemiAnthony • New Contributor III

11-05-2021 2:45:52 AM

5753 Views
4 replies
3 kudos

Resolved! Location of customer_t1 dataset

Can anyone tell me how I can access the customer_t1 dataset that is referenced in the book "Delta Lake - The Definitive Guide"? I am trying to follow along with one of the examples.

Latest Reply

Hubert-Dudek
Esteemed Contributor III

11-05-2021 7:41:44 AM

3 kudos

Some files are visualized here: https://github.com/vinijaiswal/delta_time_travel/blob/main/Delta%20Time%20Travel.ipynb but it is quite strange that there is no source in the repository. I think the only way is to write to Vini Jaiswal on GitHub.

3 More Replies

by Geeya • New Contributor II

09-22-2021 12:36:52 PM

2319 Views
1 reply
0 kudos

After several iterations of filter and union, the data is bigger than spark.driver.maxResultSize

The process for me to build the model is:
1. Filter the dataset and split it into two datasets.
2. Fit a model based on the two datasets.
3. Union the two datasets.
4. Repeat steps 1-3.
The problem is that after several iterations, the model fitting time becomes dramatically longer, and the...
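A loop that keeps filtering and unioning the same DataFrame makes its query plan grow with every iteration, which is one plausible reason for fitting times that climb over time. The post is cut off, so this is a guess rather than a diagnosis. A minimal sketch of periodically truncating the lineage, assuming `df` is a PySpark DataFrame and `step` is a hypothetical callable standing in for one filter/fit/union round:

```python
def run_iterations(df, step, n_iters, checkpoint_every=3):
    """Apply `step` to `df` n_iters times, truncating lineage periodically.

    `df` is expected to be a PySpark DataFrame; `step` takes a DataFrame
    and returns the DataFrame to use in the next round.
    """
    for i in range(1, n_iters + 1):
        df = step(df)
        if i % checkpoint_every == 0:
            # localCheckpoint materializes the data on the executors and
            # drops the accumulated plan, so later rounds start from a
            # short lineage instead of the full history.
            df = df.localCheckpoint(eager=True)
    return df
```

`localCheckpoint` trades fault tolerance for speed. `DataFrame.checkpoint()`, with a checkpoint directory set via `spark.sparkContext.setCheckpointDir(...)`, is the reliable variant if executors may be lost.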

Latest Reply

Ryan_Chynoweth
Databricks Employee

09-22-2021 1:11:44 PM

0 kudos

I assume that you are using PySpark to train a model? It sounds like you are collecting data on the driver and likely need to increase the size. Can you share any code?

by Anonymous • Not applicable

06-10-2021 9:16:57 PM

18578 Views
3 replies
0 kudos

How large should a dataset be so that it’s worth using Spark?

Latest Reply

User16857281974
Databricks Employee

07-30-2021 3:30:20 PM

0 kudos

@Ryan Chynoweth and @Sean Owen are both right, but I have a different perspective on this. Quick side note: you can also configure your cluster to execute with only a driver, thus reducing the cost to that of the cheapest single VM available. In the cl...

2 More Replies

by Anonymous • Not applicable

06-02-2021 5:25:49 PM

1102 Views
0 replies
0 kudos

How large is considered a “large” dataset to put on the driver node?

by PraveenKumarB • New Contributor

04-24-2019 7:08:28 AM

9500 Views
5 replies
0 kudos

java.io.IOException: No FileSystem for scheme: null

Getting the error when trying to load the uploaded file in a Python notebook.

# File location and type
file_location = "//FileStore/tables/data/d1.csv"
file_type = "csv"

# CSV options
infer_schema = "true"
first_row_is_header = "false"
delimiter = ","

# The app...
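One thing worth checking in the snippet above is the leading double slash in `file_location`. Under generic URI rules, a string starting with `//` has no scheme and treats the next segment as an authority (host) rather than as part of the path, which would fit an error complaining about a null scheme. The sketch below uses Python's `urllib.parse` purely as an analogy. It is an assumption that Hadoop's path handling splits the string the same way, and `describe` is a hypothetical helper.

```python
from urllib.parse import urlsplit


def describe(location):
    """Split a location string into (scheme, authority, path)."""
    parts = urlsplit(location)
    return (parts.scheme, parts.netloc, parts.path)


for loc in (
    "//FileStore/tables/data/d1.csv",      # as posted: "FileStore" becomes the authority
    "/FileStore/tables/data/d1.csv",       # single slash: a plain absolute path
    "dbfs:/FileStore/tables/data/d1.csv",  # explicit scheme
):
    print(loc, "->", describe(loc))
```

If this is indeed the cause, using a single leading slash or an explicit `dbfs:/` prefix would be the first thing to try.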

Latest Reply

DivyanshuBhatia
New Contributor II

11-22-2020 6:29:46 AM

0 kudos

@naughtonelad, if your issue is solved, please let me know, as I am facing the same problem.

4 More Replies