Data Engineering

Forum Posts

by Erik_L • Contributor II

04-20-2023 4:22:59 PM

2685 Views
3 replies
1 kudos

Resolved! How to keep data in time-based localized clusters after joining?

I have a bunch of data frames from different data sources. They are all time series data in order of a column timestamp, which is an int32 Unix timestamp. I can join them together by this and another column join_idx which is basically an integer inde...

Latest Reply

Anonymous
Not applicable

04-20-2023 7:16:25 PM

1 kudos

@Erik Louie: If the data frames have different time zones, you can use Databricks' timezone conversion functions to convert them to a common time zone. You can use the from_utc_timestamp or to_utc_timestamp function to convert the timestamp column to ...
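As a rough illustration of the "common time zone" idea in the reply above, here is a minimal plain-Python sketch. It does not use the Spark functions named in the reply; the helper names `unix_to_utc` and `local_wall_clock_to_utc` and the fixed UTC offset are assumptions made for this example only.

```python
from datetime import datetime, timedelta, timezone


def unix_to_utc(ts: int) -> datetime:
    """Interpret an integer Unix timestamp as a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(ts, tz=timezone.utc)


def local_wall_clock_to_utc(naive_local: datetime, utc_offset_hours: float) -> datetime:
    """Convert a naive local wall-clock time to UTC.

    The fixed offset is a simplification (hypothetical for this sketch);
    real time zones also need daylight-saving rules.
    """
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    return naive_local.replace(tzinfo=local_tz).astimezone(timezone.utc)


# One source stores Unix seconds, another stores local wall-clock times at UTC-7.
a = unix_to_utc(0)
b = local_wall_clock_to_utc(datetime(2023, 4, 20, 16, 22, 59), -7)
```

Once every source is expressed in UTC, the timestamps can be compared or joined directly.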

2 More Replies

by rubenteixeira • New Contributor III

01-09-2023 7:16:03 AM

3944 Views
2 replies
0 kudos

Can't parallelize model training with sc.parallelize, even though I can run the same code without parallelizing

I'm training a NeuralProphet model for a time series forecasting problem. I'm trying to parallelize my training, but this error is appearing. The folder lightning_logs has a hparams.yaml, but it's empty. Is this related to permissions on the cluster? Thanks i...

Latest Reply

Debayan
Databricks Employee

01-09-2023 2:07:40 PM

0 kudos

Hi, please let us know if this was checked already:

1 More Replies

by Erik • Valued Contributor III

12-16-2021 11:23:08 AM

14575 Views
12 replies
8 kudos

Grafana + databricks = True?

We have some time series in Databricks, and we are reading them into Power BI through SQL compute endpoints. For time series, Power BI is ... not optimal. Earlier I have used Grafana with various backends, and quite like it, but I can't find any way to con...

Latest Reply

cold_river_22
New Contributor II

12-02-2022 10:42:20 AM

8 kudos

There is now an open-source Grafana Databricks backend plugin available: https://github.com/mullerpeter/databricks-grafana

11 More Replies

by RRO • Contributor

03-31-2022 3:12:14 AM

33276 Views
6 replies
7 kudos

Resolved! Performance for pyspark dataframe is very slow after using a @pandas_udf

Hello, I am currently working on time series forecasting with FBProphet. Since I have data with many time series groups (~3000), I use a @pandas_udf to parallelize the training. @pandas_udf(schema, PandasUDFType.GROUPED_MAP) def forecast_netprofit(pr...
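The grouped-map pattern mentioned in the post (split the rows by group, run one function per group, concatenate the results) can be sketched in plain Python as below. This is not the poster's code and does not use Spark or Prophet; `grouped_map`, `naive_forecast`, and the sample rows are hypothetical names and data chosen for illustration, with a group mean standing in for model training.

```python
from collections import defaultdict
from statistics import mean


def grouped_map(rows, key, fn):
    """Split rows by `key`, apply `fn` to each group, and concatenate the outputs."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    out = []
    for group_key in sorted(groups):
        out.extend(fn(group_key, groups[group_key]))
    return out


def naive_forecast(group_key, group_rows):
    """Hypothetical stand-in for per-group model training: predict the group mean."""
    avg = mean(r["y"] for r in group_rows)
    return [{"group": group_key, "yhat": avg}]


rows = [
    {"group": "a", "y": 1.0},
    {"group": "a", "y": 3.0},
    {"group": "b", "y": 10.0},
]
forecasts = grouped_map(rows, "group", naive_forecast)
```

In a distributed setting each group would be handled by a separate worker; here the groups are processed sequentially to keep the sketch self-contained.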

Latest Reply

RRO
Contributor

04-12-2022 8:01:24 AM

7 kudos

Thank you for the answers. Unfortunately, this did not solve the performance issue. What I did now is save the results into a table: results.write.mode("overwrite").saveAsTable("db.results"). This is probably not the best solution, but after I do that ...

5 More Replies

Databricks Community
