Topics with Label: Koalas Dataframe

Forum Posts

by Qarol • New Contributor

12-13-2022 5:14:32 AM

686 Views
2 replies
0 kudos

Work around for The method `pd.groupby.GroupBy.prod()` is not implemented yet.

I have a database with two columns: name (str) and probability (float). I am running this command: df[['name','probability']].groupby('name').prod() on a Databricks (runtime 7.3) notebook and df is a koalas dataframe. The error I get is: PandasNotImpleme...

Data Engineering

Latest Reply

alenka
New Contributor II

07-31-2023 7:40:55 AM

0 kudos

The same trouble

1 More Replies
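
One possible workaround for the missing `prod()`, sketched here under assumptions rather than taken from the thread: for strictly positive values such as probabilities, a per-group product equals the exponential of the per-group sum of logarithms, and `sum()` is implemented for Koalas groupbys. The helper name `groupby_prod_via_logs` is hypothetical. The example runs on plain pandas so it can be checked locally; it assumes the same calls (`np.log` and `np.exp` applied to a column, then `groupby(...).sum()`) behave the same way on a Koalas DataFrame, which should be verified on the runtime in use.

```python
import numpy as np
import pandas as pd


def groupby_prod_via_logs(df, key, value):
    """Per-group product of strictly positive values without calling prod().

    Uses the identity prod(x) == exp(sum(log(x))), valid only when x > 0.
    Hypothetical helper; written against the pandas API that Koalas mirrors.
    """
    tmp = df[[key, value]].copy()
    # Replace each value by its natural logarithm.
    tmp[value] = np.log(tmp[value])
    # Summing the logs per group and exponentiating yields the product.
    return np.exp(tmp.groupby(key)[value].sum())


# Small pandas example with the column names from the question.
pdf = pd.DataFrame(
    {"name": ["a", "a", "b"], "probability": [0.5, 0.4, 0.25]}
)
result = groupby_prod_via_logs(pdf, "name", "probability")
print(result)
```

Zeros and negative values need separate handling (for example, tracking signs and zero counts per group), and long products can lose a little precision through the log/exp round trip.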

by PrebenOlsen • New Contributor III

08-26-2022 4:31:47 AM

1553 Views
4 replies
1 kudos

GroupBy in delta live tables fails with error "RuntimeError: Query function must return either a Spark or Koalas DataFrame"

I have a delta live table that I'm trying to run GroupBy on, but I'm getting an error: "RuntimeError: Query function must return either a Spark or Koalas DataFrame". Here is my code: @dlt.table def groups_hierarchy(): df = dlt.read_stream("groups_h...

Data Engineering

Latest Reply

Vidula
Honored Contributor

09-15-2022 3:33:44 AM

1 kudos

Hi @Preben Olsen, does @Debayan Mukherjee's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? We'd love to hear from you. Thanks!

3 More Replies

by Anonymous • Not applicable

06-02-2021 4:34:38 PM

653 Views
1 reply
0 kudos

Resolved! Converting between Pandas to Koalas

When and why should I convert between a pandas and a Koalas DataFrame? What are the implications?

Data Engineering

Latest Reply

Ryan_Chynoweth
Honored Contributor III

06-04-2021 4:31:00 AM

0 kudos

Koalas DataFrames are distributed across a Databricks cluster, in the same way that Spark DataFrames are. pandas DataFrames live only in memory on the Spark driver. If you are a pandas user and are using a multi-node cluster, then you should use Koalas to p...
