Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Hi, I am facing a problem that I hope to get some help understanding. I have created a function that is supposed to check if the input data already exists in a saved Delta table; if not, it should run some calculations and append the new data to...
There is a built-in Databricks display() function (see documentation here) which allows users to display R or SparkR dataframes in a clean, human-readable manner, where users can scroll to see all the columns and sort on them. S...
I found that the display() function returned this issue when it came across date-type fields that were NULL. The following function seemed to fix the problem:
library(tidyverse)
library(lubridate)
display_fixed = function(df) {
df %>%
...
Hi,
The DataFrame display method in a Databricks notebook fetches only 1000 rows by default. Is there a way to change this default in order to display and download the full result (more than 1000 rows) in Python?
Thanks,
Ratnakar.
The display method doesn't have an option to choose the number of rows. Use the show method instead; it is not as neat, and you can't do visualizations or downloads with it.
I have uploaded a CSV file which has well-formatted data, and I was trying to use display(questions), where
questions = spark.read.option("header","true").csv("/FileStore/tables/Questions.csv")
This is throwing an error as follows: SparkException: Job abo...
I have been studying Apache Spark in Databricks Academy and I don't understand why the whole list is not displayed. Creation of widgets:
dbutils.widgets.text("name", "Brickster", "Name")
dbutils.widgets.multiselect("colors","orange", ["orange", "r...
Hello folks! I am calling display() on a streaming query sourced from a Delta table. The output from display() shows the new rows added to the source table, but as soon as the results hit 1000 rows, the output stops updating. As a r...
An aggregate function followed by the timestamp field sorted in descending order did the trick:
streaming_df.groupBy("field1", "time_field").max("field2").orderBy(col("time_field").desc()).display()
I wrote the following code:
data = spark.sql (" SELECT A_adjClose, AA_adjClose, AAL_adjClose, AAP_adjClose, AAPL_adjClose FROM deltabase.a_30min_delta, deltabase.aa_30min_delta, deltabase.aal_30min_delta, deltabase.aap_30min_delta ,deltabase.aapl_30m...
I just discovered a solution. Today, I opened Azure Databricks. When I imported Python libraries, Databricks told me that toPandas() was deprecated and suggested I use toPandas. The following solution works: use toPandas instead of toPandas() da...
I had this issue when displaying pandas data frames. Any ideas on how to display a pandas dataframe?
display(mydataframe)
Exception: Cannot call display(<class 'pandas.core.frame.DataFrame'>)
A simple way to get a nicely formatted table from a pandas dataframe:
displayHTML(df.to_html())
to_html has some parameters you can control the output with. If you want something less basic, try out this code that I wrote that adds scrolling and some ...
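For completeness, here is a runnable sketch of the to_html() approach (pandas only; displayHTML is a Databricks notebook function, so this example just builds the HTML string it would receive):

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})

# to_html() produces an HTML <table>; parameters such as index=False
# or max_rows control the output.
html = df.to_html(index=False)

# In a Databricks notebook you would then render it with:
# displayHTML(html)
print(html[:60])
```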
Hi everyone, I am using SSH tunnelling with SSHTunnelForwarder to reach a target AWS RDS PostgreSQL database. The connection goes through; however, when I try to display the retrieved data frame, it always throws a "connection refused" error. Please see ...
Hi @Kurnianto Trilaksono Sutjipto, this seems like a connectivity issue with the URL you are trying to connect to. It fails during the display() command because read is a lazy transformation and will not be executed right away. On the other hand,...
Hi, I have problems with displaying and saving a table in Databricks. A simple command can run for hours without any progress. Before that I am not doing any rocket science: the code runs in less than a minute, and I have one join at the end. I am using 7.3 ...
Hi @Just Magy, what is your data source? What type of lazy transformations and actions do you have in your code? Do you partition your data? Please provide more details.
The widget is not shown when I use dbutils, while it works perfectly with SQL. For example,
%sql
CREATE WIDGET TEXT state DEFAULT "CA"
This one shows me the widget.
dbutils.widgets.text("name", "Brickster", "Name")
dbutils.widgets.multiselect("colors", "oran...
Yep, I figured out the issue now. Both of you gave the right information to solve the problem. My first mistake was, as Jacob mentioned, that `date` is actually a dataframe object here. To get the string date, I had to do something similar to what Amine suggested. S...
When using display, runs of more than one space in strings are collapsed. Can we change that behaviour?
Are there any options for the display function?
code example:
display( spark.createDataFrame( [ ( 'a a' , 'a a' ) ], [ 'string_column', 'string_column_2' ] )...
Plots generated via the display() command are automatically saved under /FileStore/plots. See the documentation for more info: https://docs.databricks.com/data/filestore.html#filestore. However, perhaps an easier approach to save/revisit plots is to u...
I am using Seaborn version 0.7.1 and matplotlib version 1.5.3
The following code does not display a graph in the end. Any idea how to resolve this? (It works in the Python CLI on my local computer.)
import seaborn as sns
sns.set(style="darkgrid")
tips = sns.lo...
I found that you can create a comparison plot similar to what you get from seaborn by using display(sparkdf) and adding multiple columns to the 'Values' section while creating a 'Scatter plot'. You get to 'Customize Plot' by clicking on the icon ...
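If the built-in plot UI isn't enough, a common pattern is to grab the figure object and hand it to the notebook explicitly. A sketch with plain matplotlib (which seaborn draws on; display(fig) exists only inside Databricks notebooks, so here the figure is simply saved instead):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, for running outside a notebook
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.scatter([1, 2, 3], [4, 5, 6])
ax.set_title("scatter demo")

# In a Databricks notebook you could render the figure with:
# display(fig)
# Outside a notebook, saving it is the simplest check:
fig.savefig("/tmp/scatter_demo.png")
```

The same idea applies to seaborn: its plotting calls draw onto the current matplotlib figure, which plt.gcf() returns.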