Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Christine
by Contributor II
  • 7975 Views
  • 9 replies
  • 5 kudos

Resolved! PySpark dataframe empties after it has been saved to Delta Lake.

Hi, I am facing a problem that I hope to get some help understanding. I have created a function that is supposed to check whether the input data already exists in a saved Delta table and, if not, run some calculations and append the new data to...

Latest Reply
SharathE
New Contributor III

Hi, I'm also having a similar issue. Does creating a temp view and reading it again after saving to a table work?
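A minimal sketch of that read-back idea, using hypothetical table and dataframe names rather than anything from the thread:

new_data.write.format("delta").mode("append").saveAsTable("my_table")  # hypothetical table name

# Re-read the table after the write so later queries and display() calls
# operate on the persisted data rather than a stale in-memory reference.
refreshed_df = spark.read.table("my_table")
display(refreshed_df)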

8 More Replies
SG_
by New Contributor II
  • 3171 Views
  • 1 reply
  • 2 kudos

How to display Sparklyr table in a clean readable format similar to the output of display()?

There is a built-in Databricks display() function (see documentation here) which allows users to display an R or SparkR dataframe in a clean, human-readable manner where the user can scroll to see all the columns and sort on them. S...

Latest Reply
rich_goldberg
New Contributor II

I found that the display() function ran into this issue when it came across date-type fields that were NULL. The following function seemed to fix the problem: library(tidyverse) library(lubridate) display_fixed = function(df) { df %>% ...

ratnakarsinha
by New Contributor II
  • 19948 Views
  • 3 replies
  • 0 kudos

How to get the full result using the DataFrame.display method

Hi, the DataFrame.display method in a Databricks notebook fetches only 1000 rows by default. Is there a way to change this default to display and download the full result (more than 1000 rows) in Python? Thanks, Ratnakar.

Latest Reply
ramravi
Contributor II

The display method doesn't have an option to choose the number of rows. Use the show method instead; it is not as neat and you can't do visualizations or downloads, but it lets you control how many rows are printed.
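A small sketch of that show()-based workaround; the dataframe name and row count here are illustrative:

# Print up to 2000 rows as plain text without truncating long values.
# Unlike display(), this is not limited to 1000 rows, but there is no
# sortable table, chart builder, or CSV download.
df.show(n=2000, truncate=False)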

2 More Replies
SindhujaRaghupa
by New Contributor II
  • 8356 Views
  • 2 replies
  • 1 kudos

Job aborted due to stage failure: Task 0 in stage 4.0 failed 1 times, most recent failure: Lost task 0.0 in stage 4.0 (TID 4, localhost, executor driver): java.lang.NullPointerException

I have uploaded a CSV file which has well-formatted data, and I was trying to use display(questions) where questions = spark.read.option("header","true").csv("/FileStore/tables/Questions.csv"). This is throwing an error as follows: SparkException: Job abo...

Latest Reply
SS2
Valued Contributor

You can use the inferSchema option.
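A sketch of what that might look like for the CSV read from the original post, using the standard Spark CSV reader options:

# Ask Spark to infer column types instead of reading every column as a string.
questions = (spark.read
    .option("header", "true")
    .option("inferSchema", "true")
    .csv("/FileStore/tables/Questions.csv"))

display(questions)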

1 More Replies
Deiry
by New Contributor III
  • 1032 Views
  • 0 replies
  • 0 kudos

Why is the whole list not displayed in dbutils.widgets.multiselect?

I have been studying Apache Spark in Databricks Academy and I don't understand why the whole list is not displayed. Creation of widgets: dbutils.widgets.text("name", "Brickster", "Name") dbutils.widgets.multiselect("colors", "orange", ["orange", "r...
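For reference, a runnable version of those widget calls; the list of colors is illustrative, since the full list is cut off in the post:

# Text widget: (name, default value, label)
dbutils.widgets.text("name", "Brickster", "Name")

# Multiselect widget: (name, default value, choices, label).
# The default value must be one of the choices; dbutils.widgets.get("colors")
# returns the currently selected values as a comma-separated string.
dbutils.widgets.multiselect(
    "colors", "orange", ["orange", "red", "green", "blue"], "Colors"
)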

MrsBaker
by Databricks Employee
  • 1286 Views
  • 1 reply
  • 1 kudos

display() not updating after 1000 rows

Hello folks! I am calling display() on a streaming query sourced from a delta table. The output from display() displays the new rows added to the source table. But as soon as the output results hit 1000 rows, the output is not updated anymore. As a r...

Latest Reply
MrsBaker
Databricks Employee

An aggregate function followed by the timestamp field sorted in descending order did the trick: streaming_df.groupBy("field1", "time_field").max("field2").orderBy(col("time_field").desc()).display()
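Expanded into a self-contained sketch (the column names are the placeholders from the reply, not real fields):

from pyspark.sql.functions import col

# Aggregate the streaming dataframe and sort by the timestamp column in
# descending order so the newest results stay visible in the display() output.
(streaming_df
    .groupBy("field1", "time_field")
    .max("field2")
    .orderBy(col("time_field").desc())
    .display())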

Dicer
by Valued Contributor
  • 19527 Views
  • 12 replies
  • 13 kudos

Resolved! Failed to convert Spark.sql to Pandas Dataframe using .toPandas()

I wrote the following code: data = spark.sql(" SELECT A_adjClose, AA_adjClose, AAL_adjClose, AAP_adjClose, AAPL_adjClose FROM deltabase.a_30min_delta, deltabase.aa_30min_delta, deltabase.aal_30min_delta, deltabase.aap_30min_delta, deltabase.aapl_30m...

Latest Reply
Dicer
Valued Contributor

I just discovered a solution. Today, when I opened Azure Databricks and imported Python libraries, Databricks told me that toPandas() was deprecated and suggested using toPandas. The following solution works: use toPandas instead of toPandas(). da...

11 More Replies
sdaza
by New Contributor III
  • 22806 Views
  • 12 replies
  • 5 kudos

Displaying Pandas Dataframe

I had this issue when displaying pandas data frames. Any ideas on how to display a pandas dataframe? display(mydataframe) Exception: Cannot call display(<class 'pandas.core.frame.DataFrame'>)

Latest Reply
Tim_Green
New Contributor II

A simple way to get a nicely formatted table from a pandas dataframe: displayHTML(df.to_html()). to_html has some parameters you can control the output with. If you want something less basic, try out this code that I wrote that adds scrolling and some ...
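A minimal, self-contained sketch of that approach, with a made-up dataframe:

import pandas as pd

pdf = pd.DataFrame({"name": ["a", "b", "c"], "value": [1, 2, 3]})

# Render the pandas dataframe as an HTML table in the notebook output.
# to_html() accepts parameters such as max_rows and float_format to tune it.
displayHTML(pdf.to_html())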

11 More Replies
antoooks
by New Contributor III
  • 2686 Views
  • 2 replies
  • 4 kudos

Resolved! display() function always returns connection refused on tunneling despite successfully retrieving the schema

Hi everyone, I am using SSH tunnelling with SSHTunnelForwarder to reach a target AWS RDS PostgreSQL database. The connection got through; however, when I try to display the retrieved data frame it always throws a "connection refused" error. Please see ...

Latest Reply
jose_gonzalez
Databricks Employee

Hi @Kurnianto Trilaksono Sutjipto, this seems like a connectivity issue with the URL you are trying to connect to. It fails during the display() command because the read is a lazy transformation and will not be executed right away. On the other hand,...
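To illustrate the lazy-evaluation point, a sketch with placeholder connection details (not the poster's actual configuration):

# Building the dataframe may only fetch the schema over the tunnel;
# the bulk of the read is deferred until an action runs.
df = (spark.read.format("jdbc")
    .option("url", "jdbc:postgresql://localhost:5432/mydb")  # placeholder URL
    .option("dbtable", "public.some_table")                  # placeholder table
    .option("user", "user")
    .option("password", "password")
    .load())

# The deferred read executes here, so a "connection refused" from the
# database can surface at display() rather than at load time.
display(df)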

1 More Replies
magy
by New Contributor
  • 2589 Views
  • 3 replies
  • 0 kudos

Display, count and write commands stuck after 1st job

Hi, I have problems with displaying and saving a table in Databricks. A simple command can run for hours without any progress. Before that I am not doing any rocket science: the code runs in less than a minute, and I have one join at the end. I am using 7.3 ...

Latest Reply
jose_gonzalez
Databricks Employee

Hi @Just Magy, what is your data source? What kinds of lazy transformations and actions do you have in your code? Do you partition your data? Please provide more details.

2 More Replies
daindana
by New Contributor III
  • 10480 Views
  • 3 replies
  • 3 kudos

Resolved! Why doesn't my notebook display widgets when I use 'dbutils' while it is displayed with '%sql CREATE WIDGET'?

The widget is not shown when I use dbutils, while it works perfectly with SQL. For example, %sql CREATE WIDGET TEXT state DEFAULT "CA" shows me the widget. dbutils.widgets.text("name", "Brickster", "Name") dbutils.widgets.multiselect("colors", "oran...

Latest Reply
daindana
New Contributor III

Hello, Ryan! For some reason, this problem is solved, and now it is working perfectly! I did nothing new, but it is just working now. Thank you!:)

2 More Replies
User16844444140
by New Contributor II
  • 3130 Views
  • 3 replies
  • 0 kudos

Why does the display name of widgets not match the specified name in SQL?

However, I have no problem accessing the widget with the specified name.

Latest Reply
User16844444140
New Contributor II

Yep, I figured out the issue now. Both of you gave the right information to solve the problem. My first mistake was, as Jacob mentioned, that `date` is actually a dataframe object here. To get the string date, I had to do something similar to what Amine suggested. S...

2 More Replies
lycenok
by New Contributor II
  • 903 Views
  • 0 replies
  • 0 kudos

display function eats consecutive spaces

When using display, runs of more than one space in strings are collapsed. Can we change that behaviour? Are there any options for the display function? Code example: display( spark.createDataFrame( [ ( 'a a' , 'a a' ) ], [ 'string_column', 'string_column_2' ] )...
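The truncated example from the post, completed into a runnable form (extra spaces added so the effect is visible):

# Two string columns whose values contain consecutive spaces.
df = spark.createDataFrame(
    [("a  a", "a   a")],
    ["string_column", "string_column_2"],
)

# display() renders an HTML table, where repeated spaces may appear collapsed;
# show() prints plain text and preserves the spacing.
display(df)
df.show(truncate=False)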

User16752241457
by New Contributor II
  • 1942 Views
  • 1 reply
  • 0 kudos

Saving display() plots

Is there an easy way I can save the plots generated by the display() command?

Latest Reply
User16788317454
New Contributor III

Plots generated via the display() command are automatically saved under /FileStore/plots. See the documentation for more info: https://docs.databricks.com/data/filestore.html#filestore. However, perhaps an easier approach to save/revisit plots is to u...
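As a related sketch (not taken from the reply itself), a matplotlib figure can also be written to DBFS explicitly so it can be revisited or downloaded later; the path is illustrative:

import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([1, 2, 3], [2, 4, 8])

# /dbfs/... is the driver's local-file view of DBFS; files under /FileStore
# can later be fetched through the workspace's /files/ download path.
fig.savefig("/dbfs/FileStore/plots/my_plot.png")  # illustrative path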

SergeyIvanchuk
by New Contributor
  • 10252 Views
  • 4 replies
  • 0 kudos

Seaborn plot display in Databricks

I am using Seaborn version 0.7.1 and matplotlib version 1.5.3. The following code does not display a graph in the end. Any idea how to resolve this? (It works in the Python CLI on my local computer.) import seaborn as sns sns.set(style="darkgrid") tips = sns.lo...
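A commonly suggested workaround (not stated in this thread, so treat it as an assumption) is to pass the underlying matplotlib figure to the notebook's display(); a sketch:

import seaborn as sns
import matplotlib.pyplot as plt

sns.set(style="darkgrid")
tips = sns.load_dataset("tips")
ax = sns.boxplot(x="day", y="total_bill", data=tips)

# Hand the matplotlib figure to display() so the plot is rendered inline
# instead of being silently dropped.
display(ax.get_figure())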

Latest Reply
AbbyLemon
New Contributor II

I found that you can create a comparison plot similar to what you get from seaborn by using display(sparkdf) and adding multiple columns to the 'Values' section while creating a 'Scatter plot'. You get to 'Customize Plot' by clicking on the icon ...

3 More Replies