Topics with Label: Spark checkpoint

by alejandrofm • Valued Contributor

03-31-2022 7:39:01 AM

6234 Views
8 replies
9 kudos

Resolved! Pandas.spark.checkpoint() doesn't break lineage

Hi, I'm doing something simple in a Databricks notebook:

    spark.sparkContext.setCheckpointDir("/tmp/")
    import pyspark.pandas as ps
    sql = ("""select field1, field2 From table Where date>='2021-01-01'""")
    df = ps.sql(sql)
    df.spark.checkpoint()

That...

Data Engineering


Latest Reply

annafina
New Contributor II

11-21-2024 6:34:04 AM

9 kudos

checkpoint() returns a checkpointed DataFrame, so you need to assign it to a new variable: checkpointedDF = df.checkpoint()
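To illustrate why the return value matters, here is a minimal pure-Python sketch. `ToyFrame` and its step names are hypothetical stand-ins invented for this example, not Spark classes; it only models the idea that a checkpoint call hands back a new object and leaves the original untouched.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ToyFrame:
    """Hypothetical stand-in for a lazily evaluated DataFrame (not a Spark class)."""

    lineage: Tuple[str, ...]  # names of the pending transformations

    def transform(self, step: str) -> "ToyFrame":
        # Transformations never mutate; they return a frame with a longer plan.
        return ToyFrame(self.lineage + (step,))

    def checkpoint(self) -> "ToyFrame":
        # Return a NEW frame whose plan starts from the saved data.
        # `self` keeps its full lineage, mirroring what the reply describes.
        return ToyFrame(("checkpointed_data",))


df = ToyFrame(("read_table",)).transform("filter_by_date")

df.checkpoint()  # result discarded: df still carries its full lineage
print(df.lineage)

checkpointed_df = df.checkpoint()  # result kept: this frame has the short lineage
print(checkpointed_df.lineage)
```

Applied to the notebook in the question, the same idea would presumably read `df = df.spark.checkpoint()`, since the pandas-on-Spark accessor also returns the checkpointed DataFrame rather than modifying `df` in place.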


by RohanB • New Contributor III

01-27-2022 2:39:37 AM

5697 Views
8 replies
3 kudos

Resolved! Spark Streaming - Checkpoint State EOF Exception

I have a Spark Structured Streaming job which reads from 2 Delta tables as streams, processes the data, and then writes to a 3rd Delta table. The job is being run with the Databricks service on GCP. Sometimes the job fails with the following exception...

Data Engineering


Latest Reply

RohanB
New Contributor III

02-15-2022 4:27:03 AM

3 kudos

Hi @Jose Gonzalez, do you require any more information regarding the code? Any idea what could be the cause of the issue? Thanks and regards, Rohan


Databricks Community
