<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Salesforce Bulk API 2.0 not getting all rows from large table in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/salesforce-bulk-api-2-0-not-getting-all-rows-from-large-table/m-p/134674#M50168</link>
<description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/174238"&gt;@AlanDanque&lt;/a&gt;&amp;nbsp;I am working on a similar use case and will share screenshots shortly.&lt;/P&gt;&lt;P&gt;But to get to the root cause, could you share the details below?&lt;/P&gt;&lt;TABLE width="503px"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD width="192.609px"&gt;Checks at Salesforce&lt;/TD&gt;&lt;TD width="309.391px"&gt;Description&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="192.609px"&gt;Header used?&lt;/TD&gt;&lt;TD width="309.391px"&gt;Was Sforce-Enable-PKChunking: chunkSize=250000 explicitly included in the job request header?&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="192.609px"&gt;Header honored?&lt;/TD&gt;&lt;TD width="309.391px"&gt;Do Salesforce logs show a chunked job with multiple batch IDs, or was only one batch returned?&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="192.609px"&gt;Logs?&lt;/TD&gt;&lt;TD width="309.391px"&gt;Does the job show status Completed while the result set is only one file?&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="192.609px"&gt;Object supported?&lt;/TD&gt;&lt;TD width="309.391px"&gt;&lt;A href="https://developer.salesforce.com/docs/atlas.en-us.api_asynch.meta/api_asynch/async_api_headers_enable_pk_chunking.htm" target="_blank"&gt;Not all standard or custom objects support PK chunking; confirm in the Salesforce docs.&lt;/A&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;TABLE width="552px"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD width="235px"&gt;Checks at Databricks&lt;/TD&gt;&lt;TD width="316px"&gt;Description&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="235px"&gt;File Count Check&lt;/TD&gt;&lt;TD width="316px"&gt;Check whether the number of result files (CSV chunks) is greater than 1. If there is only one file, chunking likely didn't happen or the job was not split correctly. Use: dbutils.fs.ls("/mnt/tmp/salesforce_chunks/")&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="235px"&gt;Row Count Validation&lt;/TD&gt;&lt;TD width="316px"&gt;After ingestion, check that the row count in the Delta table is close to the expected ~13M. A record count of ~250K indicates silent truncation. Use: df.count()&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="235px"&gt;Chunk Metadata Logging&lt;/TD&gt;&lt;TD width="316px"&gt;Log the number of records per chunk/file during ingestion. This helps detect dropped or corrupted chunks. Log: filename, record count, chunk ID (if available)&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="235px"&gt;Failed Chunk Detection&lt;/TD&gt;&lt;TD width="316px"&gt;Look for missing or partial chunk downloads. If Salesforce returns 4 result files and only 3 are downloaded, something failed silently. Implement: logging after each download attempt.&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="235px"&gt;Job Status Check&lt;/TD&gt;&lt;TD width="316px"&gt;Before downloading, check the job status from Salesforce via the API. If JobComplete is false or any batch is in Failed, Databricks shouldn't proceed with ingestion. Use: API polling in the notebook&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;</description>
    <pubDate>Sun, 12 Oct 2025 21:06:53 GMT</pubDate>
    <dc:creator>ManojkMohan</dc:creator>
    <dc:date>2025-10-12T21:06:53Z</dc:date>
    <item>
      <title>Salesforce Bulk API 2.0 not getting all rows from large table</title>
      <link>https://community.databricks.com/t5/data-engineering/salesforce-bulk-api-2-0-not-getting-all-rows-from-large-table/m-p/124472#M47203</link>
      <description>&lt;P&gt;Has anyone run into an incomplete data extraction issue with the Salesforce Bulk API 2.0, where a very large source object table with more than 260K rows (should be approx. 13M) results in only approx. 250K rows being extracted per attempt?&lt;/P&gt;</description>
      <pubDate>Tue, 08 Jul 2025 14:46:16 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/salesforce-bulk-api-2-0-not-getting-all-rows-from-large-table/m-p/124472#M47203</guid>
      <dc:creator>AlanDanque</dc:creator>
      <dc:date>2025-07-08T14:46:16Z</dc:date>
    </item>
    <item>
      <title>Re: Salesforce Bulk API 2.0 not getting all rows from large table</title>
      <link>https://community.databricks.com/t5/data-engineering/salesforce-bulk-api-2-0-not-getting-all-rows-from-large-table/m-p/134650#M50163</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/174238"&gt;@AlanDanque&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The most likely reason you are seeing fewer records is that your user doesn't have access to all the rows of that table.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Can you confirm that on your end?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 12 Oct 2025 05:30:57 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/salesforce-bulk-api-2-0-not-getting-all-rows-from-large-table/m-p/134650#M50163</guid>
      <dc:creator>Krishna_S</dc:creator>
      <dc:date>2025-10-12T05:30:57Z</dc:date>
    </item>
    <item>
      <title>Re: Salesforce Bulk API 2.0 not getting all rows from large table</title>
      <link>https://community.databricks.com/t5/data-engineering/salesforce-bulk-api-2-0-not-getting-all-rows-from-large-table/m-p/134674#M50168</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/174238"&gt;@AlanDanque&lt;/a&gt;&amp;nbsp;I am working on a similar use case and will share screenshots shortly.&lt;/P&gt;&lt;P&gt;But to get to the root cause, could you share the details below?&lt;/P&gt;&lt;TABLE width="503px"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD width="192.609px"&gt;Checks at Salesforce&lt;/TD&gt;&lt;TD width="309.391px"&gt;Description&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="192.609px"&gt;Header used?&lt;/TD&gt;&lt;TD width="309.391px"&gt;Was Sforce-Enable-PKChunking: chunkSize=250000 explicitly included in the job request header?&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="192.609px"&gt;Header honored?&lt;/TD&gt;&lt;TD width="309.391px"&gt;Do Salesforce logs show a chunked job with multiple batch IDs, or was only one batch returned?&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="192.609px"&gt;Logs?&lt;/TD&gt;&lt;TD width="309.391px"&gt;Does the job show status Completed while the result set is only one file?&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="192.609px"&gt;Object supported?&lt;/TD&gt;&lt;TD width="309.391px"&gt;&lt;A href="https://developer.salesforce.com/docs/atlas.en-us.api_asynch.meta/api_asynch/async_api_headers_enable_pk_chunking.htm" target="_blank"&gt;Not all standard or custom objects support PK chunking; confirm in the Salesforce docs.&lt;/A&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;TABLE width="552px"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD width="235px"&gt;Checks at Databricks&lt;/TD&gt;&lt;TD width="316px"&gt;Description&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="235px"&gt;File Count Check&lt;/TD&gt;&lt;TD width="316px"&gt;Check whether the number of result files (CSV chunks) is greater than 1. If there is only one file, chunking likely didn't happen or the job was not split correctly. Use: dbutils.fs.ls("/mnt/tmp/salesforce_chunks/")&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="235px"&gt;Row Count Validation&lt;/TD&gt;&lt;TD width="316px"&gt;After ingestion, check that the row count in the Delta table is close to the expected ~13M. A record count of ~250K indicates silent truncation. Use: df.count()&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="235px"&gt;Chunk Metadata Logging&lt;/TD&gt;&lt;TD width="316px"&gt;Log the number of records per chunk/file during ingestion. This helps detect dropped or corrupted chunks. Log: filename, record count, chunk ID (if available)&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="235px"&gt;Failed Chunk Detection&lt;/TD&gt;&lt;TD width="316px"&gt;Look for missing or partial chunk downloads. If Salesforce returns 4 result files and only 3 are downloaded, something failed silently. Implement: logging after each download attempt.&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="235px"&gt;Job Status Check&lt;/TD&gt;&lt;TD width="316px"&gt;Before downloading, check the job status from Salesforce via the API. If JobComplete is false or any batch is in Failed, Databricks shouldn't proceed with ingestion. Use: API polling in the notebook&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;</description>
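The Databricks-side checks in the table above (job status gate, missing-chunk detection, row count validation) can be sketched as plain Python. This is a minimal, hypothetical sketch: the function names, chunk IDs, and expected total are illustrative, and the dbutils.fs.ls() / df.count() calls from the table are replaced by pure logic so it runs outside a notebook. The JobComplete / Failed / Aborted values mirror the "state" field returned by a Bulk API 2.0 query job status request.

```python
def should_ingest(job_info):
    """Job Status Check: only proceed once the Bulk API 2.0 query job
    reports state JobComplete; fail loudly on terminal error states."""
    state = job_info.get("state")
    if state == "JobComplete":
        return True
    if state in ("Failed", "Aborted"):
        raise RuntimeError("Bulk job ended in state %s; refusing to ingest partial results" % state)
    return False  # still InProgress or UploadComplete: keep polling


def missing_chunks(announced_ids, downloaded_ids):
    """Failed Chunk Detection: chunk IDs Salesforce announced but that were
    never downloaded indicate a silent download failure."""
    return sorted(set(announced_ids) - set(downloaded_ids))


def validate_chunks(chunk_counts, expected_total, tolerance=0.01):
    """Row Count Validation: sum per-chunk record counts (from chunk
    metadata logging) and flag totals that deviate from the expected
    row count by more than the given fraction."""
    observed = sum(chunk_counts.values())
    deviation = abs(observed - expected_total)
    ok = not deviation > expected_total * tolerance
    return ok, observed


# Example: Salesforce announced 4 result files, only 3 were downloaded,
# and the observed total is far below the ~13M rows expected.
counts = {"chunk-1": 250_000, "chunk-2": 250_000, "chunk-3": 250_000}
print(missing_chunks(["chunk-1", "chunk-2", "chunk-3", "chunk-4"], counts))  # ['chunk-4']
print(validate_chunks(counts, expected_total=13_000_000))  # (False, 750000)
```

A real notebook would call should_ingest() in a polling loop against the job status endpoint before listing and downloading result files; the point here is only that each check from the table reduces to a small, testable predicate.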
      <pubDate>Sun, 12 Oct 2025 21:06:53 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/salesforce-bulk-api-2-0-not-getting-all-rows-from-large-table/m-p/134674#M50168</guid>
      <dc:creator>ManojkMohan</dc:creator>
      <dc:date>2025-10-12T21:06:53Z</dc:date>
    </item>
  </channel>
</rss>

