<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Best practices : Silver Layer to Salesforce in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/best-practices-silver-layer-to-salesforce/m-p/129944#M48651</link>
    <description>&lt;P&gt;I would like the community's view on whether my solution follows best practices.&lt;/P&gt;&lt;P&gt;The problem I am solving: I read match data from a CSV file that was uploaded to a volume, clean and transform it in Databricks, and then upload it in batches to a custom Salesforce object called Match__c. I track success or failure for each upload and optionally save any failed records to a CSV.&lt;/P&gt;&lt;TABLE&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;Step #&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Step Name&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Description&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;1&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Read CSV into DataFrame&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Reads IPL match data from a CSV file using Spark, then converts it to a pandas DataFrame.&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;2&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Drop Auto-Generated Field&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Removes the match_id column, which is not required in Salesforce.&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;3&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Rename Columns to Salesforce API Names&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Renames columns in the DataFrame to match Salesforce custom field API names (e.g., Team1__c).&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;4&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Parse and Format Date&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Converts date strings to the YYYY-MM-DD format and removes rows with invalid dates.&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;5&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Map Team Names to Salesforce IDs&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Replaces team names with their corresponding Salesforce Team__c record IDs (for lookups).&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;6&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Clean Picklist Values&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Formats picklist fields (such as Stage and WonBy) by standardizing case and trimming spaces.&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;7&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Standardize Venue Names&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Maps long venue names to shorter, standardized names and truncates them to 40 characters if needed.&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;8&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Filter Required Fields&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Keeps only the fields needed for Salesforce and removes rows missing required fields.&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;9&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Connect to Salesforce&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Authenticates to Salesforce using credentials from Databricks secrets.&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;10&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Prepare for Upload&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Converts all object-type columns to strings to ensure compatibility with the Salesforce API.&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;11&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Upload in Batches&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Splits data into batches (max 200 records) and inserts each batch using the Salesforce Bulk API.&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;12&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Summarize Upload Results&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Counts and prints the number of successfully and unsuccessfully inserted records.&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;13&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Capture and Save Failed Records&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Collects failed records, displays them in Databricks, and saves them to a CSV for investigation.&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;14&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Display Sample Results&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Shows a few sample API responses to verify the structure and contents of the insert results.&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;</description>
    <pubDate>Wed, 27 Aug 2025 19:32:22 GMT</pubDate>
    <dc:creator>ManojkMohan</dc:creator>
    <dc:date>2025-08-27T19:32:22Z</dc:date>
    <item>
      <title>Best practices : Silver Layer to Salesforce</title>
      <link>https://community.databricks.com/t5/data-engineering/best-practices-silver-layer-to-salesforce/m-p/129944#M48651</link>
      <description>&lt;P&gt;I would like the community's view on whether my solution follows best practices.&lt;/P&gt;&lt;P&gt;The problem I am solving: I read match data from a CSV file that was uploaded to a volume, clean and transform it in Databricks, and then upload it in batches to a custom Salesforce object called Match__c. I track success or failure for each upload and optionally save any failed records to a CSV.&lt;/P&gt;&lt;TABLE&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;Step #&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Step Name&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Description&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;1&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Read CSV into DataFrame&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Reads IPL match data from a CSV file using Spark, then converts it to a pandas DataFrame.&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;2&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Drop Auto-Generated Field&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Removes the match_id column, which is not required in Salesforce.&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;3&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Rename Columns to Salesforce API Names&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Renames columns in the DataFrame to match Salesforce custom field API names (e.g., Team1__c).&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;4&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Parse and Format Date&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Converts date strings to the YYYY-MM-DD format and removes rows with invalid dates.&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;5&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Map Team Names to Salesforce IDs&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Replaces team names with their corresponding Salesforce Team__c record IDs (for lookups).&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;6&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Clean Picklist Values&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Formats picklist fields (such as Stage and WonBy) by standardizing case and trimming spaces.&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;7&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Standardize Venue Names&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Maps long venue names to shorter, standardized names and truncates them to 40 characters if needed.&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;8&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Filter Required Fields&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Keeps only the fields needed for Salesforce and removes rows missing required fields.&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;9&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Connect to Salesforce&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Authenticates to Salesforce using credentials from Databricks secrets.&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;10&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Prepare for Upload&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Converts all object-type columns to strings to ensure compatibility with the Salesforce API.&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;11&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Upload in Batches&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Splits data into batches (max 200 records) and inserts each batch using the Salesforce Bulk API.&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;12&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Summarize Upload Results&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Counts and prints the number of successfully and unsuccessfully inserted records.&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;13&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Capture and Save Failed Records&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Collects failed records, displays them in Databricks, and saves them to a CSV for investigation.&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;14&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Display Sample Results&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;Shows a few sample API responses to verify the structure and contents of the insert results.&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;</description>
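      <content:encoded><![CDATA[
The cleaning, batching, and result-tracking steps in the table (4, 6, 7, 11, 12 and 13) could be sketched roughly as below. This is only a minimal illustration in plain Python, not the notebook from the post. The field names Date__c, Stage__c and Venue__c, the DD-MM-YYYY input format, and the fake_insert stand-in for the real Salesforce call are all assumptions made for the example; the per-record result shape (a "success" flag plus an "errors" list) is likewise assumed.

```python
from datetime import datetime
from typing import Callable, Dict, Iterator, List, Optional, Tuple

BATCH_SIZE = 200     # per-batch ceiling quoted in step 11
VENUE_MAX_LEN = 40   # truncation limit quoted in step 7


def clean_record(row: Dict[str, str], date_format: str = "%d-%m-%Y") -> Optional[Dict[str, str]]:
    """Apply steps 4, 6 and 7 to one row; return None when the date is invalid."""
    try:
        parsed = datetime.strptime(row["Date__c"].strip(), date_format)
    except (KeyError, ValueError, AttributeError):
        return None  # step 4: rows with missing or unparseable dates are dropped
    cleaned = dict(row)
    cleaned["Date__c"] = parsed.strftime("%Y-%m-%d")
    # Step 6: trim and standardize case (field name is hypothetical).
    cleaned["Stage__c"] = row.get("Stage__c", "").strip().title()
    # Step 7: trim, then truncate to the 40-character limit.
    cleaned["Venue__c"] = row.get("Venue__c", "").strip()[:VENUE_MAX_LEN]
    return cleaned


def chunk(records: List[dict], size: int = BATCH_SIZE) -> Iterator[List[dict]]:
    """Yield consecutive slices of at most `size` records (step 11)."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def upload(records: List[dict],
           insert_batch: Callable[[List[dict]], List[dict]]) -> Tuple[int, List[dict]]:
    """Send each batch through insert_batch; count successes and keep failures (steps 12-13)."""
    succeeded, failed = 0, []
    for batch in chunk(records):
        results = insert_batch(batch)  # assumed: one result dict per record, in order
        for record, result in zip(batch, results):
            if result.get("success"):
                succeeded += 1
            else:
                failed.append({**record, "errors": result.get("errors", [])})
    return succeeded, failed


def fake_insert(batch: List[dict]) -> List[dict]:
    """Hypothetical stand-in for the Salesforce call: rejects rows with an empty venue."""
    results = []
    for record in batch:
        ok = bool(record["Venue__c"])
        results.append({"success": ok, "errors": [] if ok else ["Venue required"]})
    return results


# Made-up sample rows: 450 valid, one with a bad date, one with a blank venue.
rows = [{"Date__c": "27-08-2025", "Stage__c": " final ", "Venue__c": "Eden Gardens"}] * 450
rows.append({"Date__c": "not a date", "Stage__c": "League", "Venue__c": "Wankhede"})
rows.append({"Date__c": "28-08-2025", "Stage__c": "league", "Venue__c": "  "})

cleaned = [c for c in (clean_record(r) for r in rows) if c is not None]
batch_sizes = [len(b) for b in chunk(cleaned)]
succeeded, failed = upload(cleaned, fake_insert)
print(f"batches={batch_sizes} succeeded={succeeded} failed={len(failed)}")
```

Passing the insert call in as a function keeps the batching and bookkeeping testable without a Salesforce connection; in a real job the stub would be swapped for whatever client call the pipeline already uses, and the failed list could be written out as the CSV described in step 13.
]]></content:encoded>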
      <pubDate>Wed, 27 Aug 2025 19:32:22 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/best-practices-silver-layer-to-salesforce/m-p/129944#M48651</guid>
      <dc:creator>ManojkMohan</dc:creator>
      <dc:date>2025-08-27T19:32:22Z</dc:date>
    </item>
    <item>
      <title>Re: Best practices : Silver Layer to Salesforce</title>
      <link>https://community.databricks.com/t5/data-engineering/best-practices-silver-layer-to-salesforce/m-p/129969#M48656</link>
      <description>&lt;P&gt;- Skip the pandas conversion.&lt;/P&gt;&lt;P&gt;- Persist the transformed data in a Databricks table, and then write to Salesforce.&lt;/P&gt;</description>
      <pubDate>Thu, 28 Aug 2025 06:36:52 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/best-practices-silver-layer-to-salesforce/m-p/129969#M48656</guid>
      <dc:creator>-werners-</dc:creator>
      <dc:date>2025-08-28T06:36:52Z</dc:date>
    </item>
  </channel>
</rss>

