<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Visualization only from sample of data in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/visulization-only-from-sample-of-data/m-p/7970#M3704</link>
    <description>&lt;P&gt;Hi @Ondrej Lostak,&lt;/P&gt;&lt;P&gt;Hope everything is going great.&lt;/P&gt;&lt;P&gt;Just wanted to check in to see whether you were able to resolve your issue. If so, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so we can help you.&lt;/P&gt;&lt;P&gt;Cheers!&lt;/P&gt;</description>
    <pubDate>Sat, 01 Apr 2023 00:53:13 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2023-04-01T00:53:13Z</dc:date>
    <item>
      <title>Visualization only from sample of data</title>
      <link>https://community.databricks.com/t5/data-engineering/visulization-only-from-sample-of-data/m-p/7968#M3702</link>
      <description>&lt;P&gt;When I display a DataFrame and add a visualization, the preview is built from only a sample of the data, and once I confirm it, it is computed from all of the data. Up to this point, everything is fine. However, when I change the DataFrame, the visualization becomes inconsistent and considers only a sample of the data, so I need to create the visualization again. This makes the visualizations a little unfriendly for me.&lt;/P&gt;&lt;P&gt;Is there a way to set up the visualization so that it is consistent with the source data all the time?&lt;/P&gt;</description>
      <pubDate>Fri, 10 Mar 2023 09:23:09 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/visulization-only-from-sample-of-data/m-p/7968#M3702</guid>
      <dc:creator>Ondrej_Lostak</dc:creator>
      <dc:date>2023-03-10T09:23:09Z</dc:date>
    </item>
    <item>
      <title>Re: Visualization only from sample of data</title>
      <link>https://community.databricks.com/t5/data-engineering/visulization-only-from-sample-of-data/m-p/7969#M3703</link>
      <description>&lt;P&gt;@Ondrej Lostak: I hope I understood your question correctly. Please let me know if I did not after reading the suggestions below.&lt;/P&gt;&lt;P&gt;When you create a visualization for a DataFrame in Databricks, the preview is generated from a sample of the data. Once you confirm the visualization, it is computed from all of the data, so at that point it should be consistent with the source data.&lt;/P&gt;&lt;P&gt;If the visualization becomes inconsistent after you change the DataFrame, one possible reason is that the change affected the distribution or structure of the data, so the visualization needs to be updated accordingly. In that case, you would need to recreate the visualization to make sure it is consistent with the updated DataFrame.&lt;/P&gt;&lt;P&gt;However, if you are making only minor changes to the DataFrame, such as renaming columns or filtering rows, and you want to avoid recreating the visualization every time, you can try calling the cache() method on the DataFrame before creating the visualization. This caches the DataFrame in memory and can improve performance, and it may help keep the visualization consistent with the source data after minor changes. Note that cache() only affects how Spark stores the data, so it is not guaranteed to change how the visualization samples it.&lt;/P&gt;</description>
      <pubDate>Tue, 14 Mar 2023 09:06:36 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/visulization-only-from-sample-of-data/m-p/7969#M3703</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2023-03-14T09:06:36Z</dc:date>
    </item>
    <item>
      <title>Re: Visualization only from sample of data</title>
      <link>https://community.databricks.com/t5/data-engineering/visulization-only-from-sample-of-data/m-p/7970#M3704</link>
      <description>&lt;P&gt;Hi @Ondrej Lostak,&lt;/P&gt;&lt;P&gt;Hope everything is going great.&lt;/P&gt;&lt;P&gt;Just wanted to check in to see whether you were able to resolve your issue. If so, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so we can help you.&lt;/P&gt;&lt;P&gt;Cheers!&lt;/P&gt;</description>
      <pubDate>Sat, 01 Apr 2023 00:53:13 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/visulization-only-from-sample-of-data/m-p/7970#M3704</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2023-04-01T00:53:13Z</dc:date>
    </item>
  </channel>
</rss>

