@Ondrej Lostak: I hope I understood your question correctly. Please let me know after reading the suggestions below if I missed your point.
When you create a visualization for a DataFrame in Databricks, the preview is generated from a sample of the data. Once you confirm the visualization, however, it is computed over the full dataset, so it should be consistent with the source data.
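For reference, here is a minimal sketch of how a visualization is typically driven from a DataFrame with display() in a Databricks notebook (the table and column names here are placeholders, not from your setup):

```python
# Minimal sketch: build a DataFrame and render it with display() in a
# Databricks notebook. "sales", "order_date", and "amount" are assumed names.
from pyspark.sql import functions as F

df = spark.table("sales")  # hypothetical source table

# Aggregate before visualizing; the chart configured on the display()
# output is computed over the full result, not just the preview sample.
daily_totals = (
    df.groupBy("order_date")
      .agg(F.sum("amount").alias("total_amount"))
      .orderBy("order_date")
)

display(daily_totals)  # configure the chart type (line, bar, ...) in the UI
```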
If you are seeing inconsistencies in your visualizations after changing the DataFrame, one possible reason is that your changes affected the distribution or structure of the data, so the visualization needs to be updated accordingly. In that case, you would need to recreate the visualization so it reflects the updated DataFrame.
However, if you are only making minor changes to the DataFrame, such as renaming columns or filtering rows, and you want to avoid recreating the visualization every time, you can try calling cache() on the DataFrame before creating the visualization. This keeps the DataFrame in memory and improves performance, and because the visualization then renders from the cached result, repeated renders stay consistent with that snapshot of the source data. Just keep in mind that if the underlying data changes later, you may need to refresh the cache so the visualization reflects the latest data.
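A rough sketch of that approach (again with assumed table and column names), caching the transformed DataFrame before rendering it:

```python
# Sketch: cache the DataFrame before visualizing it so repeated renders
# read from the in-memory copy instead of recomputing the full query.
from pyspark.sql import functions as F

df = spark.table("sales")  # hypothetical source table

filtered = (
    df.filter(F.col("amount") > 0)                # minor change: filter rows
      .withColumnRenamed("amount", "net_amount")  # minor change: rename a column
      .cache()                                    # keep the result in memory
)

filtered.count()   # materialize the cache (optional, but makes timing predictable)
display(filtered)  # the chart now renders from the cached data

# If the source data changes later, drop and rebuild the cache:
# filtered.unpersist()
```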