<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic optimizing my databricks code in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/optimizing-my-databricks-code/m-p/149743#M53169</link>
    <description>&lt;P&gt;I have the following code in databricks under serverless and i want to know how to improve it to make it more efficient and run faster without having the results change (standard errors change slightly when i try to improve it):&amp;nbsp;&lt;/P&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;# Databricks Serverless - Taylor Row Percent&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;# ============================================================&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;# Runs on Databricks Serverless (pandas on Spark / pandas UDF pattern).&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;# The core estimation logic is kept in pandas because Taylor&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;# linearisation and survey-variance math has no native Spark equivalent.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;#&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;# Usage&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;# -----&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;# Input &amp;nbsp;: pass a pandas DataFrame directly, OR a Spark DataFrame&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;# &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;(it will be converted automatically via .toPandas()).&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;# Output : pandas DataFrame (call display() or wrap in spark.createDataFrame()&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;# &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;if you need a Spark / Delta result).&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;# ============================================================&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;from&lt;/SPAN&gt; &lt;SPAN&gt;__future__&lt;/SPAN&gt; &lt;SPAN&gt;import&lt;/SPAN&gt;&lt;SPAN&gt; annotations&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;import&lt;/SPAN&gt;&lt;SPAN&gt; numpy &lt;/SPAN&gt;&lt;SPAN&gt;as&lt;/SPAN&gt;&lt;SPAN&gt; np&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;import&lt;/SPAN&gt;&lt;SPAN&gt; pandas &lt;/SPAN&gt;&lt;SPAN&gt;as&lt;/SPAN&gt;&lt;SPAN&gt; pd&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;from&lt;/SPAN&gt;&lt;SPAN&gt; itertools &lt;/SPAN&gt;&lt;SPAN&gt;import&lt;/SPAN&gt;&lt;SPAN&gt; product&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;from&lt;/SPAN&gt;&lt;SPAN&gt; typing &lt;/SPAN&gt;&lt;SPAN&gt;import&lt;/SPAN&gt;&lt;SPAN&gt; Dict, List, Optional, Union&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;# Databricks / PySpark imports (available on every Serverless cluster)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;from&lt;/SPAN&gt;&lt;SPAN&gt; pyspark.sql &lt;/SPAN&gt;&lt;SPAN&gt;import&lt;/SPAN&gt;&lt;SPAN&gt; DataFrame &lt;/SPAN&gt;&lt;SPAN&gt;as&lt;/SPAN&gt;&lt;SPAN&gt; SparkDataFrame&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;# ------------------------------------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;# Helper: accept either a Spark or pandas DataFrame&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;# ------------------------------------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;def&lt;/SPAN&gt; &lt;SPAN&gt;_to_pandas&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;df&lt;/SPAN&gt;&lt;SPAN&gt;: Union[pd.DataFrame, SparkDataFrame]) -&amp;gt; pd.DataFrame:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"""&lt;/SPAN&gt;&lt;SPAN&gt;Convert Spark DataFrame → pandas if needed.&lt;/SPAN&gt;&lt;SPAN&gt;"""&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt; &lt;SPAN&gt;isinstance&lt;/SPAN&gt;&lt;SPAN&gt;(df, SparkDataFrame):&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;return&lt;/SPAN&gt;&lt;SPAN&gt; df.&lt;/SPAN&gt;&lt;SPAN&gt;toPandas&lt;/SPAN&gt;&lt;SPAN&gt;()&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;return&lt;/SPAN&gt;&lt;SPAN&gt; df.&lt;/SPAN&gt;&lt;SPAN&gt;copy&lt;/SPAN&gt;&lt;SPAN&gt;()&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;# ------------------------------------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;# Main function&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;# ------------------------------------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;def&lt;/SPAN&gt; &lt;SPAN&gt;taylor_row_percent&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;df&lt;/SPAN&gt;&lt;SPAN&gt;: Union[pd.DataFrame, SparkDataFrame],&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;row_domains&lt;/SPAN&gt;&lt;SPAN&gt;: List[&lt;/SPAN&gt;&lt;SPAN&gt;str&lt;/SPAN&gt;&lt;SPAN&gt;],&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;measure&lt;/SPAN&gt;&lt;SPAN&gt;: &lt;/SPAN&gt;&lt;SPAN&gt;str&lt;/SPAN&gt;&lt;SPAN&gt;,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;exclusions&lt;/SPAN&gt;&lt;SPAN&gt;: Dict[&lt;/SPAN&gt;&lt;SPAN&gt;str&lt;/SPAN&gt;&lt;SPAN&gt;, List],&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;weight_col&lt;/SPAN&gt;&lt;SPAN&gt;: Optional[&lt;/SPAN&gt;&lt;SPAN&gt;str&lt;/SPAN&gt;&lt;SPAN&gt;] &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;None&lt;/SPAN&gt;&lt;SPAN&gt;,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;strata_col&lt;/SPAN&gt;&lt;SPAN&gt;: Optional[&lt;/SPAN&gt;&lt;SPAN&gt;str&lt;/SPAN&gt;&lt;SPAN&gt;] &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;None&lt;/SPAN&gt;&lt;SPAN&gt;,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;fpc&lt;/SPAN&gt;&lt;SPAN&gt;: Optional[Union[Dict[&lt;/SPAN&gt;&lt;SPAN&gt;object&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;float&lt;/SPAN&gt;&lt;SPAN&gt;], &lt;/SPAN&gt;&lt;SPAN&gt;str&lt;/SPAN&gt;&lt;SPAN&gt;]] &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;None&lt;/SPAN&gt;&lt;SPAN&gt;,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;p_01_adjust&lt;/SPAN&gt;&lt;SPAN&gt;: Optional[&lt;/SPAN&gt;&lt;SPAN&gt;float&lt;/SPAN&gt;&lt;SPAN&gt;] &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;1.0&lt;/SPAN&gt;&lt;SPAN&gt;,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;zcritical&lt;/SPAN&gt;&lt;SPAN&gt;: &lt;/SPAN&gt;&lt;SPAN&gt;float&lt;/SPAN&gt; &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;1.96&lt;/SPAN&gt;&lt;SPAN&gt;,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;output_spark&lt;/SPAN&gt;&lt;SPAN&gt;: &lt;/SPAN&gt;&lt;SPAN&gt;bool&lt;/SPAN&gt; &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;False&lt;/SPAN&gt;&lt;SPAN&gt;, &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;# set True to get a Spark DataFrame back&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;output_delta_table&lt;/SPAN&gt;&lt;SPAN&gt;: Optional[&lt;/SPAN&gt;&lt;SPAN&gt;str&lt;/SPAN&gt;&lt;SPAN&gt;] &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;None&lt;/SPAN&gt;&lt;SPAN&gt;, &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;# e.g. "catalog.schema.table"&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;) -&amp;gt; Union[pd.DataFrame, SparkDataFrame]:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"""&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; Taylor-linearisation row-percent estimator.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; Parameters&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; ----------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; df : pandas or Spark DataFrame&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; row_domains : columns that define the row grouping&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; measure : column whose value distribution is estimated&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; exclusions : {col: [values_to_exclude]} &amp;nbsp;– NaN supported&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; weight_col : survey weight column (None → unweighted)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; strata_col : stratification column (None → single stratum)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; fpc : finite-population correction – dict {stratum: N} or column name&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; p_01_adjust : Agresti-Coull scaling factor for p=0/1 cells&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; zcritical : z-value for confidence interval (default 1.96)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; output_spark : if True, return a Spark DataFrame instead of pandas&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; output_delta_table : if set, write result to this Delta table and return it&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"""&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# ------------------------------------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# 0. Normalise input&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# ------------------------------------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; df &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;_to_pandas&lt;/SPAN&gt;&lt;SPAN&gt;(df)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# ------------------------------------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# 1. Weight handling&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# ------------------------------------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt;&lt;SPAN&gt; weight_col &lt;/SPAN&gt;&lt;SPAN&gt;is&lt;/SPAN&gt; &lt;SPAN&gt;None&lt;/SPAN&gt;&lt;SPAN&gt;:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; df[&lt;/SPAN&gt;&lt;SPAN&gt;"_w"&lt;/SPAN&gt;&lt;SPAN&gt;] &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;1.0&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; weight_col &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;"_w"&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# ------------------------------------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# 2. Strata handling&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# ------------------------------------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt;&lt;SPAN&gt; strata_col &lt;/SPAN&gt;&lt;SPAN&gt;is&lt;/SPAN&gt; &lt;SPAN&gt;None&lt;/SPAN&gt;&lt;SPAN&gt;:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; df[&lt;/SPAN&gt;&lt;SPAN&gt;"_strata"&lt;/SPAN&gt;&lt;SPAN&gt;] &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;"all"&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; strata_col &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;"_strata"&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; fpc &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;None&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# ------------------------------------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# 3. FPC handling&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# ------------------------------------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt;&lt;SPAN&gt; fpc &lt;/SPAN&gt;&lt;SPAN&gt;is&lt;/SPAN&gt; &lt;SPAN&gt;None&lt;/SPAN&gt;&lt;SPAN&gt;:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; df[&lt;/SPAN&gt;&lt;SPAN&gt;"_N_h"&lt;/SPAN&gt;&lt;SPAN&gt;] &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; np.inf&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; fpc_col &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;"_N_h"&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;elif&lt;/SPAN&gt; &lt;SPAN&gt;isinstance&lt;/SPAN&gt;&lt;SPAN&gt;(fpc, &lt;/SPAN&gt;&lt;SPAN&gt;dict&lt;/SPAN&gt;&lt;SPAN&gt;&lt;span class="lia-unicode-emoji" title=":disappointed_face:"&gt;😞&lt;/span&gt;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; df[&lt;/SPAN&gt;&lt;SPAN&gt;"_N_h"&lt;/SPAN&gt;&lt;SPAN&gt;] &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; df[strata_col].&lt;/SPAN&gt;&lt;SPAN&gt;map&lt;/SPAN&gt;&lt;SPAN&gt;(fpc)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; fpc_col &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;"_N_h"&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;else&lt;/SPAN&gt;&lt;SPAN&gt;:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; fpc_col &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; fpc&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# ------------------------------------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# 4. Scope flag &amp;nbsp;(mirrors SAS exclusion logic)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# ------------------------------------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; df[&lt;/SPAN&gt;&lt;SPAN&gt;"scope_fg"&lt;/SPAN&gt;&lt;SPAN&gt;] &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;1&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;for&lt;/SPAN&gt;&lt;SPAN&gt; var, excl_vals &lt;/SPAN&gt;&lt;SPAN&gt;in&lt;/SPAN&gt;&lt;SPAN&gt; exclusions.&lt;/SPAN&gt;&lt;SPAN&gt;items&lt;/SPAN&gt;&lt;SPAN&gt;():&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;for&lt;/SPAN&gt;&lt;SPAN&gt; val &lt;/SPAN&gt;&lt;SPAN&gt;in&lt;/SPAN&gt;&lt;SPAN&gt; excl_vals:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt;&lt;SPAN&gt; pd.&lt;/SPAN&gt;&lt;SPAN&gt;isna&lt;/SPAN&gt;&lt;SPAN&gt;(val):&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; df.loc[df[var].&lt;/SPAN&gt;&lt;SPAN&gt;isna&lt;/SPAN&gt;&lt;SPAN&gt;(), &lt;/SPAN&gt;&lt;SPAN&gt;"scope_fg"&lt;/SPAN&gt;&lt;SPAN&gt;] &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;0&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;else&lt;/SPAN&gt;&lt;SPAN&gt;:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; df.loc[df[var] &lt;/SPAN&gt;&lt;SPAN&gt;==&lt;/SPAN&gt;&lt;SPAN&gt; val, &lt;/SPAN&gt;&lt;SPAN&gt;"scope_fg"&lt;/SPAN&gt;&lt;SPAN&gt;] &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;0&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# ------------------------------------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# 5. Domain &amp;amp; measure levels &amp;nbsp;(in-scope records only)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# ------------------------------------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; domain_levels &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; {&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; c: df.loc[df[&lt;/SPAN&gt;&lt;SPAN&gt;"scope_fg"&lt;/SPAN&gt;&lt;SPAN&gt;] &lt;/SPAN&gt;&lt;SPAN&gt;==&lt;/SPAN&gt; &lt;SPAN&gt;1&lt;/SPAN&gt;&lt;SPAN&gt;, c].&lt;/SPAN&gt;&lt;SPAN&gt;dropna&lt;/SPAN&gt;&lt;SPAN&gt;().&lt;/SPAN&gt;&lt;SPAN&gt;unique&lt;/SPAN&gt;&lt;SPAN&gt;()&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;for&lt;/SPAN&gt;&lt;SPAN&gt; c &lt;/SPAN&gt;&lt;SPAN&gt;in&lt;/SPAN&gt;&lt;SPAN&gt; row_domains&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; }&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; measure_levels &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; df.loc[df[&lt;/SPAN&gt;&lt;SPAN&gt;"scope_fg"&lt;/SPAN&gt;&lt;SPAN&gt;] &lt;/SPAN&gt;&lt;SPAN&gt;==&lt;/SPAN&gt; &lt;SPAN&gt;1&lt;/SPAN&gt;&lt;SPAN&gt;, measure].&lt;/SPAN&gt;&lt;SPAN&gt;dropna&lt;/SPAN&gt;&lt;SPAN&gt;().&lt;/SPAN&gt;&lt;SPAN&gt;unique&lt;/SPAN&gt;&lt;SPAN&gt;()&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; all_rows &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;list&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;product&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt;&lt;SPAN&gt;domain_levels.&lt;/SPAN&gt;&lt;SPAN&gt;values&lt;/SPAN&gt;&lt;SPAN&gt;(), measure_levels))&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; results &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; []&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# ------------------------------------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# 6. Main estimation loop&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# ------------------------------------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;for&lt;/SPAN&gt;&lt;SPAN&gt; combo &lt;/SPAN&gt;&lt;SPAN&gt;in&lt;/SPAN&gt;&lt;SPAN&gt; all_rows:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; dom_vals &amp;nbsp; &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; combo[:&lt;/SPAN&gt;&lt;SPAN&gt;-&lt;/SPAN&gt;&lt;SPAN&gt;1&lt;/SPAN&gt;&lt;SPAN&gt;]&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; measure_val &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; combo[&lt;/SPAN&gt;&lt;SPAN&gt;-&lt;/SPAN&gt;&lt;SPAN&gt;1&lt;/SPAN&gt;&lt;SPAN&gt;]&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# Domain mask (scope == 1 only)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; mask_row &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; df[&lt;/SPAN&gt;&lt;SPAN&gt;"scope_fg"&lt;/SPAN&gt;&lt;SPAN&gt;] &lt;/SPAN&gt;&lt;SPAN&gt;==&lt;/SPAN&gt; &lt;SPAN&gt;1&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;for&lt;/SPAN&gt;&lt;SPAN&gt; c, v &lt;/SPAN&gt;&lt;SPAN&gt;in&lt;/SPAN&gt; &lt;SPAN&gt;zip&lt;/SPAN&gt;&lt;SPAN&gt;(row_domains, dom_vals):&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; mask_row &lt;/SPAN&gt;&lt;SPAN&gt;&amp;amp;=&lt;/SPAN&gt;&lt;SPAN&gt; (df[c] &lt;/SPAN&gt;&lt;SPAN&gt;==&lt;/SPAN&gt;&lt;SPAN&gt; v)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; mask_cell &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; mask_row &lt;/SPAN&gt;&lt;SPAN&gt;&amp;amp;&lt;/SPAN&gt;&lt;SPAN&gt; (df[measure] &lt;/SPAN&gt;&lt;SPAN&gt;==&lt;/SPAN&gt;&lt;SPAN&gt; measure_val)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; n_row &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;int&lt;/SPAN&gt;&lt;SPAN&gt;(mask_row.&lt;/SPAN&gt;&lt;SPAN&gt;sum&lt;/SPAN&gt;&lt;SPAN&gt;())&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; n_cell &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;int&lt;/SPAN&gt;&lt;SPAN&gt;(mask_cell.&lt;/SPAN&gt;&lt;SPAN&gt;sum&lt;/SPAN&gt;&lt;SPAN&gt;())&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# ---- domain empty ----------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt;&lt;SPAN&gt; n_row &lt;/SPAN&gt;&lt;SPAN&gt;==&lt;/SPAN&gt; &lt;SPAN&gt;0&lt;/SPAN&gt;&lt;SPAN&gt;:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; p_hat &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; np.nan&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; se_p &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; np.nan&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;else&lt;/SPAN&gt;&lt;SPAN&gt;:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; W_row &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; df.loc[mask_row, &amp;nbsp;weight_col].&lt;/SPAN&gt;&lt;SPAN&gt;sum&lt;/SPAN&gt;&lt;SPAN&gt;()&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; W_cell &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; df.loc[mask_cell, weight_col].&lt;/SPAN&gt;&lt;SPAN&gt;sum&lt;/SPAN&gt;&lt;SPAN&gt;()&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; p_hat &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; W_cell &lt;/SPAN&gt;&lt;SPAN&gt;/&lt;/SPAN&gt;&lt;SPAN&gt; W_row &lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt;&lt;SPAN&gt; W_row &lt;/SPAN&gt;&lt;SPAN&gt;&amp;gt;&lt;/SPAN&gt; &lt;SPAN&gt;0&lt;/SPAN&gt; &lt;SPAN&gt;else&lt;/SPAN&gt;&lt;SPAN&gt; np.nan&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# Taylor linearisation&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; df2 &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; df.&lt;/SPAN&gt;&lt;SPAN&gt;copy&lt;/SPAN&gt;&lt;SPAN&gt;()&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; df2[&lt;/SPAN&gt;&lt;SPAN&gt;"R"&lt;/SPAN&gt;&lt;SPAN&gt;] &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; mask_row.&lt;/SPAN&gt;&lt;SPAN&gt;astype&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;float&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; df2[&lt;/SPAN&gt;&lt;SPAN&gt;"C"&lt;/SPAN&gt;&lt;SPAN&gt;] &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; mask_cell.&lt;/SPAN&gt;&lt;SPAN&gt;astype&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;float&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; df2[&lt;/SPAN&gt;&lt;SPAN&gt;"u"&lt;/SPAN&gt;&lt;SPAN&gt;] &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; (df2[weight_col] &lt;/SPAN&gt;&lt;SPAN&gt;/&lt;/SPAN&gt;&lt;SPAN&gt; W_row) &lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt;&lt;SPAN&gt; (&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; df2[&lt;/SPAN&gt;&lt;SPAN&gt;"C"&lt;/SPAN&gt;&lt;SPAN&gt;] &lt;/SPAN&gt;&lt;SPAN&gt;-&lt;/SPAN&gt;&lt;SPAN&gt; p_hat &lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt;&lt;SPAN&gt; df2[&lt;/SPAN&gt;&lt;SPAN&gt;"R"&lt;/SPAN&gt;&lt;SPAN&gt;]&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; )&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; var_p &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;0.0&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;for&lt;/SPAN&gt;&lt;SPAN&gt; h, g &lt;/SPAN&gt;&lt;SPAN&gt;in&lt;/SPAN&gt;&lt;SPAN&gt; df2.&lt;/SPAN&gt;&lt;SPAN&gt;groupby&lt;/SPAN&gt;&lt;SPAN&gt;(strata_col):&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; n_h &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;len&lt;/SPAN&gt;&lt;SPAN&gt;(g)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt;&lt;SPAN&gt; n_h &lt;/SPAN&gt;&lt;SPAN&gt;&amp;lt;=&lt;/SPAN&gt; &lt;SPAN&gt;1&lt;/SPAN&gt;&lt;SPAN&gt;:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;continue&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; N_h &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; g[fpc_col].iloc[&lt;/SPAN&gt;&lt;SPAN&gt;0&lt;/SPAN&gt;&lt;SPAN&gt;]&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; f_h &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; n_h &lt;/SPAN&gt;&lt;SPAN&gt;/&lt;/SPAN&gt;&lt;SPAN&gt; N_h &lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt;&lt;SPAN&gt; np.&lt;/SPAN&gt;&lt;SPAN&gt;isfinite&lt;/SPAN&gt;&lt;SPAN&gt;(N_h) &lt;/SPAN&gt;&lt;SPAN&gt;else&lt;/SPAN&gt; &lt;SPAN&gt;0.0&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; S2 &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; g[&lt;/SPAN&gt;&lt;SPAN&gt;"u"&lt;/SPAN&gt;&lt;SPAN&gt;].&lt;/SPAN&gt;&lt;SPAN&gt;var&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;ddof&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt;1&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; var_p &lt;/SPAN&gt;&lt;SPAN&gt;+=&lt;/SPAN&gt;&lt;SPAN&gt; (&lt;/SPAN&gt;&lt;SPAN&gt;1&lt;/SPAN&gt; &lt;SPAN&gt;-&lt;/SPAN&gt;&lt;SPAN&gt; f_h) &lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt;&lt;SPAN&gt; n_h &lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt;&lt;SPAN&gt; S2&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; se_p &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; np.&lt;/SPAN&gt;&lt;SPAN&gt;sqrt&lt;/SPAN&gt;&lt;SPAN&gt;(var_p) &lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt;&lt;SPAN&gt; var_p &lt;/SPAN&gt;&lt;SPAN&gt;&amp;gt;&lt;/SPAN&gt; &lt;SPAN&gt;0&lt;/SPAN&gt; &lt;SPAN&gt;else&lt;/SPAN&gt;&lt;SPAN&gt; np.nan&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# ---- Agresti-Coull &amp;nbsp;(SAS exact logic) --------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; adj_p &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; p_hat&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; adj_se &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; se_p&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt;&lt;SPAN&gt; (&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; p_01_adjust &lt;/SPAN&gt;&lt;SPAN&gt;is&lt;/SPAN&gt; &lt;SPAN&gt;not&lt;/SPAN&gt; &lt;SPAN&gt;None&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;and&lt;/SPAN&gt; &lt;SPAN&gt;not&lt;/SPAN&gt;&lt;SPAN&gt; np.&lt;/SPAN&gt;&lt;SPAN&gt;isnan&lt;/SPAN&gt;&lt;SPAN&gt;(p_hat)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;and&lt;/SPAN&gt;&lt;SPAN&gt; p_hat &lt;/SPAN&gt;&lt;SPAN&gt;in&lt;/SPAN&gt;&lt;SPAN&gt; (&lt;/SPAN&gt;&lt;SPAN&gt;0.0&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;1.0&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;and&lt;/SPAN&gt;&lt;SPAN&gt; n_row &lt;/SPAN&gt;&lt;SPAN&gt;&amp;gt;&lt;/SPAN&gt; &lt;SPAN&gt;0&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;span class="lia-unicode-emoji" title=":disappointed_face:"&gt;😞&lt;/span&gt;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; adj_p &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; (n_cell &lt;/SPAN&gt;&lt;SPAN&gt;+&lt;/SPAN&gt; &lt;SPAN&gt;2&lt;/SPAN&gt;&lt;SPAN&gt;) &lt;/SPAN&gt;&lt;SPAN&gt;/&lt;/SPAN&gt;&lt;SPAN&gt; (n_row &lt;/SPAN&gt;&lt;SPAN&gt;+&lt;/SPAN&gt; &lt;SPAN&gt;4&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; adj_se &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; np.&lt;/SPAN&gt;&lt;SPAN&gt;sqrt&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; adj_p &lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt;&lt;SPAN&gt; (&lt;/SPAN&gt;&lt;SPAN&gt;1&lt;/SPAN&gt; &lt;SPAN&gt;-&lt;/SPAN&gt;&lt;SPAN&gt; adj_p) &lt;/SPAN&gt;&lt;SPAN&gt;/&lt;/SPAN&gt;&lt;SPAN&gt; (n_row &lt;/SPAN&gt;&lt;SPAN&gt;+&lt;/SPAN&gt; &lt;SPAN&gt;4&lt;/SPAN&gt;&lt;SPAN&gt;) &lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt;&lt;SPAN&gt; p_01_adjust&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; )&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# ---- Final statistics ------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; rowpercent &amp;nbsp; &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; p_hat &lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt; &lt;SPAN&gt;100&lt;/SPAN&gt; &lt;SPAN&gt;if&lt;/SPAN&gt; &lt;SPAN&gt;not&lt;/SPAN&gt;&lt;SPAN&gt; np.&lt;/SPAN&gt;&lt;SPAN&gt;isnan&lt;/SPAN&gt;&lt;SPAN&gt;(p_hat) &lt;/SPAN&gt;&lt;SPAN&gt;else&lt;/SPAN&gt;&lt;SPAN&gt; np.nan&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; rowstderr &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; se_p &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt; &lt;SPAN&gt;100&lt;/SPAN&gt; &lt;SPAN&gt;if&lt;/SPAN&gt; &lt;SPAN&gt;not&lt;/SPAN&gt;&lt;SPAN&gt; np.&lt;/SPAN&gt;&lt;SPAN&gt;isnan&lt;/SPAN&gt;&lt;SPAN&gt;(se_p) &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;else&lt;/SPAN&gt;&lt;SPAN&gt; np.nan&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; adj_rowpercent &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; adj_p &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt; &lt;SPAN&gt;100&lt;/SPAN&gt; &lt;SPAN&gt;if&lt;/SPAN&gt; &lt;SPAN&gt;not&lt;/SPAN&gt;&lt;SPAN&gt; np.&lt;/SPAN&gt;&lt;SPAN&gt;isnan&lt;/SPAN&gt;&lt;SPAN&gt;(adj_p) &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;else&lt;/SPAN&gt;&lt;SPAN&gt; np.nan&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; adj_rowstderr &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; adj_se &lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt; &lt;SPAN&gt;100&lt;/SPAN&gt; &lt;SPAN&gt;if&lt;/SPAN&gt; &lt;SPAN&gt;not&lt;/SPAN&gt;&lt;SPAN&gt; np.&lt;/SPAN&gt;&lt;SPAN&gt;isnan&lt;/SPAN&gt;&lt;SPAN&gt;(adj_se) &lt;/SPAN&gt;&lt;SPAN&gt;else&lt;/SPAN&gt;&lt;SPAN&gt; np.nan&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; moe &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; zcritical &lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt;&lt;SPAN&gt; adj_rowstderr &lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt; &lt;SPAN&gt;not&lt;/SPAN&gt;&lt;SPAN&gt; np.&lt;/SPAN&gt;&lt;SPAN&gt;isnan&lt;/SPAN&gt;&lt;SPAN&gt;(adj_rowstderr) &lt;/SPAN&gt;&lt;SPAN&gt;else&lt;/SPAN&gt;&lt;SPAN&gt; np.nan&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; lower &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; rowpercent &lt;/SPAN&gt;&lt;SPAN&gt;-&lt;/SPAN&gt;&lt;SPAN&gt; moe &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt; &lt;SPAN&gt;not&lt;/SPAN&gt;&lt;SPAN&gt; np.&lt;/SPAN&gt;&lt;SPAN&gt;isnan&lt;/SPAN&gt;&lt;SPAN&gt;(moe) &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;else&lt;/SPAN&gt;&lt;SPAN&gt; np.nan&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; upper &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; rowpercent &lt;/SPAN&gt;&lt;SPAN&gt;+&lt;/SPAN&gt;&lt;SPAN&gt; moe &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt; &lt;SPAN&gt;not&lt;/SPAN&gt;&lt;SPAN&gt; np.&lt;/SPAN&gt;&lt;SPAN&gt;isnan&lt;/SPAN&gt;&lt;SPAN&gt;(moe) &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;else&lt;/SPAN&gt;&lt;SPAN&gt; np.nan&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; rse &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; (&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; (adj_rowstderr &lt;/SPAN&gt;&lt;SPAN&gt;/&lt;/SPAN&gt;&lt;SPAN&gt; adj_rowpercent) &lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt; &lt;SPAN&gt;100&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt;&lt;SPAN&gt; adj_rowpercent &lt;/SPAN&gt;&lt;SPAN&gt;not&lt;/SPAN&gt; &lt;SPAN&gt;in&lt;/SPAN&gt;&lt;SPAN&gt; (&lt;/SPAN&gt;&lt;SPAN&gt;0&lt;/SPAN&gt;&lt;SPAN&gt;, np.nan) &lt;/SPAN&gt;&lt;SPAN&gt;and&lt;/SPAN&gt; &lt;SPAN&gt;not&lt;/SPAN&gt;&lt;SPAN&gt; np.&lt;/SPAN&gt;&lt;SPAN&gt;isnan&lt;/SPAN&gt;&lt;SPAN&gt;(adj_rowpercent)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;else&lt;/SPAN&gt;&lt;SPAN&gt; np.nan&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; )&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# ---- Formatting &amp;nbsp;(SAS style) -----------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; percent_new &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;f&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN&gt;{&lt;/SPAN&gt;&lt;SPAN&gt;rowpercent&lt;/SPAN&gt;&lt;SPAN&gt;:4.1f}&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt; &lt;SPAN&gt;if&lt;/SPAN&gt; &lt;SPAN&gt;not&lt;/SPAN&gt;&lt;SPAN&gt; np.&lt;/SPAN&gt;&lt;SPAN&gt;isnan&lt;/SPAN&gt;&lt;SPAN&gt;(rowpercent) &lt;/SPAN&gt;&lt;SPAN&gt;else&lt;/SPAN&gt; &lt;SPAN&gt;""&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; moe_new &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;f&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN&gt;{&lt;/SPAN&gt;&lt;SPAN&gt;moe&lt;/SPAN&gt;&lt;SPAN&gt;:4.1f}&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt; &lt;SPAN&gt;not&lt;/SPAN&gt;&lt;SPAN&gt; np.&lt;/SPAN&gt;&lt;SPAN&gt;isnan&lt;/SPAN&gt;&lt;SPAN&gt;(moe) &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;else&lt;/SPAN&gt; &lt;SPAN&gt;""&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; lower_new &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;f&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN&gt;{&lt;/SPAN&gt;&lt;SPAN&gt;max&lt;/SPAN&gt;&lt;SPAN&gt;(lower, &lt;/SPAN&gt;&lt;SPAN&gt;0&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;SPAN&gt;:4.1f}&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN&gt; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt; &lt;SPAN&gt;not&lt;/SPAN&gt;&lt;SPAN&gt; np.&lt;/SPAN&gt;&lt;SPAN&gt;isnan&lt;/SPAN&gt;&lt;SPAN&gt;(lower) &lt;/SPAN&gt;&lt;SPAN&gt;else&lt;/SPAN&gt; &lt;SPAN&gt;""&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; upper_new &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;f&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN&gt;{&lt;/SPAN&gt;&lt;SPAN&gt;min&lt;/SPAN&gt;&lt;SPAN&gt;(upper, &lt;/SPAN&gt;&lt;SPAN&gt;100&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;SPAN&gt;:4.1f}&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt; &lt;SPAN&gt;if&lt;/SPAN&gt; &lt;SPAN&gt;not&lt;/SPAN&gt;&lt;SPAN&gt; np.&lt;/SPAN&gt;&lt;SPAN&gt;isnan&lt;/SPAN&gt;&lt;SPAN&gt;(upper) &lt;/SPAN&gt;&lt;SPAN&gt;else&lt;/SPAN&gt; &lt;SPAN&gt;""&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; rse_new &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;f&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN&gt;{&lt;/SPAN&gt;&lt;SPAN&gt;rse&lt;/SPAN&gt;&lt;SPAN&gt;:4.1f}&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt; &lt;SPAN&gt;not&lt;/SPAN&gt;&lt;SPAN&gt; np.&lt;/SPAN&gt;&lt;SPAN&gt;isnan&lt;/SPAN&gt;&lt;SPAN&gt;(rse) &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;else&lt;/SPAN&gt; &lt;SPAN&gt;""&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# Star rule&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt; &lt;SPAN&gt;not&lt;/SPAN&gt;&lt;SPAN&gt; np.&lt;/SPAN&gt;&lt;SPAN&gt;isnan&lt;/SPAN&gt;&lt;SPAN&gt;(moe) &lt;/SPAN&gt;&lt;SPAN&gt;and&lt;/SPAN&gt;&lt;SPAN&gt; moe &lt;/SPAN&gt;&lt;SPAN&gt;&amp;gt;=&lt;/SPAN&gt; &lt;SPAN&gt;10&lt;/SPAN&gt;&lt;SPAN&gt;:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; percent_new &lt;/SPAN&gt;&lt;SPAN&gt;+=&lt;/SPAN&gt; &lt;SPAN&gt;"*"&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# ---- Suppression &amp;nbsp;(SOS rules) ----------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt;&lt;SPAN&gt; n_row &lt;/SPAN&gt;&lt;SPAN&gt;==&lt;/SPAN&gt; &lt;SPAN&gt;0&lt;/SPAN&gt;&lt;SPAN&gt;:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; percent_new &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; moe_new &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; lower_new &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; upper_new &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; rse_new &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;"na"&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;elif&lt;/SPAN&gt;&lt;SPAN&gt; n_row &lt;/SPAN&gt;&lt;SPAN&gt;in&lt;/SPAN&gt; &lt;SPAN&gt;range&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;1&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;6&lt;/SPAN&gt;&lt;SPAN&gt;&lt;span class="lia-unicode-emoji" title=":disappointed_face:"&gt;😞&lt;/span&gt;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; percent_new &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; moe_new &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; lower_new &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; upper_new &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; rse_new &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;"np"&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# ---- Store row -------------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; row &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;dict&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;zip&lt;/SPAN&gt;&lt;SPAN&gt;(row_domains, dom_vals))&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; row[measure] &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; measure_val&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; row.&lt;/SPAN&gt;&lt;SPAN&gt;update&lt;/SPAN&gt;&lt;SPAN&gt;({&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"domain_frequency"&lt;/SPAN&gt;&lt;SPAN&gt;: n_row,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"Frequency"&lt;/SPAN&gt;&lt;SPAN&gt;: &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;n_cell,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"RowPercent"&lt;/SPAN&gt;&lt;SPAN&gt;: &amp;nbsp; &amp;nbsp; &amp;nbsp; rowpercent,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"RowStdErr"&lt;/SPAN&gt;&lt;SPAN&gt;: &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;rowstderr,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"adj_rowpercent"&lt;/SPAN&gt;&lt;SPAN&gt;: &amp;nbsp; adj_rowpercent,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"adj_rowstderr"&lt;/SPAN&gt;&lt;SPAN&gt;: &amp;nbsp; &amp;nbsp;adj_rowstderr,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"moe"&lt;/SPAN&gt;&lt;SPAN&gt;: &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;moe,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"lower"&lt;/SPAN&gt;&lt;SPAN&gt;: &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;lower,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"upper"&lt;/SPAN&gt;&lt;SPAN&gt;: &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;upper,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"rse"&lt;/SPAN&gt;&lt;SPAN&gt;: &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;rse,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"percent_new"&lt;/SPAN&gt;&lt;SPAN&gt;: &amp;nbsp; &amp;nbsp; &amp;nbsp;percent_new,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"moe_new"&lt;/SPAN&gt;&lt;SPAN&gt;: &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;moe_new,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"lower_new"&lt;/SPAN&gt;&lt;SPAN&gt;: &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;lower_new,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"upper_new"&lt;/SPAN&gt;&lt;SPAN&gt;: &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;upper_new,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"rse_new"&lt;/SPAN&gt;&lt;SPAN&gt;: &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;rse_new,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; })&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; results.&lt;/SPAN&gt;&lt;SPAN&gt;append&lt;/SPAN&gt;&lt;SPAN&gt;(row)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# ------------------------------------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# 7. Build output&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# ------------------------------------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; result_pdf &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; pd.&lt;/SPAN&gt;&lt;SPAN&gt;DataFrame&lt;/SPAN&gt;&lt;SPAN&gt;(results)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# Optional: write to Delta table&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt;&lt;SPAN&gt; output_delta_table:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; (&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; spark.&lt;/SPAN&gt;&lt;SPAN&gt;createDataFrame&lt;/SPAN&gt;&lt;SPAN&gt;(result_pdf) &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# noqa: F821 &amp;nbsp;# spark injected by Databricks&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;.write&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;.&lt;/SPAN&gt;&lt;SPAN&gt;format&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"delta"&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;.&lt;/SPAN&gt;&lt;SPAN&gt;mode&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"overwrite"&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;.&lt;/SPAN&gt;&lt;SPAN&gt;option&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"overwriteSchema"&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;"true"&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;.&lt;/SPAN&gt;&lt;SPAN&gt;saveAsTable&lt;/SPAN&gt;&lt;SPAN&gt;(output_delta_table)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; )&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;print&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;f&lt;/SPAN&gt;&lt;SPAN&gt;"Results written to Delta table: &lt;/SPAN&gt;&lt;SPAN&gt;{&lt;/SPAN&gt;&lt;SPAN&gt;output_delta_table&lt;/SPAN&gt;&lt;SPAN&gt;}&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# Optional: return Spark DataFrame&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt;&lt;SPAN&gt; output_spark &lt;/SPAN&gt;&lt;SPAN&gt;or&lt;/SPAN&gt;&lt;SPAN&gt; output_delta_table:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;return&lt;/SPAN&gt;&lt;SPAN&gt; spark.&lt;/SPAN&gt;&lt;SPAN&gt;createDataFrame&lt;/SPAN&gt;&lt;SPAN&gt;(result_pdf) &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;# noqa: F821&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;return&lt;/SPAN&gt;&lt;SPAN&gt; result_pdf&lt;/SPAN&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
    <pubDate>Wed, 04 Mar 2026 00:30:59 GMT</pubDate>
    <dc:creator>Jake3</dc:creator>
    <dc:date>2026-03-04T00:30:59Z</dc:date>
    <item>
      <title>optimizing my databricks code</title>
      <link>https://community.databricks.com/t5/data-engineering/optimizing-my-databricks-code/m-p/149743#M53169</link>
      <description>&lt;P&gt;I have the following code in databricks under serverless and i want to know how to improve it to make it more efficient and run faster without having the results change (standard errors change slightly when i try to improve it):&amp;nbsp;&lt;/P&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;# Databricks Serverless - Taylor Row Percent&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;# ============================================================&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;# Runs on Databricks Serverless (pandas on Spark / pandas UDF pattern).&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;# The core estimation logic is kept in pandas because Taylor&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;# linearisation and survey-variance math has no native Spark equivalent.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;#&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;# Usage&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;# -----&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;# Input &amp;nbsp;: pass a pandas DataFrame directly, OR a Spark DataFrame&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;# &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;(it will be converted automatically via .toPandas()).&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;# Output : pandas DataFrame (call display() or wrap in spark.createDataFrame()&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;# &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;if you need a Spark / Delta result).&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;# ============================================================&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;from&lt;/SPAN&gt; &lt;SPAN&gt;__future__&lt;/SPAN&gt; &lt;SPAN&gt;import&lt;/SPAN&gt;&lt;SPAN&gt; annotations&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;import&lt;/SPAN&gt;&lt;SPAN&gt; numpy &lt;/SPAN&gt;&lt;SPAN&gt;as&lt;/SPAN&gt;&lt;SPAN&gt; np&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;import&lt;/SPAN&gt;&lt;SPAN&gt; pandas &lt;/SPAN&gt;&lt;SPAN&gt;as&lt;/SPAN&gt;&lt;SPAN&gt; pd&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;from&lt;/SPAN&gt;&lt;SPAN&gt; itertools &lt;/SPAN&gt;&lt;SPAN&gt;import&lt;/SPAN&gt;&lt;SPAN&gt; product&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;from&lt;/SPAN&gt;&lt;SPAN&gt; typing &lt;/SPAN&gt;&lt;SPAN&gt;import&lt;/SPAN&gt;&lt;SPAN&gt; Dict, List, Optional, Union&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;# Databricks / PySpark imports (available on every Serverless cluster)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;from&lt;/SPAN&gt;&lt;SPAN&gt; pyspark.sql &lt;/SPAN&gt;&lt;SPAN&gt;import&lt;/SPAN&gt;&lt;SPAN&gt; DataFrame &lt;/SPAN&gt;&lt;SPAN&gt;as&lt;/SPAN&gt;&lt;SPAN&gt; SparkDataFrame&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;# ------------------------------------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;# Helper: accept either a Spark or pandas DataFrame&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;# ------------------------------------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;def&lt;/SPAN&gt; &lt;SPAN&gt;_to_pandas&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;df&lt;/SPAN&gt;&lt;SPAN&gt;: Union[pd.DataFrame, SparkDataFrame]) -&amp;gt; pd.DataFrame:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"""&lt;/SPAN&gt;&lt;SPAN&gt;Convert Spark DataFrame → pandas if needed.&lt;/SPAN&gt;&lt;SPAN&gt;"""&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt; &lt;SPAN&gt;isinstance&lt;/SPAN&gt;&lt;SPAN&gt;(df, SparkDataFrame):&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;return&lt;/SPAN&gt;&lt;SPAN&gt; df.&lt;/SPAN&gt;&lt;SPAN&gt;toPandas&lt;/SPAN&gt;&lt;SPAN&gt;()&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;return&lt;/SPAN&gt;&lt;SPAN&gt; df.&lt;/SPAN&gt;&lt;SPAN&gt;copy&lt;/SPAN&gt;&lt;SPAN&gt;()&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;# ------------------------------------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;# Main function&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;# ------------------------------------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;def&lt;/SPAN&gt; &lt;SPAN&gt;taylor_row_percent&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;df&lt;/SPAN&gt;&lt;SPAN&gt;: Union[pd.DataFrame, SparkDataFrame],&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;row_domains&lt;/SPAN&gt;&lt;SPAN&gt;: List[&lt;/SPAN&gt;&lt;SPAN&gt;str&lt;/SPAN&gt;&lt;SPAN&gt;],&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;measure&lt;/SPAN&gt;&lt;SPAN&gt;: &lt;/SPAN&gt;&lt;SPAN&gt;str&lt;/SPAN&gt;&lt;SPAN&gt;,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;exclusions&lt;/SPAN&gt;&lt;SPAN&gt;: Dict[&lt;/SPAN&gt;&lt;SPAN&gt;str&lt;/SPAN&gt;&lt;SPAN&gt;, List],&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;weight_col&lt;/SPAN&gt;&lt;SPAN&gt;: Optional[&lt;/SPAN&gt;&lt;SPAN&gt;str&lt;/SPAN&gt;&lt;SPAN&gt;] &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;None&lt;/SPAN&gt;&lt;SPAN&gt;,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;strata_col&lt;/SPAN&gt;&lt;SPAN&gt;: Optional[&lt;/SPAN&gt;&lt;SPAN&gt;str&lt;/SPAN&gt;&lt;SPAN&gt;] &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;None&lt;/SPAN&gt;&lt;SPAN&gt;,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;fpc&lt;/SPAN&gt;&lt;SPAN&gt;: Optional[Union[Dict[&lt;/SPAN&gt;&lt;SPAN&gt;object&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;float&lt;/SPAN&gt;&lt;SPAN&gt;], &lt;/SPAN&gt;&lt;SPAN&gt;str&lt;/SPAN&gt;&lt;SPAN&gt;]] &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;None&lt;/SPAN&gt;&lt;SPAN&gt;,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;p_01_adjust&lt;/SPAN&gt;&lt;SPAN&gt;: Optional[&lt;/SPAN&gt;&lt;SPAN&gt;float&lt;/SPAN&gt;&lt;SPAN&gt;] &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;1.0&lt;/SPAN&gt;&lt;SPAN&gt;,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;zcritical&lt;/SPAN&gt;&lt;SPAN&gt;: &lt;/SPAN&gt;&lt;SPAN&gt;float&lt;/SPAN&gt; &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;1.96&lt;/SPAN&gt;&lt;SPAN&gt;,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;output_spark&lt;/SPAN&gt;&lt;SPAN&gt;: &lt;/SPAN&gt;&lt;SPAN&gt;bool&lt;/SPAN&gt; &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;False&lt;/SPAN&gt;&lt;SPAN&gt;, &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;# set True to get a Spark DataFrame back&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;output_delta_table&lt;/SPAN&gt;&lt;SPAN&gt;: Optional[&lt;/SPAN&gt;&lt;SPAN&gt;str&lt;/SPAN&gt;&lt;SPAN&gt;] &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;None&lt;/SPAN&gt;&lt;SPAN&gt;, &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;# e.g. "catalog.schema.table"&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;) -&amp;gt; Union[pd.DataFrame, SparkDataFrame]:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"""&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; Taylor-linearisation row-percent estimator.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; Parameters&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; ----------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; df : pandas or Spark DataFrame&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; row_domains : columns that define the row grouping&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; measure : column whose value distribution is estimated&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; exclusions : {col: [values_to_exclude]} &amp;nbsp;– NaN supported&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; weight_col : survey weight column (None → unweighted)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; strata_col : stratification column (None → single stratum)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; fpc : finite-population correction – dict {stratum: N} or column name&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; p_01_adjust : Agresti-Coull scaling factor for p=0/1 cells&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; zcritical : z-value for confidence interval (default 1.96)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; output_spark : if True, return a Spark DataFrame instead of pandas&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; output_delta_table : if set, write result to this Delta table and return it&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"""&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# ------------------------------------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# 0. Normalise input&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# ------------------------------------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; df &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;_to_pandas&lt;/SPAN&gt;&lt;SPAN&gt;(df)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# ------------------------------------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# 1. Weight handling&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# ------------------------------------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt;&lt;SPAN&gt; weight_col &lt;/SPAN&gt;&lt;SPAN&gt;is&lt;/SPAN&gt; &lt;SPAN&gt;None&lt;/SPAN&gt;&lt;SPAN&gt;:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; df[&lt;/SPAN&gt;&lt;SPAN&gt;"_w"&lt;/SPAN&gt;&lt;SPAN&gt;] &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;1.0&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; weight_col &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;"_w"&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# ------------------------------------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# 2. Strata handling&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# ------------------------------------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt;&lt;SPAN&gt; strata_col &lt;/SPAN&gt;&lt;SPAN&gt;is&lt;/SPAN&gt; &lt;SPAN&gt;None&lt;/SPAN&gt;&lt;SPAN&gt;:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; df[&lt;/SPAN&gt;&lt;SPAN&gt;"_strata"&lt;/SPAN&gt;&lt;SPAN&gt;] &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;"all"&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; strata_col &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;"_strata"&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; fpc &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;None&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# ------------------------------------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# 3. FPC handling&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# ------------------------------------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt;&lt;SPAN&gt; fpc &lt;/SPAN&gt;&lt;SPAN&gt;is&lt;/SPAN&gt; &lt;SPAN&gt;None&lt;/SPAN&gt;&lt;SPAN&gt;:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; df[&lt;/SPAN&gt;&lt;SPAN&gt;"_N_h"&lt;/SPAN&gt;&lt;SPAN&gt;] &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; np.inf&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; fpc_col &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;"_N_h"&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;elif&lt;/SPAN&gt; &lt;SPAN&gt;isinstance&lt;/SPAN&gt;&lt;SPAN&gt;(fpc, &lt;/SPAN&gt;&lt;SPAN&gt;dict&lt;/SPAN&gt;&lt;SPAN&gt;&lt;span class="lia-unicode-emoji" title=":disappointed_face:"&gt;😞&lt;/span&gt;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; df[&lt;/SPAN&gt;&lt;SPAN&gt;"_N_h"&lt;/SPAN&gt;&lt;SPAN&gt;] &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; df[strata_col].&lt;/SPAN&gt;&lt;SPAN&gt;map&lt;/SPAN&gt;&lt;SPAN&gt;(fpc)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; fpc_col &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;"_N_h"&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;else&lt;/SPAN&gt;&lt;SPAN&gt;:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; fpc_col &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; fpc&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# ------------------------------------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# 4. Scope flag &amp;nbsp;(mirrors SAS exclusion logic)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# ------------------------------------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; df[&lt;/SPAN&gt;&lt;SPAN&gt;"scope_fg"&lt;/SPAN&gt;&lt;SPAN&gt;] &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;1&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;for&lt;/SPAN&gt;&lt;SPAN&gt; var, excl_vals &lt;/SPAN&gt;&lt;SPAN&gt;in&lt;/SPAN&gt;&lt;SPAN&gt; exclusions.&lt;/SPAN&gt;&lt;SPAN&gt;items&lt;/SPAN&gt;&lt;SPAN&gt;():&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;for&lt;/SPAN&gt;&lt;SPAN&gt; val &lt;/SPAN&gt;&lt;SPAN&gt;in&lt;/SPAN&gt;&lt;SPAN&gt; excl_vals:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt;&lt;SPAN&gt; pd.&lt;/SPAN&gt;&lt;SPAN&gt;isna&lt;/SPAN&gt;&lt;SPAN&gt;(val):&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; df.loc[df[var].&lt;/SPAN&gt;&lt;SPAN&gt;isna&lt;/SPAN&gt;&lt;SPAN&gt;(), &lt;/SPAN&gt;&lt;SPAN&gt;"scope_fg"&lt;/SPAN&gt;&lt;SPAN&gt;] &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;0&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;else&lt;/SPAN&gt;&lt;SPAN&gt;:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; df.loc[df[var] &lt;/SPAN&gt;&lt;SPAN&gt;==&lt;/SPAN&gt;&lt;SPAN&gt; val, &lt;/SPAN&gt;&lt;SPAN&gt;"scope_fg"&lt;/SPAN&gt;&lt;SPAN&gt;] &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;0&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# ------------------------------------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# 5. Domain &amp;amp; measure levels &amp;nbsp;(in-scope records only)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# ------------------------------------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; domain_levels &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; {&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; c: df.loc[df[&lt;/SPAN&gt;&lt;SPAN&gt;"scope_fg"&lt;/SPAN&gt;&lt;SPAN&gt;] &lt;/SPAN&gt;&lt;SPAN&gt;==&lt;/SPAN&gt; &lt;SPAN&gt;1&lt;/SPAN&gt;&lt;SPAN&gt;, c].&lt;/SPAN&gt;&lt;SPAN&gt;dropna&lt;/SPAN&gt;&lt;SPAN&gt;().&lt;/SPAN&gt;&lt;SPAN&gt;unique&lt;/SPAN&gt;&lt;SPAN&gt;()&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;for&lt;/SPAN&gt;&lt;SPAN&gt; c &lt;/SPAN&gt;&lt;SPAN&gt;in&lt;/SPAN&gt;&lt;SPAN&gt; row_domains&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; }&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; measure_levels &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; df.loc[df[&lt;/SPAN&gt;&lt;SPAN&gt;"scope_fg"&lt;/SPAN&gt;&lt;SPAN&gt;] &lt;/SPAN&gt;&lt;SPAN&gt;==&lt;/SPAN&gt; &lt;SPAN&gt;1&lt;/SPAN&gt;&lt;SPAN&gt;, measure].&lt;/SPAN&gt;&lt;SPAN&gt;dropna&lt;/SPAN&gt;&lt;SPAN&gt;().&lt;/SPAN&gt;&lt;SPAN&gt;unique&lt;/SPAN&gt;&lt;SPAN&gt;()&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; all_rows &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;list&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;product&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt;&lt;SPAN&gt;domain_levels.&lt;/SPAN&gt;&lt;SPAN&gt;values&lt;/SPAN&gt;&lt;SPAN&gt;(), measure_levels))&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; results &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; []&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# ------------------------------------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# 6. Main estimation loop&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# ------------------------------------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;for&lt;/SPAN&gt;&lt;SPAN&gt; combo &lt;/SPAN&gt;&lt;SPAN&gt;in&lt;/SPAN&gt;&lt;SPAN&gt; all_rows:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; dom_vals &amp;nbsp; &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; combo[:&lt;/SPAN&gt;&lt;SPAN&gt;-&lt;/SPAN&gt;&lt;SPAN&gt;1&lt;/SPAN&gt;&lt;SPAN&gt;]&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; measure_val &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; combo[&lt;/SPAN&gt;&lt;SPAN&gt;-&lt;/SPAN&gt;&lt;SPAN&gt;1&lt;/SPAN&gt;&lt;SPAN&gt;]&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# Domain mask (scope == 1 only)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; mask_row &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; df[&lt;/SPAN&gt;&lt;SPAN&gt;"scope_fg"&lt;/SPAN&gt;&lt;SPAN&gt;] &lt;/SPAN&gt;&lt;SPAN&gt;==&lt;/SPAN&gt; &lt;SPAN&gt;1&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;for&lt;/SPAN&gt;&lt;SPAN&gt; c, v &lt;/SPAN&gt;&lt;SPAN&gt;in&lt;/SPAN&gt; &lt;SPAN&gt;zip&lt;/SPAN&gt;&lt;SPAN&gt;(row_domains, dom_vals):&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; mask_row &lt;/SPAN&gt;&lt;SPAN&gt;&amp;amp;=&lt;/SPAN&gt;&lt;SPAN&gt; (df[c] &lt;/SPAN&gt;&lt;SPAN&gt;==&lt;/SPAN&gt;&lt;SPAN&gt; v)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; mask_cell &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; mask_row &lt;/SPAN&gt;&lt;SPAN&gt;&amp;amp;&lt;/SPAN&gt;&lt;SPAN&gt; (df[measure] &lt;/SPAN&gt;&lt;SPAN&gt;==&lt;/SPAN&gt;&lt;SPAN&gt; measure_val)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; n_row &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;int&lt;/SPAN&gt;&lt;SPAN&gt;(mask_row.&lt;/SPAN&gt;&lt;SPAN&gt;sum&lt;/SPAN&gt;&lt;SPAN&gt;())&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; n_cell &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;int&lt;/SPAN&gt;&lt;SPAN&gt;(mask_cell.&lt;/SPAN&gt;&lt;SPAN&gt;sum&lt;/SPAN&gt;&lt;SPAN&gt;())&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# ---- domain empty ----------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt;&lt;SPAN&gt; n_row &lt;/SPAN&gt;&lt;SPAN&gt;==&lt;/SPAN&gt; &lt;SPAN&gt;0&lt;/SPAN&gt;&lt;SPAN&gt;:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; p_hat &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; np.nan&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; se_p &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; np.nan&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;else&lt;/SPAN&gt;&lt;SPAN&gt;:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; W_row &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; df.loc[mask_row, &amp;nbsp;weight_col].&lt;/SPAN&gt;&lt;SPAN&gt;sum&lt;/SPAN&gt;&lt;SPAN&gt;()&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; W_cell &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; df.loc[mask_cell, weight_col].&lt;/SPAN&gt;&lt;SPAN&gt;sum&lt;/SPAN&gt;&lt;SPAN&gt;()&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; p_hat &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; W_cell &lt;/SPAN&gt;&lt;SPAN&gt;/&lt;/SPAN&gt;&lt;SPAN&gt; W_row &lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt;&lt;SPAN&gt; W_row &lt;/SPAN&gt;&lt;SPAN&gt;&amp;gt;&lt;/SPAN&gt; &lt;SPAN&gt;0&lt;/SPAN&gt; &lt;SPAN&gt;else&lt;/SPAN&gt;&lt;SPAN&gt; np.nan&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# Taylor linearisation&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; df2 &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; df.&lt;/SPAN&gt;&lt;SPAN&gt;copy&lt;/SPAN&gt;&lt;SPAN&gt;()&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; df2[&lt;/SPAN&gt;&lt;SPAN&gt;"R"&lt;/SPAN&gt;&lt;SPAN&gt;] &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; mask_row.&lt;/SPAN&gt;&lt;SPAN&gt;astype&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;float&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; df2[&lt;/SPAN&gt;&lt;SPAN&gt;"C"&lt;/SPAN&gt;&lt;SPAN&gt;] &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; mask_cell.&lt;/SPAN&gt;&lt;SPAN&gt;astype&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;float&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; df2[&lt;/SPAN&gt;&lt;SPAN&gt;"u"&lt;/SPAN&gt;&lt;SPAN&gt;] &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; (df2[weight_col] &lt;/SPAN&gt;&lt;SPAN&gt;/&lt;/SPAN&gt;&lt;SPAN&gt; W_row) &lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt;&lt;SPAN&gt; (&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; df2[&lt;/SPAN&gt;&lt;SPAN&gt;"C"&lt;/SPAN&gt;&lt;SPAN&gt;] &lt;/SPAN&gt;&lt;SPAN&gt;-&lt;/SPAN&gt;&lt;SPAN&gt; p_hat &lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt;&lt;SPAN&gt; df2[&lt;/SPAN&gt;&lt;SPAN&gt;"R"&lt;/SPAN&gt;&lt;SPAN&gt;]&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; )&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; var_p &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;0.0&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;for&lt;/SPAN&gt;&lt;SPAN&gt; h, g &lt;/SPAN&gt;&lt;SPAN&gt;in&lt;/SPAN&gt;&lt;SPAN&gt; df2.&lt;/SPAN&gt;&lt;SPAN&gt;groupby&lt;/SPAN&gt;&lt;SPAN&gt;(strata_col):&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; n_h &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;len&lt;/SPAN&gt;&lt;SPAN&gt;(g)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt;&lt;SPAN&gt; n_h &lt;/SPAN&gt;&lt;SPAN&gt;&amp;lt;=&lt;/SPAN&gt; &lt;SPAN&gt;1&lt;/SPAN&gt;&lt;SPAN&gt;:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;continue&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; N_h &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; g[fpc_col].iloc[&lt;/SPAN&gt;&lt;SPAN&gt;0&lt;/SPAN&gt;&lt;SPAN&gt;]&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; f_h &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; n_h &lt;/SPAN&gt;&lt;SPAN&gt;/&lt;/SPAN&gt;&lt;SPAN&gt; N_h &lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt;&lt;SPAN&gt; np.&lt;/SPAN&gt;&lt;SPAN&gt;isfinite&lt;/SPAN&gt;&lt;SPAN&gt;(N_h) &lt;/SPAN&gt;&lt;SPAN&gt;else&lt;/SPAN&gt; &lt;SPAN&gt;0.0&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; S2 &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; g[&lt;/SPAN&gt;&lt;SPAN&gt;"u"&lt;/SPAN&gt;&lt;SPAN&gt;].&lt;/SPAN&gt;&lt;SPAN&gt;var&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;ddof&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt;1&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; var_p &lt;/SPAN&gt;&lt;SPAN&gt;+=&lt;/SPAN&gt;&lt;SPAN&gt; (&lt;/SPAN&gt;&lt;SPAN&gt;1&lt;/SPAN&gt; &lt;SPAN&gt;-&lt;/SPAN&gt;&lt;SPAN&gt; f_h) &lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt;&lt;SPAN&gt; n_h &lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt;&lt;SPAN&gt; S2&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; se_p &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; np.&lt;/SPAN&gt;&lt;SPAN&gt;sqrt&lt;/SPAN&gt;&lt;SPAN&gt;(var_p) &lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt;&lt;SPAN&gt; var_p &lt;/SPAN&gt;&lt;SPAN&gt;&amp;gt;&lt;/SPAN&gt; &lt;SPAN&gt;0&lt;/SPAN&gt; &lt;SPAN&gt;else&lt;/SPAN&gt;&lt;SPAN&gt; np.nan&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# ---- Agresti-Coull &amp;nbsp;(SAS exact logic) --------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; adj_p &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; p_hat&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; adj_se &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; se_p&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt;&lt;SPAN&gt; (&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; p_01_adjust &lt;/SPAN&gt;&lt;SPAN&gt;is&lt;/SPAN&gt; &lt;SPAN&gt;not&lt;/SPAN&gt; &lt;SPAN&gt;None&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;and&lt;/SPAN&gt; &lt;SPAN&gt;not&lt;/SPAN&gt;&lt;SPAN&gt; np.&lt;/SPAN&gt;&lt;SPAN&gt;isnan&lt;/SPAN&gt;&lt;SPAN&gt;(p_hat)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;and&lt;/SPAN&gt;&lt;SPAN&gt; p_hat &lt;/SPAN&gt;&lt;SPAN&gt;in&lt;/SPAN&gt;&lt;SPAN&gt; (&lt;/SPAN&gt;&lt;SPAN&gt;0.0&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;1.0&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;and&lt;/SPAN&gt;&lt;SPAN&gt; n_row &lt;/SPAN&gt;&lt;SPAN&gt;&amp;gt;&lt;/SPAN&gt; &lt;SPAN&gt;0&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;span class="lia-unicode-emoji" title=":disappointed_face:"&gt;😞&lt;/span&gt;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; adj_p &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; (n_cell &lt;/SPAN&gt;&lt;SPAN&gt;+&lt;/SPAN&gt; &lt;SPAN&gt;2&lt;/SPAN&gt;&lt;SPAN&gt;) &lt;/SPAN&gt;&lt;SPAN&gt;/&lt;/SPAN&gt;&lt;SPAN&gt; (n_row &lt;/SPAN&gt;&lt;SPAN&gt;+&lt;/SPAN&gt; &lt;SPAN&gt;4&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; adj_se &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; np.&lt;/SPAN&gt;&lt;SPAN&gt;sqrt&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; adj_p &lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt;&lt;SPAN&gt; (&lt;/SPAN&gt;&lt;SPAN&gt;1&lt;/SPAN&gt; &lt;SPAN&gt;-&lt;/SPAN&gt;&lt;SPAN&gt; adj_p) &lt;/SPAN&gt;&lt;SPAN&gt;/&lt;/SPAN&gt;&lt;SPAN&gt; (n_row &lt;/SPAN&gt;&lt;SPAN&gt;+&lt;/SPAN&gt; &lt;SPAN&gt;4&lt;/SPAN&gt;&lt;SPAN&gt;) &lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt;&lt;SPAN&gt; p_01_adjust&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; )&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# ---- Final statistics ------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; rowpercent &amp;nbsp; &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; p_hat &lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt; &lt;SPAN&gt;100&lt;/SPAN&gt; &lt;SPAN&gt;if&lt;/SPAN&gt; &lt;SPAN&gt;not&lt;/SPAN&gt;&lt;SPAN&gt; np.&lt;/SPAN&gt;&lt;SPAN&gt;isnan&lt;/SPAN&gt;&lt;SPAN&gt;(p_hat) &lt;/SPAN&gt;&lt;SPAN&gt;else&lt;/SPAN&gt;&lt;SPAN&gt; np.nan&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; rowstderr &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; se_p &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt; &lt;SPAN&gt;100&lt;/SPAN&gt; &lt;SPAN&gt;if&lt;/SPAN&gt; &lt;SPAN&gt;not&lt;/SPAN&gt;&lt;SPAN&gt; np.&lt;/SPAN&gt;&lt;SPAN&gt;isnan&lt;/SPAN&gt;&lt;SPAN&gt;(se_p) &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;else&lt;/SPAN&gt;&lt;SPAN&gt; np.nan&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; adj_rowpercent &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; adj_p &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt; &lt;SPAN&gt;100&lt;/SPAN&gt; &lt;SPAN&gt;if&lt;/SPAN&gt; &lt;SPAN&gt;not&lt;/SPAN&gt;&lt;SPAN&gt; np.&lt;/SPAN&gt;&lt;SPAN&gt;isnan&lt;/SPAN&gt;&lt;SPAN&gt;(adj_p) &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;else&lt;/SPAN&gt;&lt;SPAN&gt; np.nan&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; adj_rowstderr &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; adj_se &lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt; &lt;SPAN&gt;100&lt;/SPAN&gt; &lt;SPAN&gt;if&lt;/SPAN&gt; &lt;SPAN&gt;not&lt;/SPAN&gt;&lt;SPAN&gt; np.&lt;/SPAN&gt;&lt;SPAN&gt;isnan&lt;/SPAN&gt;&lt;SPAN&gt;(adj_se) &lt;/SPAN&gt;&lt;SPAN&gt;else&lt;/SPAN&gt;&lt;SPAN&gt; np.nan&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; moe &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; zcritical &lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt;&lt;SPAN&gt; adj_rowstderr &lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt; &lt;SPAN&gt;not&lt;/SPAN&gt;&lt;SPAN&gt; np.&lt;/SPAN&gt;&lt;SPAN&gt;isnan&lt;/SPAN&gt;&lt;SPAN&gt;(adj_rowstderr) &lt;/SPAN&gt;&lt;SPAN&gt;else&lt;/SPAN&gt;&lt;SPAN&gt; np.nan&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; lower &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; rowpercent &lt;/SPAN&gt;&lt;SPAN&gt;-&lt;/SPAN&gt;&lt;SPAN&gt; moe &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt; &lt;SPAN&gt;not&lt;/SPAN&gt;&lt;SPAN&gt; np.&lt;/SPAN&gt;&lt;SPAN&gt;isnan&lt;/SPAN&gt;&lt;SPAN&gt;(moe) &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;else&lt;/SPAN&gt;&lt;SPAN&gt; np.nan&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; upper &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; rowpercent &lt;/SPAN&gt;&lt;SPAN&gt;+&lt;/SPAN&gt;&lt;SPAN&gt; moe &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt; &lt;SPAN&gt;not&lt;/SPAN&gt;&lt;SPAN&gt; np.&lt;/SPAN&gt;&lt;SPAN&gt;isnan&lt;/SPAN&gt;&lt;SPAN&gt;(moe) &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;else&lt;/SPAN&gt;&lt;SPAN&gt; np.nan&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; rse &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; (&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; (adj_rowstderr &lt;/SPAN&gt;&lt;SPAN&gt;/&lt;/SPAN&gt;&lt;SPAN&gt; adj_rowpercent) &lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt; &lt;SPAN&gt;100&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt;&lt;SPAN&gt; adj_rowpercent &lt;/SPAN&gt;&lt;SPAN&gt;not&lt;/SPAN&gt; &lt;SPAN&gt;in&lt;/SPAN&gt;&lt;SPAN&gt; (&lt;/SPAN&gt;&lt;SPAN&gt;0&lt;/SPAN&gt;&lt;SPAN&gt;, np.nan) &lt;/SPAN&gt;&lt;SPAN&gt;and&lt;/SPAN&gt; &lt;SPAN&gt;not&lt;/SPAN&gt;&lt;SPAN&gt; np.&lt;/SPAN&gt;&lt;SPAN&gt;isnan&lt;/SPAN&gt;&lt;SPAN&gt;(adj_rowpercent)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;else&lt;/SPAN&gt;&lt;SPAN&gt; np.nan&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; )&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# ---- Formatting &amp;nbsp;(SAS style) -----------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; percent_new &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;f&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN&gt;{&lt;/SPAN&gt;&lt;SPAN&gt;rowpercent&lt;/SPAN&gt;&lt;SPAN&gt;:4.1f}&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt; &lt;SPAN&gt;if&lt;/SPAN&gt; &lt;SPAN&gt;not&lt;/SPAN&gt;&lt;SPAN&gt; np.&lt;/SPAN&gt;&lt;SPAN&gt;isnan&lt;/SPAN&gt;&lt;SPAN&gt;(rowpercent) &lt;/SPAN&gt;&lt;SPAN&gt;else&lt;/SPAN&gt; &lt;SPAN&gt;""&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; moe_new &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;f&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN&gt;{&lt;/SPAN&gt;&lt;SPAN&gt;moe&lt;/SPAN&gt;&lt;SPAN&gt;:4.1f}&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt; &lt;SPAN&gt;not&lt;/SPAN&gt;&lt;SPAN&gt; np.&lt;/SPAN&gt;&lt;SPAN&gt;isnan&lt;/SPAN&gt;&lt;SPAN&gt;(moe) &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;else&lt;/SPAN&gt; &lt;SPAN&gt;""&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; lower_new &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;f&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN&gt;{&lt;/SPAN&gt;&lt;SPAN&gt;max&lt;/SPAN&gt;&lt;SPAN&gt;(lower, &lt;/SPAN&gt;&lt;SPAN&gt;0&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;SPAN&gt;:4.1f}&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN&gt; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt; &lt;SPAN&gt;not&lt;/SPAN&gt;&lt;SPAN&gt; np.&lt;/SPAN&gt;&lt;SPAN&gt;isnan&lt;/SPAN&gt;&lt;SPAN&gt;(lower) &lt;/SPAN&gt;&lt;SPAN&gt;else&lt;/SPAN&gt; &lt;SPAN&gt;""&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; upper_new &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;f&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN&gt;{&lt;/SPAN&gt;&lt;SPAN&gt;min&lt;/SPAN&gt;&lt;SPAN&gt;(upper, &lt;/SPAN&gt;&lt;SPAN&gt;100&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;SPAN&gt;:4.1f}&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt; &lt;SPAN&gt;if&lt;/SPAN&gt; &lt;SPAN&gt;not&lt;/SPAN&gt;&lt;SPAN&gt; np.&lt;/SPAN&gt;&lt;SPAN&gt;isnan&lt;/SPAN&gt;&lt;SPAN&gt;(upper) &lt;/SPAN&gt;&lt;SPAN&gt;else&lt;/SPAN&gt; &lt;SPAN&gt;""&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; rse_new &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;f&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN&gt;{&lt;/SPAN&gt;&lt;SPAN&gt;rse&lt;/SPAN&gt;&lt;SPAN&gt;:4.1f}&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt; &lt;SPAN&gt;not&lt;/SPAN&gt;&lt;SPAN&gt; np.&lt;/SPAN&gt;&lt;SPAN&gt;isnan&lt;/SPAN&gt;&lt;SPAN&gt;(rse) &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;else&lt;/SPAN&gt; &lt;SPAN&gt;""&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# Star rule&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt; &lt;SPAN&gt;not&lt;/SPAN&gt;&lt;SPAN&gt; np.&lt;/SPAN&gt;&lt;SPAN&gt;isnan&lt;/SPAN&gt;&lt;SPAN&gt;(moe) &lt;/SPAN&gt;&lt;SPAN&gt;and&lt;/SPAN&gt;&lt;SPAN&gt; moe &lt;/SPAN&gt;&lt;SPAN&gt;&amp;gt;=&lt;/SPAN&gt; &lt;SPAN&gt;10&lt;/SPAN&gt;&lt;SPAN&gt;:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; percent_new &lt;/SPAN&gt;&lt;SPAN&gt;+=&lt;/SPAN&gt; &lt;SPAN&gt;"*"&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# ---- Suppression &amp;nbsp;(SOS rules) ----------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt;&lt;SPAN&gt; n_row &lt;/SPAN&gt;&lt;SPAN&gt;==&lt;/SPAN&gt; &lt;SPAN&gt;0&lt;/SPAN&gt;&lt;SPAN&gt;:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; percent_new &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; moe_new &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; lower_new &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; upper_new &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; rse_new &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;"na"&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;elif&lt;/SPAN&gt;&lt;SPAN&gt; n_row &lt;/SPAN&gt;&lt;SPAN&gt;in&lt;/SPAN&gt; &lt;SPAN&gt;range&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;1&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;6&lt;/SPAN&gt;&lt;SPAN&gt;&lt;span class="lia-unicode-emoji" title=":disappointed_face:"&gt;😞&lt;/span&gt;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; percent_new &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; moe_new &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; lower_new &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; upper_new &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; rse_new &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;"np"&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# ---- Store row -------------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; row &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;dict&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;zip&lt;/SPAN&gt;&lt;SPAN&gt;(row_domains, dom_vals))&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; row[measure] &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; measure_val&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; row.&lt;/SPAN&gt;&lt;SPAN&gt;update&lt;/SPAN&gt;&lt;SPAN&gt;({&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"domain_frequency"&lt;/SPAN&gt;&lt;SPAN&gt;: n_row,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"Frequency"&lt;/SPAN&gt;&lt;SPAN&gt;: &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;n_cell,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"RowPercent"&lt;/SPAN&gt;&lt;SPAN&gt;: &amp;nbsp; &amp;nbsp; &amp;nbsp; rowpercent,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"RowStdErr"&lt;/SPAN&gt;&lt;SPAN&gt;: &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;rowstderr,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"adj_rowpercent"&lt;/SPAN&gt;&lt;SPAN&gt;: &amp;nbsp; adj_rowpercent,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"adj_rowstderr"&lt;/SPAN&gt;&lt;SPAN&gt;: &amp;nbsp; &amp;nbsp;adj_rowstderr,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"moe"&lt;/SPAN&gt;&lt;SPAN&gt;: &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;moe,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"lower"&lt;/SPAN&gt;&lt;SPAN&gt;: &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;lower,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"upper"&lt;/SPAN&gt;&lt;SPAN&gt;: &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;upper,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"rse"&lt;/SPAN&gt;&lt;SPAN&gt;: &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;rse,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"percent_new"&lt;/SPAN&gt;&lt;SPAN&gt;: &amp;nbsp; &amp;nbsp; &amp;nbsp;percent_new,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"moe_new"&lt;/SPAN&gt;&lt;SPAN&gt;: &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;moe_new,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"lower_new"&lt;/SPAN&gt;&lt;SPAN&gt;: &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;lower_new,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"upper_new"&lt;/SPAN&gt;&lt;SPAN&gt;: &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;upper_new,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"rse_new"&lt;/SPAN&gt;&lt;SPAN&gt;: &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;rse_new,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; })&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; results.&lt;/SPAN&gt;&lt;SPAN&gt;append&lt;/SPAN&gt;&lt;SPAN&gt;(row)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# ------------------------------------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# 7. Build output&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# ------------------------------------------------------------------&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; result_pdf &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; pd.&lt;/SPAN&gt;&lt;SPAN&gt;DataFrame&lt;/SPAN&gt;&lt;SPAN&gt;(results)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# Optional: write to Delta table&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt;&lt;SPAN&gt; output_delta_table:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; (&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; spark.&lt;/SPAN&gt;&lt;SPAN&gt;createDataFrame&lt;/SPAN&gt;&lt;SPAN&gt;(result_pdf) &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# noqa: F821 &amp;nbsp;# spark injected by Databricks&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;.write&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;.&lt;/SPAN&gt;&lt;SPAN&gt;format&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"delta"&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;.&lt;/SPAN&gt;&lt;SPAN&gt;mode&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"overwrite"&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;.&lt;/SPAN&gt;&lt;SPAN&gt;option&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"overwriteSchema"&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;"true"&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;.&lt;/SPAN&gt;&lt;SPAN&gt;saveAsTable&lt;/SPAN&gt;&lt;SPAN&gt;(output_delta_table)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; )&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;print&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;f&lt;/SPAN&gt;&lt;SPAN&gt;"Results written to Delta table: &lt;/SPAN&gt;&lt;SPAN&gt;{&lt;/SPAN&gt;&lt;SPAN&gt;output_delta_table&lt;/SPAN&gt;&lt;SPAN&gt;}&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# Optional: return Spark DataFrame&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt;&lt;SPAN&gt; output_spark &lt;/SPAN&gt;&lt;SPAN&gt;or&lt;/SPAN&gt;&lt;SPAN&gt; output_delta_table:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;return&lt;/SPAN&gt;&lt;SPAN&gt; spark.&lt;/SPAN&gt;&lt;SPAN&gt;createDataFrame&lt;/SPAN&gt;&lt;SPAN&gt;(result_pdf) &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;# noqa: F821&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;return&lt;/SPAN&gt;&lt;SPAN&gt; result_pdf&lt;/SPAN&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Wed, 04 Mar 2026 00:30:59 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/optimizing-my-databricks-code/m-p/149743#M53169</guid>
      <dc:creator>Jake3</dc:creator>
      <dc:date>2026-03-04T00:30:59Z</dc:date>
    </item>
    <item>
      <title>Re: optimizing my databricks code</title>
      <link>https://community.databricks.com/t5/data-engineering/optimizing-my-databricks-code/m-p/149788#M53181</link>
      <description>&lt;P&gt;Greetings&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/218516"&gt;@Jake3&lt;/a&gt;&amp;nbsp;,&amp;nbsp;&lt;/P&gt;
&lt;P class="p1"&gt;Most of the runtime here is coming from two things:&lt;/P&gt;
&lt;OL start="1"&gt;
&lt;LI&gt;
&lt;P class="p1"&gt;You’re pulling everything into single-node pandas via toPandas().&lt;/P&gt;
&lt;/LI&gt;
&lt;LI&gt;
&lt;P class="p1"&gt;You’re re-doing a lot of per-combination work inside the big for combo in all_rows loop, including a full df.copy() and repeated groupbys.&lt;/P&gt;
&lt;/LI&gt;
&lt;/OL&gt;
&lt;P class="p1"&gt;Below are changes that keep the same estimator but reduce work. I’m going to separate “safe” optimizations (no change in logic) from “aggressive” ones (can slightly change displayed output or shape).&lt;/P&gt;
&lt;P class="p2"&gt;&amp;nbsp;&lt;/P&gt;
&lt;OL start="1"&gt;
&lt;LI&gt;
&lt;P class="p1"&gt;Push as much as possible to Spark before toPandas()&lt;/P&gt;
&lt;/LI&gt;
&lt;/OL&gt;
&lt;P class="p1"&gt;If df is a Spark DataFrame and the input table is large, make sure you shrink it before calling this function:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;# Example wrapper before calling taylor_row_percent

needed_cols = set(row_domains + [measure] + list(exclusions.keys()))
if weight_col:
    needed_cols.add(weight_col)
if strata_col:
    needed_cols.add(strata_col)
if isinstance(fpc, str):
    needed_cols.add(fpc)

sdf_small = sdf.select(*needed_cols)
result = taylor_row_percent(
    sdf_small, row_domains, measure, exclusions,
    weight_col=weight_col, strata_col=strata_col, fpc=fpc, ...
)&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P class="p1"&gt;This keeps the math identical, but reduces bytes transferred and processed in pandas.&lt;/P&gt;
&lt;OL start="2"&gt;
&lt;LI&gt;
&lt;P class="p1"&gt;Don’t copy the whole DataFrame for every combo&lt;/P&gt;
&lt;/LI&gt;
&lt;/OL&gt;
&lt;P class="p1"&gt;Inside the main loop you do:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;df2 = df.copy()
df2["R"] = mask_row.astype(float)
df2["C"] = mask_cell.astype(float)
df2["u"] = (df2[weight_col] / W_row) * (df2["C"] - p_hat * df2["R"])

for h, g in df2.groupby(strata_col):
    ...&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P class="p1"&gt;That df.copy() per combination is extremely expensive.&lt;/P&gt;
&lt;P class="p2"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class="p1"&gt;You can reuse the same DataFrame and only overwrite temporary columns; that keeps the math and the order of operations the same:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;# Prepare once, after step 3 (FPC handling)
df["R"] = 0.0
df["C"] = 0.0
df["u"] = 0.0

...

for combo in all_rows:
    dom_vals    = combo[:-1]
    measure_val = combo[-1]

    mask_row = df["scope_fg"] == 1
    for c, v in zip(row_domains, dom_vals):
        mask_row &amp;amp;= (df[c] == v)

    mask_cell = mask_row &amp;amp; (df[measure] == measure_val)

    n_row  = int(mask_row.sum())
    n_cell = int(mask_cell.sum())

    if n_row == 0:
        ...
    else:
        W_row  = df.loc[mask_row,  weight_col].sum()
        W_cell = df.loc[mask_cell, weight_col].sum()
        p_hat  = W_cell / W_row if W_row &amp;gt; 0 else np.nan

        # overwrite temporary columns in-place
        df["R"] = mask_row.astype(float)
        df["C"] = mask_cell.astype(float)
        df["u"] = (df[weight_col] / W_row) * (df["C"] - p_hat * df["R"])

        var_p = 0.0
        for h, g in df.groupby(strata_col):
            n_h = len(g)
            if n_h &amp;lt;= 1:
                continue
            N_h = g[fpc_col].iloc[0]
            f_h = n_h / N_h if np.isfinite(N_h) else 0.0
            S2  = g["u"].var(ddof=1)
            var_p += (1 - f_h) * n_h * S2

        se_p = np.sqrt(var_p) if var_p &amp;gt; 0 else np.nan&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P class="p1"&gt;Key points:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;
&lt;P class="p1"&gt;We don’t change how u is computed or how variance is aggregated.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI&gt;
&lt;P class="p1"&gt;We reuse df and group on it directly.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI&gt;
&lt;P class="p1"&gt;You should get the same standard errors (modulo normal floating-point noise), but much faster and with less memory churn.&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;P class="p2"&gt;&amp;nbsp;&lt;/P&gt;
&lt;OL start="3"&gt;
&lt;LI&gt;
&lt;P class="p1"&gt;Drop out-of-scope records early (still same results)&lt;/P&gt;
&lt;/LI&gt;
&lt;/OL&gt;
&lt;P class="p1"&gt;Right now you carry all rows and always AND with scope_fg == 1. Once scope_fg is fully computed, you can safely filter to in-scope records exactly once:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;# 4. Scope flag  (same as your code up to this point)
df["scope_fg"] = 1
for var, excl_vals in exclusions.items():
    for val in excl_vals:
        if pd.isna(val):
            df.loc[df[var].isna(), "scope_fg"] = 0
        else:
            df.loc[df[var] == val, "scope_fg"] = 0

# NEW: drop out-of-scope rows once
df = df[df["scope_fg"] == 1].copy()
df.drop(columns=["scope_fg"], inplace=True)&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P class="p1"&gt;Then, later:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;
&lt;P class="p1"&gt;domain_levels and measure_levels effectively used the same filter already (scope_fg == 1).&lt;/P&gt;
&lt;/LI&gt;
&lt;LI&gt;
&lt;P class="p1"&gt;n_row, n_cell and all weights are computed only on these in-scope units anyway.&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;P class="p1"&gt;You’ll need to drop explicit references to scope_fg in the masks:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;# before
mask_row = df["scope_fg"] == 1
for c, v in zip(row_domains, dom_vals):
    mask_row &amp;amp;= (df[c] == v)

# after filtering to in-scope only:
mask_row = pd.Series(True, index=df.index)
for c, v in zip(row_domains, dom_vals):
    mask_row &amp;amp;= (df[c] == v)&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P class="p1"&gt;This reduces row count for all subsequent operations with no change in the estimator definition.&lt;/P&gt;
&lt;P class="p1"&gt;4. Reduce width: keep only needed columns in pandas&lt;/P&gt;
&lt;P class="p1"&gt;After you’ve handled weight_col, strata_col, fpc_col, and scope_fg, you can trim columns:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;keep_cols = set(row_domains + [measure, weight_col, strata_col, fpc_col])
keep_cols |= set(exclusions.keys())

df = df[list(keep_cols)].copy()&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P class="p1"&gt;This makes copies and groupbys cheaper without touching the math.&lt;/P&gt;
&lt;OL start="5"&gt;
&lt;LI&gt;
&lt;P class="p1"&gt;Why your SEs “change slightly” when you refactor&lt;/P&gt;
&lt;/LI&gt;
&lt;/OL&gt;
&lt;P class="p1"&gt;Whenever you:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;
&lt;P class="p1"&gt;reorder operations,&lt;/P&gt;
&lt;/LI&gt;
&lt;LI&gt;
&lt;P class="p1"&gt;change from groupby().var() to a custom variance,&lt;/P&gt;
&lt;/LI&gt;
&lt;LI&gt;
&lt;P class="p1"&gt;or change which rows participate in an intermediate computation,&lt;SPAN&gt;you can get small floating-point differences even when the algebra is identical. On survey SEs those show up as tiny differences in displayed standard error and margins of error.&lt;/SPAN&gt;&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;P class="p1"&gt;If you need exact reproduction, keep:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;
&lt;P class="p1"&gt;the same sample (df),&lt;/P&gt;
&lt;/LI&gt;
&lt;LI&gt;
&lt;P class="p1"&gt;the same grouping logic,&lt;/P&gt;
&lt;/LI&gt;
&lt;LI&gt;
&lt;P class="p1"&gt;the same ddof and variance method (.var(ddof=1)),&lt;/P&gt;
&lt;/LI&gt;
&lt;LI&gt;
&lt;P class="p1"&gt;the same order of operations (especially how you aggregate across strata).&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;P class="p3"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class="p1"&gt;The safe steps above (Spark pre-filter, dropping out-of-scope once, reusing df instead of df.copy() in the loop) preserve that.&lt;/P&gt;
&lt;OL start="6"&gt;
&lt;LI&gt;
&lt;P class="p1"&gt;More aggressive speedups (may change the output shape)&lt;/P&gt;
&lt;/LI&gt;
&lt;/OL&gt;
&lt;P class="p1"&gt;Only if you’re willing to change what rows you output, there are big speedups available:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;
&lt;P class="p1"&gt;Instead of product(*domain_levels.values(), measure_levels), drive the loop from the actual observed combinations using groupby(row_domains + [measure]).&lt;/P&gt;
&lt;/LI&gt;
&lt;LI&gt;
&lt;P class="p1"&gt;That avoids computing rows for combinations that don’t appear in the data (which currently get na/np).&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;P class="p1"&gt;This can be dramatically faster for high-cardinality domains, but your output will no longer include all “zero-cell” combinations.&lt;/P&gt;
&lt;P class="p1"&gt;If you paste how big your typical data is (rows, distinct strata, distinct domain levels), I can propose a tighter rewrite that keeps the estimator identical but gets you closer to “linear in N” rather than “N × #cells”.&lt;/P&gt;
&lt;P class="p3"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class="p1"&gt;Hope this helps, Louis.&lt;/P&gt;</description>
      <pubDate>Wed, 04 Mar 2026 17:39:48 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/optimizing-my-databricks-code/m-p/149788#M53181</guid>
      <dc:creator>Louis_Frolio</dc:creator>
      <dc:date>2026-03-04T17:39:48Z</dc:date>
    </item>
    <item>
      <title>Re: optimizing my databricks code</title>
      <link>https://community.databricks.com/t5/data-engineering/optimizing-my-databricks-code/m-p/149853#M53184</link>
      <description>&lt;P&gt;I would definitely agree with Louis on this. He provided a some great suggestions even from the code/logic perspective.&lt;/P&gt;&lt;P&gt;Couple of things I would suggest while developing logic.&amp;nbsp;&lt;/P&gt;&lt;P&gt;1. As Louis mentioned, when we use Databricks always think about Spark(Distributed compute framework), instead of Pandas. Transform your data as needed, filter out only necessary data to run your model using Taylor.&amp;nbsp;&lt;/P&gt;&lt;P&gt;2. Based on the data size, use Spark UI to check the logs and utilisation of your memory, CPU, storage this helps if you need to change your compute type. Since your whole logic is written in Pandas it's likely to be that only your Driver node is working. So, use standalone cluster instead of all purpose cluster (if you want to keep everything in Pandas). If you observe your CPU or memory utilisation 100%, then try increase the capacity of the node. I would suggest use memory optimized nodes.&lt;/P&gt;&lt;P&gt;3. As Louis mentioned divide your data into chunks and then run your model instead of feeding all the data at once.&lt;/P&gt;&lt;P&gt;Hope this helps.&amp;nbsp;&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;Saicharan&lt;/P&gt;</description>
      <pubDate>Wed, 04 Mar 2026 22:39:46 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/optimizing-my-databricks-code/m-p/149853#M53184</guid>
      <dc:creator>Shetty_1338</dc:creator>
      <dc:date>2026-03-04T22:39:46Z</dc:date>
    </item>
    <item>
      <title>Hi @Jake3, Your Taylor-linearisation row-percent estimato...</title>
      <link>https://community.databricks.com/t5/data-engineering/optimizing-my-databricks-code/m-p/150356#M53391</link>
      <description>&lt;P&gt;Hi &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/218516"&gt;@Jake3&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;Your Taylor-linearisation row-percent estimator is well structured. The main performance bottleneck is the Python-level loop over every domain/measure combination, with a full DataFrame copy (df.copy()) happening inside each iteration. Here are concrete changes that should give you a significant speedup without altering the statistical results.&lt;/P&gt;
&lt;P&gt;OPTIMIZATION 1: REMOVE THE df.copy() INSIDE THE LOOP&lt;/P&gt;
&lt;P&gt;The line df2 = df.copy() creates an entire copy of your DataFrame for every single domain/measure combination. That is the single most expensive operation in your code. Instead, compute the "u" column directly on sliced views and avoid the copy entirely:&lt;/P&gt;
&lt;PRE&gt;# Replace the df2 block with direct computation:
w_arr = df.loc[mask_row, weight_col].values
c_arr = mask_cell[mask_row].astype(float).values
r_arr = np.ones(n_row)
u_arr = (w_arr / W_row) * (c_arr - p_hat * r_arr)&lt;/PRE&gt;
&lt;P&gt;Then in the strata variance loop, group by strata within the mask_row subset rather than copying the entire DataFrame.&lt;/P&gt;
&lt;P&gt;OPTIMIZATION 2: USE groupby INSTEAD OF LOOPING OVER COMBINATIONS&lt;/P&gt;
&lt;P&gt;Instead of building all_rows via itertools.product and looping, use pandas groupby to iterate only over combinations that actually exist in your data:&lt;/P&gt;
&lt;PRE&gt;group_cols = row_domains + [measure]
grouped = df.loc[df["scope_fg"] == 1].groupby(group_cols, observed=True)&lt;/PRE&gt;
&lt;P&gt;This automatically skips empty combinations and avoids repeated Boolean mask construction for each combo. For the domain totals (W_row), pre-compute them with a separate groupby on just the row_domains:&lt;/P&gt;
&lt;PRE&gt;domain_totals = (
  df.loc[df["scope_fg"] == 1]
    .groupby(row_domains, observed=True)[weight_col]
    .sum()
)&lt;/PRE&gt;
&lt;P&gt;Then look up the domain total for each group with a simple index lookup.&lt;/P&gt;
&lt;P&gt;OPTIMIZATION 3: VECTORIZE THE STRATA VARIANCE CALCULATION&lt;/P&gt;
&lt;P&gt;The inner loop over strata can be vectorized using groupby + transform:&lt;/P&gt;
&lt;PRE&gt;# Pre-compute per-stratum stats in one pass
strata_groups = sub_df.groupby(strata_col)
n_h_map = strata_groups["u"].transform("count")
mean_u = strata_groups["u"].transform("mean")

# Variance contribution per record
sub_df["u_dev_sq"] = (sub_df["u"] - mean_u) ** 2

strata_stats = sub_df.groupby(strata_col).agg(
  n_h=("u", "count"),
  S2=("u_dev_sq", lambda x: x.sum() / (len(x) - 1) if len(x) &amp;gt; 1 else 0),
  N_h=(fpc_col, "first")
)
strata_stats["f_h"] = np.where(
  np.isfinite(strata_stats["N_h"]),
  strata_stats["n_h"] / strata_stats["N_h"],
  0.0
)
var_p = ((1 - strata_stats["f_h"]) * strata_stats["n_h"] * strata_stats["S2"]).sum()&lt;/PRE&gt;
&lt;P&gt;OPTIMIZATION 4: PRE-COMPUTE MASKS WITH CATEGORICAL COLUMNS&lt;/P&gt;
&lt;P&gt;Convert your domain and measure columns to pandas Categorical dtype before processing. This makes groupby and equality comparisons faster, especially with string columns:&lt;/P&gt;
&lt;PRE&gt;for c in row_domains + [measure]:
  df[c] = df[c].astype("category")&lt;/PRE&gt;
&lt;P&gt;OPTIMIZATION 5: AVOID .toPandas() ON LARGE DATASETS&lt;/P&gt;
&lt;P&gt;If your Spark DataFrame is very large, the .toPandas() call collects everything to the driver. Consider whether you can filter or aggregate in Spark first. For example, if you only need in-scope records:&lt;/P&gt;
&lt;PRE&gt;# Filter in Spark before collecting
spark_df = spark_df.filter(~col("some_col").isin(exclusion_values))
df = spark_df.toPandas()&lt;/PRE&gt;
&lt;P&gt;You can also enable Apache Arrow for faster Spark-to-pandas conversion:&lt;/P&gt;
&lt;PRE&gt;spark.conf.set("spark.sql.execution.arrow.pyspark.enabled", "true")&lt;/PRE&gt;
&lt;P&gt;This is typically enabled by default on serverless, but worth confirming on other compute types.&lt;/P&gt;
&lt;P&gt;PUTTING IT ALL TOGETHER&lt;/P&gt;
&lt;P&gt;Here is a refactored version of the main estimation loop that incorporates optimizations 1-3:&lt;/P&gt;
&lt;PRE&gt;# Pre-compute domain-level weighted totals
inscope = df[df["scope_fg"] == 1].copy()
for c in row_domains + [measure]:
  inscope[c] = inscope[c].astype("category")

domain_W = inscope.groupby(row_domains, observed=True)[weight_col].sum()
domain_N = inscope.groupby(row_domains, observed=True)[weight_col].count()

group_cols = row_domains + [measure]
cell_agg = inscope.groupby(group_cols, observed=True).agg(
  W_cell=(weight_col, "sum"),
  n_cell=(weight_col, "count")
).reset_index()

# Merge domain totals
cell_agg = cell_agg.merge(
  domain_W.rename("W_row").reset_index(),
  on=row_domains, how="left"
)
cell_agg = cell_agg.merge(
  domain_N.rename("n_row").reset_index(),
  on=row_domains, how="left"
)
cell_agg["p_hat"] = cell_agg["W_cell"] / cell_agg["W_row"]

# Then loop only for Taylor variance (which needs record-level data)
# but now you skip empty combos and avoid df.copy()&lt;/PRE&gt;
&lt;P&gt;This approach computes the point estimates (p_hat) in one vectorized pass. You only need the record-level loop for the Taylor variance computation, which still needs individual "u" values per stratum.&lt;/P&gt;
&lt;P&gt;A NOTE ON NUMERICAL PRECISION&lt;/P&gt;
&lt;P&gt;You mentioned that standard errors change slightly when you try to optimize. This is almost always caused by floating-point summation order differences. If you switch from a Python loop to vectorized operations, the order in which values are summed may change slightly. To verify correctness:&lt;/P&gt;
&lt;P&gt;1. Compare results on a small test dataset where you can verify by hand.&lt;BR /&gt;
2. Check that differences are within floating-point tolerance (typically 1e-10 or smaller).&lt;BR /&gt;
3. If you need exact reproducibility with SAS output, keep the loop-based variance computation but apply the other optimizations (removing df.copy(), pre-computing domain totals).&lt;/P&gt;
&lt;P&gt;REFERENCE DOCUMENTATION&lt;/P&gt;
&lt;P&gt;- Apache Arrow optimization for pandas conversion: &lt;A href="https://docs.databricks.com/en/pandas/pandas-function-apis.html" target="_blank"&gt;https://docs.databricks.com/en/pandas/pandas-function-apis.html&lt;/A&gt;&lt;BR /&gt;
- Performance tuning on Databricks: &lt;A href="https://docs.databricks.com/en/optimizations/index.html" target="_blank"&gt;https://docs.databricks.com/en/optimizations/index.html&lt;/A&gt;&lt;BR /&gt;
- pandas best practices for performance: &lt;A href="https://pandas.pydata.org/docs/user_guide/enhancingperf.html" target="_blank"&gt;https://pandas.pydata.org/docs/user_guide/enhancingperf.html&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;* This reply used an agent system I built to research and draft this response based on the wide set of documentation I have available and previous memory. I personally review the draft for any obvious issues and for monitoring system reliability and update it when I detect any drift, but there is still a small chance that something is inaccurate, especially if you are experimenting with brand new features.&lt;/P&gt;
&lt;P&gt;If this answer resolves your question, could you mark it as "Accept as Solution"? That helps other users quickly find the correct fix.&lt;/P&gt;</description>
      <pubDate>Mon, 09 Mar 2026 05:59:02 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/optimizing-my-databricks-code/m-p/150356#M53391</guid>
      <dc:creator>SteveOstrowski</dc:creator>
      <dc:date>2026-03-09T05:59:02Z</dc:date>
    </item>
    <item>
      <title>Re: optimizing my databricks code</title>
      <link>https://community.databricks.com/t5/data-engineering/optimizing-my-databricks-code/m-p/151101#M53581</link>
      <description>&lt;P&gt;Thank you!&lt;/P&gt;</description>
      <pubDate>Tue, 17 Mar 2026 01:11:22 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/optimizing-my-databricks-code/m-p/151101#M53581</guid>
      <dc:creator>Jake3</dc:creator>
      <dc:date>2026-03-17T01:11:22Z</dc:date>
    </item>
  </channel>
</rss>

