We're dealing with this issue on our project in the following way:
- we define a config JSON file (it could also be YAML - that doesn't matter)
- now let's say you have a parameter with a really long value - for the sake of example, let's consider parameters related to the tables we want to load

So we can define our JSON config:
[
    {
        "really_important_table": {
            "table_name": "some_table_name",
            "source_file_format": "json",
            "data_lake_target_folder_name": "sample_target",
            "data_source_path": "source_path",
            "transform_function_name": "function_name",
            "autoloader_options": {
                "cloudFiles.resourceGroup": "rg_name"
            },
            "clean_bronze": false
        }
    },
    {
        "table2": {
            "table_name": "some_table_name2",
            "source_file_format": "json",
            "data_lake_target_folder_name": "folder_name",
            "data_source_path": "src_path",
            "transform_function_name": "transform_function",
            "autoloader_options": {
                "cloudFiles.resourceGroup": "rg_2"
            },
            "clean_bronze": false
        }
    }
]

(Note that JSON uses lowercase false, not Python's False.)
Now you need to define a Python module that reads the content of this config file and returns the config entry for a provided key.
So, for example, say you need to process the really_important_table config. Then in your workflow you just pass the really_important_table key, and in your notebook/code use your module to fetch the values associated with that key, as sketched below.
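
A minimal sketch of such a module, assuming the file is saved as config.json - the file name and the helper name get_table_config are just illustrative, adapt them to your project:

    import json

    def get_table_config(key, config_path="config.json"):
        """Return the config dict stored under `key`, e.g. "really_important_table"."""
        with open(config_path) as f:
            config = json.load(f)
        # the config is a list of single-key objects, so scan for the one holding our key
        for entry in config:
            if key in entry:
                return entry[key]
        raise KeyError(f"No config entry found for table key: {key}")

Then in your notebook/code:

    table_config = get_table_config("really_important_table")
    print(table_config["data_source_path"])  # -> "source_path"

In a Databricks workflow the key would typically arrive as a job parameter, e.g. read via dbutils.widgets.get in the notebook, and then passed straight into the helper.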