Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

I am getting NoneType error when running a query from API on cluster

Nastia
New Contributor III

When I run a query on Databricks itself from a notebook, it runs fine and gives me results. But the same query, when executed from FastAPI (Python, using the databricks library), gives "TypeError: 'NoneType' object is not iterable".

I can see both queries (from the notebook and from the API) in the Spark UI list of jobs, and both of them are marked as SUCCESS.

Differences between the two jobs:

The API one doesn't have any stdout logs (the Spark UI gives the error "The requested Spark UI page does not exist."), while the notebook one has logs.

The API one has 3 stages in the DAG visualization, while the notebook one has 2 (the additional stage is mapPartitionsInternal).

The API one is sending the query to a "thriftserver-session-..." pool.
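For reference, the error reduces to iterating over a None result. A simplified stand-in (not my actual query code) that reproduces the exact message:

```python
# Simplified stand-in: the client call is apparently handing back
# None instead of a list of rows.
result = None

try:
    rows = [r for r in result]  # iterating None raises TypeError
except TypeError as exc:
    print(exc)  # 'NoneType' object is not iterable
```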

1 ACCEPTED SOLUTION


Kaniz_Fatma
Community Manager

Hi @Nastia, the "TypeError: 'NoneType' object is not iterable" error typically occurs when you try to iterate over a variable that has a value of None.

Let’s explore some possible solutions to address this issue:

  1. Check for None before Iterating: Ensure that the variable you’re trying to iterate over is not None. You can add a check like this before the loop:

    if my_list is not None:  # guard against a None result
        for item in my_list:
            ...  # your code logic here
    
  2. Verify the Query Execution: Since both queries are marked as SUCCESS in the Spark UI, the issue might not be with the query execution itself. Verify that the result set is actually returned to the FastAPI process, not just that the Spark job succeeded.

  3. Investigate the API Execution: Since the API query doesn’t have stdout logs, it’s challenging to diagnose the issue directly. Here are some steps you can take to investigate further:

    • Check if the API call is correctly passing the query to Databricks.
    • Verify that the API call is using the same configuration (e.g., cluster settings, authentication) as the notebook execution.
    • Inspect any relevant logs or error messages specific to the API execution.
  4. Pool Configuration: The fact that the API query is sent to the "thriftserver-session-..." pool might be relevant. Ensure that the pool configuration (e.g., concurrency, resources) is consistent between the notebook and API executions.

  5. Additional Stages in API Execution: The extra stage (mapPartitionsInternal) in the API execution could be a clue. Investigate what this stage does and whether it’s expected in your query.
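As a minimal sketch of the guard from point 1 applied to a fetch call (`fetch_rows` here is a hypothetical stand-in for whatever call returns your result set, not a Databricks API):

```python
def fetch_rows(succeeded: bool):
    """Hypothetical stand-in for the call returning the result set;
    returns None on the failure path, like the behaviour reported here."""
    return [("alice", 1), ("bob", 2)] if succeeded else None

# Treat a None result as an empty result set instead of iterating it directly.
rows = fetch_rows(succeeded=False) or []
for name, count in rows:
    print(name, count)  # skipped entirely when rows is empty

print(len(rows))  # 0
```

This keeps the endpoint from crashing, but the real fix is finding out why the call returns None in the API path in the first place.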

Remember to thoroughly review your API code and compare it with the notebook execution to identify any discrepancies. If you encounter any specific error messages or logs related to the API execution, please share them for further analysis. 😊

For more information, you can refer to the Databricks community forum, and explore general guidance for handling 'NoneType' errors in Python.

Feel free to provide more details or ask additional questions if needed! 🚀


