
Real-time output missing when using “Upload and Run File” from VS Code

tinodj
Visitor

I am running Python files on a Databricks cluster using the VS Code Databricks extension, specifically the “Upload and Run File” command.

I cannot get real-time output in the Debug Console. I have checked the official docs:

https://learn.microsoft.com/en-us/azure/databricks/dev-tools/vscode-ext/tutorial

https://github.com/databricks/databricks-vscode/blob/release-v2.10.3/packages/databricks-vscode/DATA...

but these do not address the issue.

The behavior I see is:

  1. While the script is running, no output is shown in the Debug Console.

  2. When the script finishes, all output appears at once.

  3. If the script fails with an exception, only the error is shown and none of the printed output appears.

This makes it very difficult to test and debug. If this is expected behavior, I would like to know what the recommended or best-practice workflow is for running and testing a standalone Python file on a Databricks cluster with live output.
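As a stopgap for point 3, here is a minimal sketch of one idea: since the error is the only thing that gets shown on failure, copy everything printed so far into the error message. This is plain Python and does not use any Databricks API. The names `Tee` and `run_with_captured_output` are hypothetical helpers made up for this illustration, and I am assuming (not verified on a cluster) that the full exception text is displayed.

```python
import contextlib
import io
import sys


class Tee(io.TextIOBase):
    """Hypothetical helper: forwards every write to several streams."""

    def __init__(self, *streams):
        self.streams = streams

    def write(self, text):
        for stream in self.streams:
            stream.write(text)
        return len(text)

    def flush(self):
        for stream in self.streams:
            stream.flush()


def run_with_captured_output(main):
    """Run main(); if it raises, re-raise with the printed output attached.

    Assumption: the error text is shown in full by whatever runs the file.
    """
    buffer = io.StringIO()
    # Keep writing to the real stdout while also recording a copy.
    tee = Tee(sys.stdout, buffer)
    try:
        with contextlib.redirect_stdout(tee):
            main()
    except Exception as exc:
        raise RuntimeError(
            f"{exc!r}\n--- captured stdout before failure ---\n"
            f"{buffer.getvalue()}"
        ) from exc
    return buffer.getvalue()
```

The script body would go into a function and be run as `run_with_captured_output(my_main)`. This only covers output written through Python's `sys.stdout`; it does nothing for the missing live output in points 1 and 2.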

Below is a minimal example based on Databricks’ own sample script. I added a loop with prints and sleeps to demonstrate the missing streaming output:

 

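import time

from pyspark.sql import SparkSession
from pyspark.sql.types import IntegerType, StringType, StructField, StructType

# Get the active Spark session, or create one.
spark = SparkSession.builder.getOrCreate()

# Schema for the sample customer table; no column is nullable.
schema = StructType([
    StructField('CustomerID', IntegerType(), False),
    StructField('FirstName', StringType(), False),
    StructField('LastName', StringType(), False)
])

data = [
    [1000, 'Matthijs', 'Oosterhout-Buntjes'],
    [1001, 'Joost', 'van Brunswijk'],
    [1002, 'Stan', 'Bokenkamp']
]

customers = spark.createDataFrame(data, schema)
customers.show()

# Print one line per second; I expect each line to appear as it is printed.
for i in range(100):
    print(f'{i}-Output')
    time.sleep(1)

# Fail on purpose to show that the printed output above is lost.
time.sleep(10)
raise Exception("Demo failure")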
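To help tell Python-side buffering apart from the output simply being delivered late, the print loop above could be swapped for a timestamped, explicitly flushed variant. This is only a diagnostic sketch: `log` and `demo_loop` are hypothetical helper names, and `flush=True` only affects buffering inside the Python process, so it may make no difference if the output is collected and returned after the run ends.

```python
import sys
import time
from datetime import datetime, timezone


def log(message, stream=None):
    """Print message with a UTC timestamp and flush immediately."""
    stream = sys.stdout if stream is None else stream
    # Millisecond precision: drop the last three digits of the microseconds.
    stamp = datetime.now(timezone.utc).strftime("%H:%M:%S.%f")[:-3]
    line = f"[{stamp}] {message}"
    print(line, file=stream, flush=True)
    return line


def demo_loop(count=100, delay=1.0):
    """Same shape as the loop above, but timestamped and flushed."""
    for i in range(count):
        log(f"{i}-Output")
        time.sleep(delay)
```

If the lines still arrive all at once but their timestamps are one second apart, the script itself ran as expected and only the delivery of the output was deferred.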

 

 
