How to make streaming files?
11-16-2025 02:37 PM
Thanks for reviewing my thread.
I am trying to test streaming tables with files in Databricks Free Edition.
-- Create test streaming table
CREATE OR REFRESH STREAMING TABLE user.demo.test_bronze_st AS
SELECT * FROM STREAM read_files('/Volumes/xxx_ws/demo/raw_files/test');
I created a test file via note upload (data upload in Catalog). The file was created, but it is not being treated as a streaming source.
The CREATE statement above fails,
but
CREATE OR REFRESH TABLE user.demo.test_bronze_st AS
SELECT * FROM read_files('/Volumes/xxx_ws/demo/raw_files/test');
works. The output window shows 0 rows created, but the catalog tells a different story: I can see the data in the user.demo.test_bronze_st table.
Is this the expected behavior?
How do I create a streaming source from input data files sitting in a folder on my local Windows machine?
Thanks.
- Labels: Spark
11-17-2025 12:06 AM
@RIDBX The Free Edition only allows access to serverless compute resources, and many advanced streaming features are not supported. For example, custom storage locations and online/streaming tables are explicitly noted as unsupported features in this tier.
https://docs.databricks.com/aws/en/getting-started/free-edition-limitations
In general, Databricks requires files to be present in cloud storage mounted to your workspace.
Alternative approaches:
- Simulate streaming by incrementally adding new files or data batches to a source folder or table, then re-running batch queries that consume only the new data.
- Upload files in small increments to a cloud-mounted directory (like DBFS), then use Auto Loader in batch mode to process newly added files during each job run. This mimics streaming ingestion.
- In notebooks, use loops or scheduled notebook jobs that periodically append new data files or rows to a Delta table, reading from that table each time new input arrives.
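To make the first approach concrete, here is a minimal sketch in plain Python of the "consume only the new data" idea. It is not Databricks-specific and not how Auto Loader is implemented: the function `process_new_files`, the `processed.json` ledger file, and the `batch_00N.csv` file names are all hypothetical, chosen for illustration. Each run lists the source folder, skips anything already recorded in the ledger, and records what it just handled.

```python
import json
import tempfile
from pathlib import Path


def process_new_files(source_dir: Path, ledger_path: Path) -> list[str]:
    """Return the CSV files not seen in earlier runs and record them in the ledger."""
    # The ledger is a JSON list of file names handled by previous runs (hypothetical format).
    seen = set(json.loads(ledger_path.read_text())) if ledger_path.exists() else set()
    new_files = sorted(p.name for p in source_dir.glob("*.csv") if p.name not in seen)
    for name in new_files:
        # Placeholder for the real batch work, e.g. appending the rows to a table.
        print(f"processing {name}")
    ledger_path.write_text(json.dumps(sorted(seen | set(new_files))))
    return new_files


# Demo in a throwaway directory: files arrive in two increments.
workdir = Path(tempfile.mkdtemp())
source = workdir / "raw_files"
source.mkdir()
ledger = workdir / "processed.json"

(source / "batch_001.csv").write_text("id,value\n1,a\n")
first_run = process_new_files(source, ledger)   # picks up batch_001.csv

(source / "batch_002.csv").write_text("id,value\n2,b\n")
second_run = process_new_files(source, ledger)  # picks up only batch_002.csv
third_run = process_new_files(source, ledger)   # nothing new, so nothing to do
```

In a real workspace the ledger would more likely be a Delta table or a streaming checkpoint, and the placeholder `print` would be replaced by the batch query that loads the file.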
11-17-2025 09:36 AM
Thanks for weighing in. Are you saying that
CREATE OR REFRESH STREAMING TABLE user.demo.test_bronze_st cannot be used in Free Edition?
If it can be used, how do I create STREAM read_files('/Volumes/xxx_ws/demo/raw_files/test.csv'),
where the .csv file is sitting on a local drive?
Thanks.