
Databricks (AWS) to Snowflake connection

RameshChejarla
New Contributor III

Hi everyone,

I have implemented Auto Loader and it is working as expected. I need to track the files that are loaded into the stage table.

Here is the issue: the file tracking table needs to be created in Snowflake, and that is where I need to track the files from.

How do I connect Databricks and Snowflake? Please suggest.

 

 

 

3 REPLIES 3

BigRoux
Databricks Employee
To connect Databricks to Snowflake and set up a system for tracking files loaded into your Snowflake stage table, you can use the following approach:
  1. Configure Databricks Snowflake Connector:
    Databricks provides a built-in Snowflake connector. To set it up:
    • Add the Snowflake JDBC driver and Snowflake Spark connector library to your Databricks environment.
    • Set the connection options such as sfURL (Snowflake account URL), sfWarehouse (Snowflake warehouse), sfDatabase (database name), sfSchema (schema name), sfRole (optional, Snowflake role), and the authentication credentials (sfUser and sfPassword).
    Example connection code in Python:
```python
options = {
    "sfURL": "<Your Snowflake URL>",
    "sfWarehouse": "<Your Snowflake Warehouse>",
    "sfDatabase": "<Your Snowflake Database>",
    "sfSchema": "<Your Snowflake Schema>",
    "sfRole": "<Optional: Your Snowflake Role>",
    "sfUser": "<Your Username>",
    "sfPassword": "<Your Password>",
}

snowflake_df = (spark.read
    .format("snowflake")
    .options(**options)
    .option("dbtable", "<Your Snowflake Table>")
    .load())
```
    Similarly, you can use the .write method to write data into Snowflake once it has been processed and loaded by Auto Loader, as in the sketch below.
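    For illustration, a batch DataFrame can be written back to Snowflake with the same connection options (a minimal sketch; `processed_df` and the table name are placeholders, not from the original reply):
```python
# Write a (non-streaming) DataFrame to a Snowflake table using the same connection options
(processed_df.write
    .format("snowflake")
    .options(**options)
    .option("dbtable", "<Your Snowflake Table>")
    .mode("append")
    .save())
```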
  2. Set Up File Tracking Using Auto Loader:
    Auto Loader inherently tracks file ingestion progress by maintaining metadata in its checkpoint location, which ensures exactly-once ingestion and fault tolerance (that checkpoint state can also be queried directly; see the sketch after the example below). However, since you wish to track this file metadata in Snowflake:
    • You can capture the file metadata (such as file name, timestamp, and status) during ingestion via Auto Loader.
    • Write this metadata to a tracking table in Snowflake using the Snowflake connector.
    For example, while using Auto Loader (note that the Snowflake connector cannot be used as a streaming sink directly, so the tracking rows are written per micro-batch with foreachBatch):
```python
from pyspark.sql.functions import input_file_name

# Load files with Auto Loader and capture the source file name for tracking
df = (spark.readStream
      .format("cloudFiles")
      .option("cloudFiles.format", "csv")
      .option("cloudFiles.schemaLocation", "<Your Schema Location Path>")
      .load("<Your Cloud Storage Path>")
      .withColumn("source_file", input_file_name()))

# Write each micro-batch of file records and metadata to the Snowflake tracking table
def write_to_snowflake(batch_df, batch_id):
    (batch_df.write.format("snowflake").options(**options)
        .option("dbtable", "<Snowflake Tracking Table>").mode("append").save())

(df.writeStream
   .option("checkpointLocation", "<Your Checkpoint Path>")
   .foreachBatch(write_to_snowflake)
   .start())
```
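    As an aside, the files already recorded in the Auto Loader checkpoint mentioned above can also be inspected directly with the cloud_files_state table-valued function (a sketch, assuming a recent Databricks Runtime; the checkpoint path is a placeholder):
```python
# Query the Auto Loader checkpoint to list files it has already discovered and processed
checkpoint_files_df = spark.sql(
    "SELECT * FROM cloud_files_state('<Your Checkpoint Path>')"
)
display(checkpoint_files_df)
```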
  3. Advantages of Tracking File Metadata in Snowflake:
    By leveraging Snowflake's capabilities as the target system, you can:
    • Query file tracking data for auditing purposes.
    • Join the tracking table with the actual stage table to confirm that every file is accounted for and processed correctly (see the sketch below).
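    For instance, such a reconciliation could be run from Databricks through the same connector using its query option (a hedged sketch; both table names and the source_file column are assumptions for illustration):
```python
# Find files recorded in the tracking table that never made it into the stage table
missing_files_df = (spark.read
    .format("snowflake")
    .options(**options)
    .option("query", """
        SELECT t.source_file
        FROM <Snowflake Tracking Table> t
        LEFT JOIN <Your Snowflake Stage Table> s
          ON t.source_file = s.source_file
        WHERE s.source_file IS NULL
    """)
    .load())
display(missing_files_df)
```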
This process combines the incremental ingestion power of Auto Loader in Databricks with the centralized tracking and querying power of Snowflake.
 
Hope this helps.

I am trying to connect Snowflake to Databricks using a secret scope. I do not have a username and password for Snowflake.

Can you please suggest an approach for this?
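For reference, if a Snowflake credential (password, key, or token) has been stored in a Databricks secret scope, for example by a workspace admin, it can be read at runtime with dbutils.secrets.get instead of being hard-coded; a minimal sketch with hypothetical scope and key names:

```python
# Pull Snowflake credentials from a Databricks secret scope (scope/key names are hypothetical)
sf_user = dbutils.secrets.get(scope="snowflake-creds", key="sf_user")
sf_password = dbutils.secrets.get(scope="snowflake-creds", key="sf_password")

options = {
    "sfURL": "<Your Snowflake URL>",
    "sfWarehouse": "<Your Snowflake Warehouse>",
    "sfDatabase": "<Your Snowflake Database>",
    "sfSchema": "<Your Snowflake Schema>",
    "sfUser": sf_user,
    "sfPassword": sf_password,
}
```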

RameshChejarla
New Contributor III

Thanks for your response, will try and let you know.
