Feature store feature table location

Direo
Contributor

Can Databricks feature tables be stored outside of DBFS?

3 REPLIES

Priyag1
Honored Contributor II

@Direo Feature tables are Delta tables, so they can be shared among different workspaces. Since you are asking about storage outside of DBFS, what exactly is your requirement, and where do you want to store them?
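
For context, a feature table is registered through the Feature Store client and is backed by a Delta table under the hood. A minimal sketch, assuming the `databricks-feature-store` client is available (the database, table, and column names below are hypothetical):

```python
from databricks.feature_store import FeatureStoreClient

fs = FeatureStoreClient()

# Compute some features as a Spark DataFrame with a primary key column
# (source table and columns are made up for illustration)
features_df = spark.table("raw.customers").selectExpr(
    "customer_id", "age", "total_spend"
)

# Registering the table creates a Delta table behind the scenes,
# which is why it can be shared and queried like any other Delta table
fs.create_table(
    name="feature_db.customer_features",
    primary_keys=["customer_id"],
    df=features_df,
    description="Example customer features",
)
```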

Anonymous
Not applicable

Hi @Direo Direo,

Does @Priyadarshini G's answer help? If it does, would you be happy to mark it as best? If it doesn't, please tell us so we can help you.

the-sab
New Contributor II

Yes, Databricks feature tables can be stored outside of Databricks File System (DBFS). You can store your feature tables in external storage systems such as Amazon S3, Azure Blob Storage, Azure Data Lake Storage, or Hadoop Distributed File System (HDFS).

To store your feature tables in external storage, you need to configure the storage system and provide the appropriate connection information when creating your Delta table. For example, when using Amazon S3, you would specify the S3 bucket path when creating the table.

Here's an example of how to create a Delta table in an Amazon S3 bucket using PySpark:

```python
from pyspark.sql import SparkSession

# Start a Spark session
spark = SparkSession.builder \
    .appName("Databricks Feature Table on S3") \
    .getOrCreate()

# Define a sample DataFrame
data = [("Alice", 34), ("Bob", 45), ("Cathy", 29)]
columns = ["Name", "Age"]
df = spark.createDataFrame(data, columns)

# Write the DataFrame to a Delta table in S3
delta_table_path = "s3a://your-bucket-name/your-delta-table-path/"
df.write.format("delta").mode("overwrite").save(delta_table_path)
```

Replace `your-bucket-name` and `your-delta-table-path` with the appropriate values for your Amazon S3 bucket and desired path. Note that you need to configure your S3 authentication and ensure that you have the necessary permissions to read and write to the specified bucket.
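
One common way to supply those credentials in a standalone Spark session is through the Hadoop s3a configuration keys, as in the sketch below. This is only an illustration; on Databricks itself, instance profiles or secret-backed cluster configuration are the usual approach, and the placeholder key values are assumptions you would replace:

```python
from pyspark.sql import SparkSession

# Sketch: pass S3 credentials via Hadoop s3a configuration keys.
# Prefer instance profiles or secrets on Databricks; avoid hard-coding keys.
spark = SparkSession.builder \
    .appName("Databricks Feature Table on S3") \
    .config("spark.hadoop.fs.s3a.access.key", "<your-access-key-id>") \
    .config("spark.hadoop.fs.s3a.secret.key", "<your-secret-access-key>") \
    .getOrCreate()

# With credentials configured, the Delta write above can target the bucket directly
df.write.format("delta").mode("overwrite").save("s3a://your-bucket-name/your-delta-table-path/")
```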
