Administration & Architecture
Explore discussions on Databricks administration, deployment strategies, and architectural best practices. Connect with administrators and architects to optimize your Databricks environment for performance, scalability, and security.

College Course Use - Sharing Data With Students

Drew_Prof
New Contributor II

Hello Everyone,

I am creating a new college course on database design and SQL analytics and have decided to use Databricks as our platform in the course. We are going to be using the Free Edition so students do not need to pay for access. I'm wondering what solutions people have found for creating datasets and sharing them with students? From what I can tell, the Free Edition limits sharing directly via email and also limits Delta Shares. Is my only option to export to .csv files and then have the students create their own tables from the .csv files?

The same question goes for SQL editor scripts: I created some demos that I walked through in class, but I would like to share the editor files directly. Is that possible using the Free Edition? My current workaround is copying the SQL queries to a .txt file, and the students copy and paste from the .txt into their own SQL editor.

Hoping there might be some easier sharing opportunities that I'm missing in the Free Edition.

1 ACCEPTED SOLUTION

Louis_Frolio
Databricks Employee

One more point, and a quick win for course datasets on Free Edition.

Databricks Labs has a purpose-built synthetic data toolkit: dbldatagen (Databricks Labs Data Generator). It's open source and runs well on Free Edition with a simple notebook-scoped install.

  • Works out of the box on Databricks runtimes and Community/Free Edition via %pip (no special cluster libraries).

  • No extra Python dependencies beyond what supported Databricks runtimes already include.

  • You can expose the generated DataFrame as a view and consume it from other languages (SQL, Scala, R).

  • Comes with plug-in style standard datasets to jump-start common examples.

  • Supports multi-table generation with cross-references, which is perfect for relational concepts (foreign keys, dimensions/facts).

Copy/paste starter

%pip install dbldatagen

import dbldatagen as dg

dataspec = (
    dg.DataGenerator(spark, name="customers", rows=10_000)
      .withColumn("customer_id", "int", minValue=1, maxValue=10_000)
      .withColumn("name", "string", template=r"\w \w")
      .withColumn("email", "string", template=r"\w@\w.com")
      .withColumn("signup_date", "date", begin="2020-01-01", end="2024-12-31")
)

df = dataspec.build()
df.write.saveAsTable("customers")
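If you want the multi-table, cross-reference idea without installing anything, the same relational shape can be sketched in plain Python and then uploaded as CSVs. This is a hypothetical stdlib-only fallback, not part of dbldatagen; the table and column names are assumptions chosen to match the starter above.

```python
import csv
import random

def make_course_tables(n_customers=100, n_orders=500, seed=42):
    """Generate a customers table and an orders table whose customer_id
    values are valid foreign keys into customers (hypothetical helper)."""
    rng = random.Random(seed)
    customers = [
        {"customer_id": i, "name": f"customer_{i}", "email": f"user{i}@example.com"}
        for i in range(1, n_customers + 1)
    ]
    orders = [
        {
            "order_id": j,
            # FK: every order references an existing customer_id
            "customer_id": rng.randint(1, n_customers),
            "amount": round(rng.uniform(5, 500), 2),
        }
        for j in range(1, n_orders + 1)
    ]
    return customers, orders

def write_csv(path, rows):
    """Write a list of dicts to CSV with a header row."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=rows[0].keys())
        writer.writeheader()
        writer.writerows(rows)

customers, orders = make_course_tables()
write_csv("customers.csv", customers)
write_csv("orders.csv", orders)
```

The fixed seed keeps every student's generated data identical, which makes grading query results much easier.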

 

Cheers, Lou


3 REPLIES

Louis_Frolio
Databricks Employee

Hey @Drew_Prof,

Short answer: With Databricks Free Edition, you can't act as a Delta Sharing provider or use Marketplace to distribute data, and you don't have access to account-level sharing features. Instead, the most reliable path is to distribute files (CSV/Parquet) and have students load them into their own workspaces using Unity Catalog volumes; for SQL, share .sql files or notebooks via a public Git repo or simple file upload/import. This keeps each student within their own Free Edition workspace and avoids quota/contention issues.

Recommended patterns that work well for a class

Datasets (tables)

Option A - Distribute files; students load into their own volume (recommended)

  • You publish small-to-moderate CSV/Parquet files through your LMS or a public link (GitHub release, course site).
  • Students upload the files to a Unity Catalog volume in their own Free Edition workspace (Catalog > Volumes > Upload). Free Edition supports volumes; DBFS root is restricted.
  • Students create tables over those files. Example:

-- one-time setup
CREATE CATALOG IF NOT EXISTS workspace;
CREATE SCHEMA IF NOT EXISTS workspace.default;
CREATE VOLUME IF NOT EXISTS workspace.default.course_data;

-- after uploading orders.csv to the volume:
CREATE TABLE IF NOT EXISTS workspace.default.orders
USING CSV
OPTIONS (header true, inferSchema true)
LOCATION '/Volumes/workspace/default/course_data/orders.csv';

Option B - Provide a "bootstrap" notebook or SQL file

  • Ship a small notebook or .sql file that: (1) creates the volume, (2) gives students a step to upload files, (3) executes the CREATE TABLE commands. This minimizes copy/paste errors and standardizes table names.
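The bootstrap file itself can be generated from a list of dataset filenames so every student gets identical, typo-free table names. A minimal sketch of such a generator, assuming the workspace.default catalog/schema and course_data volume names used above (the helper names are hypothetical, not a Databricks API):

```python
# Hypothetical helper: emit a bootstrap .sql file so the volume and table
# names are standardized across the whole class.
BOOTSTRAP_HEADER = """\
-- one-time setup
CREATE CATALOG IF NOT EXISTS workspace;
CREATE SCHEMA IF NOT EXISTS workspace.default;
CREATE VOLUME IF NOT EXISTS workspace.default.course_data;
"""

def create_table_sql(table, filename,
                     volume="/Volumes/workspace/default/course_data"):
    """Emit a CREATE TABLE statement over a CSV uploaded to the volume."""
    return (
        f"CREATE TABLE IF NOT EXISTS workspace.default.{table}\n"
        f"USING CSV OPTIONS (header true, inferSchema true)\n"
        f"LOCATION '{volume}/{filename}';"
    )

def build_bootstrap(tables):
    """tables: dict mapping table name -> uploaded CSV filename."""
    stmts = [create_table_sql(t, f) for t, f in tables.items()]
    return BOOTSTRAP_HEADER + "\n" + "\n\n".join(stmts) + "\n"

script = build_bootstrap({"customers": "customers.csv", "orders": "orders.csv"})
# write it out and post the file to the LMS alongside the datasets:
# open("bootstrap.sql", "w").write(script)
```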

Notes

  • Favor Parquet where possible to cut file size and speed up loads (especially useful under small-warehouse limits).
  • Avoid relying on external HTTP downloads from within the workspace; Free Edition outbound access is allowlisted and may not include arbitrary hosts.

SQL editor scripts and teaching materials

Option C - Public Git repository

  • Put .sql files and notebooks in a public GitHub repo.
  • Students either:
    • Use Git folders (if enabled for their Free Edition workspace) to clone the repo; or
    • Download files from GitHub and use "Upload" in the Databricks Workspace or SQL Editor Files to import .sql files or notebooks.
      This is the simplest way to share SQL editor content without depending on workspace invites. (Git folders are generally available in Databricks; if they're not visible in a student's Free Edition workspace, file upload still works.)

Option D - Export/import notebooks (.dbc or source)

  • Export notebooks as .dbc or source files and post them to the LMS.
  • Students import via Workspace > Import; then they can open the SQL cells in the editor.

What not to rely on in Free Edition

  • Delta Sharing as a provider or Marketplace distribution: provider objects are created at the account/metastore layer and Free Edition does not expose the account console/APIs; Marketplace provider access is explicitly disallowed.
  • Single shared instructor workspace for the whole class: one tiny SQL warehouse plus fair-use quotas will bottleneck and may shut compute down for the day if exceeded.

If you really want "in-platform" collaboration

  • You can add a small number of collaborators to a single workspace and co-edit notebooks/SQL files in real time, but keep groups small and time-boxed to avoid quotas.
  • For larger cohorts, stick with each student's own Free Edition workspace + file/Git distribution.

Quick starter checklist you can reuse in your syllabus

  • Provide download links for datasets (CSV/Parquet) and a bootstrap SQL file/notebook that:
    1. Creates catalog/schema/volume
    2. Instructs students to upload files
    3. Runs CREATE TABLE … USING CSV/Parquet LOCATION '/Volumes/…'
  • Host all SQL editor examples in a public GitHub repo as .sql files; add a README with "Upload into SQL Editor Files" instructions.
  • Keep file sizes modest and table counts reasonable to respect Free Edition limits.
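To hand the whole bundle (datasets plus bootstrap file) to students as one LMS download, a short stdlib packaging script is enough. A sketch under assumed filenames (the placeholder files and bundle name below are illustrative, not from this thread):

```python
import zipfile
from pathlib import Path

def package_course_bundle(out_path, files):
    """Bundle dataset files plus the bootstrap SQL into one zip for the LMS."""
    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as z:
        for f in files:
            # arcname flattens directories so students see plain filenames
            z.write(f, arcname=Path(f).name)
    return out_path

# Demo with placeholder files (names are assumptions):
Path("orders.csv").write_text("order_id,amount\n1,9.99\n")
Path("bootstrap.sql").write_text("-- setup statements here\n")
bundle = package_course_bundle("course_week1.zip", ["orders.csv", "bootstrap.sql"])
```

One zip per course week keeps downloads small and maps cleanly onto the syllabus checklist above.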

Hope this helps, Louis.


Drew_Prof
New Contributor II

Excellent information, thank you!  The data generator was something I was not aware of so I will check that out.