Get Started Discussions
Start your journey with Databricks by joining discussions on getting started guides, tutorials, and introductory topics. Connect with beginners and experts alike to kickstart your Databricks experience.

Can the community version of Databricks run model training examples?

Eric76
New Contributor

Hi, 

Newcomer here. I am experimenting with the community version of Databricks.

I wanted to run the notebook example provided here https://community.cloud.databricks.com/?o=6085264701896358#notebook/2691200955149229

It failed because it could not import the data from

/dbfs/databricks-datasets/wine-quality/winequality-white.csv.

Is there any way to work around this? Do I need a cloud account to run this example?
1 REPLY

Kaniz_Fatma
Community Manager

Hi @Eric76,

Welcome to the Databricks community! Let’s address the issue you’re facing with importing data in your notebook.

The example notebook you’re trying to run relies on a dataset located at /dbfs/databricks-datasets/wine-quality/winequality-white.csv. This dataset contains information about white wine quality and is commonly used for regression or classification modeling.

Here are a few steps you can take to resolve the issue:

  1. Check That the Data Is Visible: The sample datasets under /databricks-datasets should already be available in DBFS, including on Databricks Community Edition, so you normally don't need to mount anything yourself. If you'd like to browse to the file or create a table from it in the UI (menu labels may differ slightly between versions):

    • Click on the “Data” tab in the left sidebar.
    • Click “Add Data” and select “DBFS” as the source.
    • Browse to /databricks-datasets/wine-quality/winequality-white.csv.
    • Click “Create Table” to create a table from the file.
  2. Read Data Using Spark: Instead of reading the CSV through the local /dbfs/ file path, which may not be available on Community Edition, use Spark with the DBFS path. Note that this file is semicolon-delimited, so set the separator explicitly:

    # Read the data into a Spark DataFrame (Spark uses dbfs:/ paths, not the local /dbfs/ form)
    df = spark.read.csv('dbfs:/databricks-datasets/wine-quality/winequality-white.csv', header=True, inferSchema=True, sep=';')
    
  3. Convert to Pandas DataFrame (Optional): If you prefer working with Pandas DataFrames, you can convert the Spark DataFrame to a Pandas DataFrame:

    # Convert Spark DataFrame to Pandas DataFrame
    df_pandas = df.toPandas()
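
A note on the two path styles, since they are easy to mix up: as far as I understand it, Spark APIs address DBFS with `dbfs:/...` paths, while the `/dbfs/...` form is a local-file mount intended for libraries such as pandas, and that mount may not be available on Community Edition. The sketch below shows the conversion between the two. `to_spark_path` is a hypothetical helper written for this reply, not part of any Databricks API, and the commented-out read assumes the file is semicolon-delimited.

```python
def to_spark_path(path: str) -> str:
    """Return a dbfs:/ URI for a DBFS path written in any common form.

    Hypothetical helper for illustration; not part of any Databricks API.
    """
    if path.startswith("dbfs:/"):
        # Already a Spark-style DBFS URI, so return it unchanged.
        return path
    if path.startswith("/dbfs/"):
        # Local-file form: drop the /dbfs prefix, keep the rest.
        path = path[len("/dbfs"):]
    return "dbfs:/" + path.lstrip("/")


local_path = "/dbfs/databricks-datasets/wine-quality/winequality-white.csv"
spark_path = to_spark_path(local_path)
print(spark_path)

# In a Databricks notebook (where `spark` is predefined), the result could
# then be passed to Spark; sep=";" assumes a semicolon-delimited file:
# df = spark.read.csv(spark_path, header=True, inferSchema=True, sep=";")
```

If the notebook you are following calls `pd.read_csv` on the `/dbfs/...` path, one workaround to try is reading with Spark as shown in step 2 and then calling `toPandas()` as in step 3.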
    

Remember that Databricks Community Edition provides limited resources, so you might encounter some limitations. If you’re planning to work extensively with Databricks, consider exploring the full Databricks platform, which offers additional features and scalability.

Feel free to try the above steps, and let me know if you need further assistance! 😊

 