Get Started Discussions
Start your journey with Databricks by joining discussions on getting started guides, tutorials, and introductory topics. Connect with beginners and experts alike to kickstart your Databricks experience.

Can the community version of Databricks run model training examples?

Eric76
New Contributor

Hi, 

Newcomer here. I am experimenting with the community version of Databricks.

I wanted to run the notebook example provided here https://community.cloud.databricks.com/?o=6085264701896358#notebook/2691200955149229

It failed because it cannot import the data from 

/dbfs/databricks-datasets/wine-quality/winequality-white.csv.
Is there any way to work around this? Do I need a cloud account for this example?
1 REPLY

Kaniz_Fatma
Community Manager

Hi @Eric76,

Welcome to the Databricks community! Let’s address the issue you’re facing with importing data in your notebook.

The example notebook you’re trying to run relies on a dataset located at /dbfs/databricks-datasets/wine-quality/winequality-white.csv. This dataset contains information about white wine quality and is commonly used for regression or classification modeling.

Here are a few steps you can take to resolve the issue:

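  1. Mount the Data: If you’re using Databricks Community Edition, you might need to make the dataset accessible within your notebook. The /databricks-datasets folder is normally already present in DBFS, so try step 2 first; if the data still cannot be found, follow these steps: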

    • Click on the “Data” tab in the left sidebar.
    • Click “Add Data” and select “DBFS” as the source.
    • Enter the path /databricks-datasets/wine-quality/winequality-white.csv.
    • Choose a mount point (e.g., /mnt/wine-quality).
    • Click “Create Table” to create a table associated with the mounted data.
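  2. Read Data Using Spark: Instead of reading the CSV file through the local /dbfs/ path, use Spark to read the data. Spark addresses DBFS with a dbfs:/ path (without the /dbfs prefix), and the local /dbfs/ mount may not be available on Community Edition. This file uses semicolons as separators, so pass sep=';'. You can do this with the following code snippet in your notebook:

    # Read the semicolon-delimited CSV into a Spark DataFrame
    df = spark.read.csv('dbfs:/databricks-datasets/wine-quality/winequality-white.csv', sep=';', header=True, inferSchema=True)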
    
  3. Convert to Pandas DataFrame (Optional): If you prefer working with Pandas DataFrames, you can convert the Spark DataFrame to a Pandas DataFrame:

    # Convert Spark DataFrame to Pandas DataFrame
    df_pandas = df.toPandas()
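
If the example notebook refers to the file by its local-style /dbfs/ path in several places, a small helper can translate those paths into the dbfs:/ form that Spark readers expect. This is only a sketch: to_spark_path and load_wine_quality are hypothetical names made up for illustration, not part of any Databricks API, and the sep=";" setting assumes the file is semicolon-delimited.

```python
LOCAL_DBFS_PREFIX = "/dbfs/"


def to_spark_path(path: str) -> str:
    """Translate a local-style '/dbfs/...' path into a 'dbfs:/...' path.

    Hypothetical helper for illustration only.
    """
    if path.startswith("dbfs:/"):
        # Already in the form Spark expects; return unchanged.
        return path
    if path.startswith(LOCAL_DBFS_PREFIX):
        # Drop the local mount prefix and add the DBFS scheme.
        return "dbfs:/" + path[len(LOCAL_DBFS_PREFIX):]
    # Bare absolute path such as '/databricks-datasets/...'.
    return "dbfs:/" + path.lstrip("/")


def load_wine_quality(spark, local_style_path: str):
    """Read the CSV with Spark, then convert it to a Pandas DataFrame.

    Assumes `spark` is the SparkSession that a Databricks notebook provides
    and that the file is semicolon-delimited.
    """
    sdf = spark.read.csv(
        to_spark_path(local_style_path),
        sep=";",
        header=True,
        inferSchema=True,
    )
    return sdf.toPandas()
```

In a notebook you could then call load_wine_quality(spark, "/dbfs/databricks-datasets/wine-quality/winequality-white.csv") wherever the example reads the file directly from the /dbfs/ path.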
    

Remember that Databricks Community Edition provides limited resources, so you might encounter some limitations. If you’re planning to work extensively with Databricks, consider exploring the full Databricks platform, which offers additional features and scalability.

Feel free to try the above steps, and let me know if you need further assistance! 😊

 
