Data Engineering

How to register datasets for Detectron2

SarahDorich
New Contributor II

I'm trying to run a Detectron2 model in Databricks and cannot figure out how to register my train, val and test datasets. My datasets live in an Azure data lake. I have tried the following with no luck. Any help is appreciated.

1) Specifying full path to Azure:

path_to_data = "abfss://<>@<>.dfs.core.windows.net/recommender/house-detector-datasets"

from detectron2.data.datasets import register_coco_instances

register_coco_instances("house_train3", {}, f"{path_to_data}/train/instances_default.json", f"{path_to_data}/train")

2) Moving to temporary local storage first:

import os

os.mkdir("house-detector-datasets")

my_blob_folder = "abfss://<>@<>.dfs.core.windows.net/recommender/house-detector-datasets"

dbutils.fs.cp(my_blob_folder, "house-detector-datasets", recurse=True)

path_to_data = "house-detector-datasets"

register_coco_instances("house_train4", {}, f"{path_to_data}/train/instances_default.json", f"{path_to_data}/train")
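One possible reason approach 2 fails: on Databricks, `dbutils.fs` generally treats a path without a scheme as a DBFS path, while `os.mkdir` and Detectron2 work against the driver's local disk, so the copy and the later read may be looking in two different places. A minimal sketch, assuming that is the cause, is to give the copy an explicit `file:/` destination. The helper name `to_file_uri` and the `/tmp/house-detector-datasets` location are illustrative choices, not part of any library, and the `dbutils` call is left commented out because it only exists inside a Databricks notebook.

```python
import posixpath


def to_file_uri(local_path: str) -> str:
    """Build a file:/ URI for an absolute driver-local path (hypothetical helper)."""
    if not local_path.startswith("/"):
        raise ValueError(f"expected an absolute path, got {local_path!r}")
    # Normalise things like trailing slashes or '..' segments.
    return "file:" + posixpath.normpath(local_path)


# Assumed scratch location on the driver node.
local_dir = "/tmp/house-detector-datasets"
dest_uri = to_file_uri(local_dir)

# Inside a Databricks notebook, the copy and registration could then look like:
# dbutils.fs.cp(my_blob_folder, dest_uri, recurse=True)
# register_coco_instances(
#     "house_train4", {},
#     f"{local_dir}/train/instances_default.json",
#     f"{local_dir}/train",
# )
```

Registering with the plain `/tmp/...` path (not the `file:/` URI) keeps the path in the form that ordinary Python file APIs expect.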

3) Moving to dbfs first:

Same code as 2) except moving to dbfs:/tmp/.
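For approach 3, a `dbfs:/` URI is understood by `dbutils` and Spark but usually not by libraries that open files with ordinary local file APIs. On clusters where the DBFS FUSE mount is available, the same files are typically visible under `/dbfs/...`. The sketch below shows the path translation under that assumption; `dbfs_uri_to_fuse_path` is a hypothetical helper written for illustration.

```python
def dbfs_uri_to_fuse_path(uri: str) -> str:
    """Translate 'dbfs:/some/path' to '/dbfs/some/path' (hypothetical helper).

    Assumes the cluster exposes DBFS through the /dbfs FUSE mount.
    """
    prefix = "dbfs:/"
    if not uri.startswith(prefix):
        raise ValueError(f"not a dbfs:/ URI: {uri!r}")
    # Drop the scheme and any extra leading slashes, then re-root under /dbfs.
    return "/dbfs/" + uri[len(prefix):].lstrip("/")


path_to_data = dbfs_uri_to_fuse_path("dbfs:/tmp/house-detector-datasets")

# The registration would then use the local-style path:
# register_coco_instances(
#     "house_train5", {},
#     f"{path_to_data}/train/instances_default.json",
#     f"{path_to_data}/train",
# )
```

The dataset name `house_train5` is only a placeholder so the example does not clash with names registered earlier in the session.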

In all of these cases, I get an error when I try to access my registered datasets. For example, the code below fails with "No such file or directory":

from detectron2.data import DatasetCatalog, MetadataCatalog

my_dataset_train_metadata = MetadataCatalog.get("house_train3")
dataset_dicts = DatasetCatalog.get("house_train3")
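The error shows up at access time rather than at registration because `register_coco_instances` only records the paths; the annotation file is read later, when `DatasetCatalog.get` is called. A quick way to narrow the problem down is to check, with plain Python, whether both paths are visible on the local filesystem before registering. This is a minimal sketch; `find_missing_paths` is a hypothetical helper, not a Detectron2 function.

```python
import os


def find_missing_paths(json_file: str, image_root: str) -> list:
    """Return whichever of the two paths the local filesystem cannot see."""
    missing = []
    if not os.path.isfile(json_file):
        missing.append(json_file)
    if not os.path.isdir(image_root):
        missing.append(image_root)
    return missing


# Example usage with the variables from the question:
# missing = find_missing_paths(
#     f"{path_to_data}/train/instances_default.json",
#     f"{path_to_data}/train",
# )
# if missing:
#     print("Not visible to local file APIs:", missing)
```

If the helper reports a path as missing, Detectron2 will most likely fail on it too, whatever `dbutils.fs.ls` shows for the same location.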

3 REPLIES

matthews163
New Contributor II

mygiftcardsite Wrote:

I think this might help you

from detectron2.data.datasets import register_coco_instances
register_coco_instances("YourTrainDatasetName", {}, "path to train.json", "path to train image folder")
register_coco_instances("YourTestDatasetName", {}, "path to test.json", "path to test image folder")

Let me know if it works for you. I have trained Detectron2 using this.

I have trained using this as well, but not in Databricks (which is what I'm trying to get working). For some reason, the model cannot find the paths I'm specifying. What did the paths to your datasets look like?

Thurman
New Contributor II

Register your dataset. Optionally, register metadata for your dataset.
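For reference, the two steps in this reply correspond to `DatasetCatalog.register` and `MetadataCatalog.get(...).set(...)` in Detectron2. The sketch below is only an illustration: the dataset name `house_train_custom`, the `house` class, the image path, and the box values are all placeholders, and the Detectron2 import is done inside the function so the record-building part can be run without Detectron2 installed.

```python
def get_house_dicts():
    """Return records in Detectron2's standard dataset-dict format.

    All values here are placeholders for illustration, not real data.
    """
    return [
        {
            "file_name": "/tmp/house-detector-datasets/train/example.jpg",  # hypothetical path
            "image_id": 0,
            "height": 480,
            "width": 640,
            "annotations": [
                {
                    "bbox": [100.0, 120.0, 200.0, 150.0],  # x, y, width, height
                    "bbox_mode": 1,  # integer value of BoxMode.XYWH_ABS
                    "category_id": 0,
                }
            ],
        }
    ]


def register_house_dataset(name="house_train_custom"):
    """Register the loader function and, optionally, its metadata."""
    # Imported lazily so this module loads even where Detectron2 is absent.
    from detectron2.data import DatasetCatalog, MetadataCatalog

    DatasetCatalog.register(name, get_house_dicts)
    MetadataCatalog.get(name).set(thing_classes=["house"])  # assumed single class
```

As with `register_coco_instances`, the loader function only runs when the dataset is requested, so any `file_name` it returns still has to be readable from the local filesystem at that point.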
