
How to register datasets for Detectron2

SarahDorich
New Contributor II

I'm trying to run a Detectron2 model in Databricks and cannot figure out how to register my train, val and test datasets. My datasets live in an Azure data lake. I have tried the following with no luck. Any help is appreciated.

1) Specifying full path to Azure:

path_to_data = "abfss://<>@<>.dfs.core.windows.net/recommender/house-detector-datasets"

from detectron2.data.datasets import register_coco_instances

register_coco_instances("house_train3", {}, f"{path_to_data}/train/instances_default.json", f"{path_to_data}/train")

2) Moving to temporary local storage first:

import os

os.mkdir("house-detector-datasets")

my_blob_folder = "abfss://<>@<>.dfs.core.windows.net/recommender/house-detector-datasets"

dbutils.fs.cp(my_blob_folder, "house-detector-datasets", recurse=True)

path_to_data = "house-detector-datasets"

register_coco_instances("house_train4", {}, f"{path_to_data}/train/instances_default.json", f"{path_to_data}/train")

3) Moving to dbfs first:

Same code as 2) except moving to dbfs:/tmp/.

In all of these cases, I get an error when I try to access my registered datasets. For example, the code below fails with "No such file or directory":

my_dataset_train_metadata = MetadataCatalog.get("house_train3")

dataset_dicts = DatasetCatalog.get("house_train3")
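A possible explanation, offered as a sketch rather than a confirmed fix: `register_coco_instances` only stores the two paths, and the JSON file is first opened when `DatasetCatalog.get` runs, which matches the error appearing at access time. That load goes through ordinary Python file access, which understands neither `abfss://` nor `dbfs:/` URIs. In attempt 2, `dbutils.fs.cp` most likely resolved the relative destination against DBFS rather than the driver's working directory, so the folder created by `os.mkdir` stayed empty. The helper names below (`to_local_path`, `missing_paths`, `stage_and_register`) are hypothetical, and the code assumes a cluster where the `/dbfs` FUSE mount or the driver's local disk is reachable and where `dbutils` can already read the `abfss://` location.

```python
import os


def to_local_path(uri: str) -> str:
    """Translate a Databricks storage URI into a path plain Python file APIs can open.

    'dbfs:/tmp/data' -> '/dbfs/tmp/data'  (DBFS FUSE mount on the driver)
    'file:/tmp/data' -> '/tmp/data'       (driver-local disk)
    Any other URI with a scheme (for example 'abfss://...') is rejected.
    """
    if uri.startswith("dbfs:/"):
        return "/dbfs/" + uri[len("dbfs:/"):].lstrip("/")
    if uri.startswith("file:/"):
        return "/" + uri[len("file:/"):].lstrip("/")
    if "://" in uri:
        raise ValueError(f"{uri!r} is a remote URI; copy it to DBFS or local disk first")
    return uri


def missing_paths(json_file: str, image_root: str) -> list:
    """Return whichever of the two paths is not visible on the local filesystem."""
    return [p for p in (json_file, image_root) if not os.path.exists(p)]


def stage_and_register(dbutils, name: str, source_uri: str, staging_uri: str) -> None:
    """Copy a COCO-style dataset folder to staging storage, then register it.

    `dbutils` is the Databricks notebook utility object, passed in explicitly.
    The 'train/instances_default.json' layout follows the original post.
    """
    # Imported lazily so the helpers above work without Detectron2 installed.
    from detectron2.data.datasets import register_coco_instances

    dbutils.fs.cp(source_uri, staging_uri, recurse=True)
    root = to_local_path(staging_uri)
    json_file = os.path.join(root, "train", "instances_default.json")
    image_root = os.path.join(root, "train")

    # Fail at registration time instead of later, inside DatasetCatalog.get.
    missing = missing_paths(json_file, image_root)
    if missing:
        raise FileNotFoundError(f"Not visible to Python: {missing}")
    register_coco_instances(name, {}, json_file, image_root)
```

In a notebook this might be called as `stage_and_register(dbutils, "house_train", my_blob_folder, "file:/tmp/house-detector-datasets")`, or with `"dbfs:/tmp/house-detector-datasets"` as the staging location. Either way, the paths handed to Detectron2 are plain filesystem paths (`/tmp/...` or `/dbfs/tmp/...`), not URIs.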

3 REPLIES

matthews163
New Contributor II

mygiftcardsite Wrote:

I think this might help you

from detectron2.data.datasets import register_coco_instances
register_coco_instances("YourTrainDatasetName", {}, "path to train.json", "path to train image folder")
register_coco_instances("YourTestDatasetName", {}, "path to test.json", "path to test image folder")

Let me know if it works for you. I have trained Detectron2 using this.

I have trained using this as well, but not in Databricks (which is what I'm trying to get working). For some reason, the model cannot find the paths I'm specifying. What did the paths to your datasets look like?

Thurman
New Contributor II

Register your dataset. Optionally, register metadata for your dataset.
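For readers unsure what those two steps look like outside of `register_coco_instances`, here is a minimal sketch using `DatasetCatalog.register` and `MetadataCatalog.get`. The helper names (`build_record`, `make_loader`, `register_custom`) and the single `house` class with `category_id` 0 are assumptions made for illustration, not something taken from the thread.

```python
def build_record(image_id, file_name, height, width, boxes_xywh, bbox_mode):
    """Build one image entry in Detectron2's standard dataset-dict layout.

    `bbox_mode` should be a detectron2.structures.BoxMode value such as
    BoxMode.XYWH_ABS; it is passed in so this helper needs no Detectron2 import.
    Every box is assigned category_id 0 (a single hypothetical 'house' class).
    """
    return {
        "file_name": file_name,  # must be a path Python can open, e.g. /dbfs/...
        "image_id": image_id,
        "height": height,
        "width": width,
        "annotations": [
            {"bbox": list(box), "bbox_mode": bbox_mode, "category_id": 0}
            for box in boxes_xywh
        ],
    }


def make_loader(records):
    """Wrap records in a zero-argument function, as DatasetCatalog.register expects."""
    def load():
        return list(records)
    return load


def register_custom(name, loader, thing_classes):
    """Register the dataset, then attach class names as metadata."""
    # Imported lazily so the helpers above can be used without Detectron2.
    from detectron2.data import DatasetCatalog, MetadataCatalog

    DatasetCatalog.register(name, loader)
    MetadataCatalog.get(name).thing_classes = list(thing_classes)
```

With Detectron2 installed, usage would be along the lines of `register_custom("house_train", make_loader(records), ["house"])`, where each record was built with `bbox_mode=BoxMode.XYWH_ABS`. The loader runs lazily, so a bad `file_name` still only surfaces when the dataset is actually read.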
