Sunday
I have some JSON files in a specific volume. When I try to search for them they don't appear, but when I query the volume using Python I am able to get them and read their content.
Any help?
Sunday
Hi @amirabedhiafi,
Unity Catalog volumes are a storage layer for files, so it's normal that you can read JSON files from /Volumes/... with Python or SQL, but not have those same files show up as searchable document content in the workspace search experience. Databricks documents file access in Unity Catalog volumes separately from workspace search.
The fact that the files are present and readable in the volume does not automatically mean their contents are indexed for search. JSON is absolutely supported as a file format in volumes and can be read programmatically, including with READ_FILES or standard Spark/Python workflows. But the built-in document-ingestion path for files in volumes, such as Knowledge Assistant, currently supports only txt, pdf, md, ppt/pptx, and doc/docx, which is why raw JSON files won't behave like indexed documents there.
If the goal is to make the JSON content searchable, the pattern is to first load the JSON into a Delta table, for example, using read_files(..., format => 'json'), and then, if needed, build a Databricks AI Search index on that table. AI Search indexes are created from Delta tables rather than directly from raw files in a volume.
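As a rough sketch of the first step (loading the JSON into a table), the snippet below only builds the SQL statement around read_files. The volume path and table name are hypothetical placeholders, and build_load_sql is an illustrative helper, not a Databricks API. It assumes you run the resulting statement in a Databricks notebook where `spark` is available.

```python
def build_load_sql(volume_path: str, table_name: str) -> str:
    """Build a CREATE TABLE AS SELECT statement that reads JSON files from a volume."""
    # Refuse paths containing quotes instead of guessing at SQL escaping rules.
    if "'" in volume_path:
        raise ValueError("volume_path must not contain single quotes")
    return (
        f"CREATE OR REPLACE TABLE {table_name} AS "
        f"SELECT * FROM read_files('{volume_path}', format => 'json')"
    )


# Hypothetical names: replace with your own catalog, schema, volume and table.
sql = build_load_sql("/Volumes/main/default/raw_json/", "main.default.events_json")
print(sql)

# In a Databricks notebook, where `spark` is predefined, you would then run:
# spark.sql(sql)
```

Once the table exists, it can serve as the source for an index, as described above.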
If this answer resolves your question, could you mark it as "Accept as Solution"? That helps other users quickly find the correct fix.
Sunday
Hi @amirabedhiafi ,
The workspace search bar simply doesn't crawl volume file contents or filenames - it's scoped to registered Unity Catalog metadata objects (tables, models) and notebook text.
Use Catalog Explorer to browse visually, or continue using Python to enumerate and read files programmatically. So, there's nothing broken - it's just working as designed.
If my answer was helpful, please consider marking it as the accepted solution.
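If you want a simple search of your own in the meantime, here is a minimal sketch that walks a folder and returns the JSON files whose name or content mentions a keyword. It uses only the Python standard library and assumes the volume is reachable as a normal path under /Volumes/... from your notebook. find_json_files is a hypothetical helper name, and the path in the usage comment is a placeholder.

```python
import json
from pathlib import Path


def find_json_files(root, keyword=""):
    """Return sorted paths of .json files under root whose name or content mentions keyword."""
    needle = keyword.lower()
    matches = []
    for path in sorted(Path(root).rglob("*.json")):
        try:
            content = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            continue  # skip unreadable or malformed files
        # Search the file name plus the re-serialized JSON, case-insensitively.
        haystack = (path.name + " " + json.dumps(content, ensure_ascii=False)).lower()
        if needle in haystack:
            matches.append(str(path))
    return matches


# Example call with a placeholder volume path:
# find_json_files("/Volumes/<catalog>/<schema>/<volume>", "invoice")
```

This reads every file on each call, so it is only reasonable for a modest number of files.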
Sunday
Hi @amirabedhiafi ,
Catalog Explorer search won't return these files. This is likely because raw files in Volumes can change rapidly and aren't tracked in the system tables in the same way structured data is.
Instead, I would suggest using a Genie Space for this. Check out the example below: