04-26-2023 07:35 AM
Hey,
We have an issue: we can access the SQL files whenever the notebook is in the repo path, but once the CI/CD pipeline imports the repo notebooks and SQL files into the shared workspace, we can list the SQL files but cannot read them.
We changed the file permissions and are still getting the same error:
OSError: [Errno 95] Operation not supported
Thanks
04-27-2023 05:36 AM
@Nermin Yehia Looks like there was a small confusion; I thought the SQL files were data files. Ignore my previous comment. Did you get a chance to check the article below, which shows the configuration needed to run a Databricks notebook from ADF?
https://learn.microsoft.com/en-us/azure/data-factory/solution-template-databricks-notebook
Whatever path you add in the ADF-level pipeline config is what gets picked up, so can you please double-check how that config has been set?
04-27-2023 05:50 AM
@karthik p Thank you so much.
There is no issue with ADF running the notebook.
The main issue is that we can access the SQL files from /Workspace/Repos but cannot open a file once it is imported into /Workspace/Shared. It seems like a bug in the workspace, but we don't know how to fix it.
Thanks
04-27-2023 07:04 AM
@Nermin Yehia Can you post the command you are using to read the data? Also, somewhere in the pipeline, are you passing the /Workspace/Shared path as the destination for your files? If so, please change that to your own folder under /Users and try the read again. From a governance point of view, placing files in Shared can be a security concern: if the files only need to be visible to you, they should live under your /Users folder; if they need to be accessed by everyone, then Shared is appropriate.
04-27-2023 07:08 AM
here are the commands
04-27-2023 07:28 AM
@Nermin Yehia Can you use this command to copy your files out of Shared (note that dbutils.fs.cp does not expand wildcards, so copy the directory with recurse=True):
dbutils.fs.cp("file:/Workspace/Shared/.....", "dbfs:/tmp/sqls/", recurse=True)
and to read, try:
display(dbutils.fs.ls("dbfs:/tmp/sqls/"))
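As a side note, workspace paths are also exposed through the local file API, so plain Python file I/O can list and read the files without dbutils. The directory below is a temporary stand-in created purely for illustration; on a cluster you would point `sql_dir` at your real path instead.

```python
import os
import tempfile

# Stand-in directory with one .sql file (hypothetical; on a cluster,
# replace this with your real path, e.g. under /Workspace/Shared).
sql_dir = tempfile.mkdtemp()
with open(os.path.join(sql_dir, "query.sql"), "w") as f:
    f.write("SELECT 1;")

# List and read every .sql file in the directory.
for name in sorted(os.listdir(sql_dir)):
    if name.endswith(".sql"):
        with open(os.path.join(sql_dir, name)) as fh:
            print(name, "->", fh.read())
```

This is the same local-file access path that fails with Errno 95 in the Shared folder, so it is mainly useful for confirming where reads do and do not work.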
04-27-2023 07:35 AM
@karthik p I tested the following: I uploaded the SQL files to /Users and it worked fine. So a notebook in the /Shared folder can access the SQL files in the /Users folder but cannot access them in the /Shared folder.
I used cp just to test; in the original code, I'm using dbutils.fs.ls("file:/ the sql path")
04-27-2023 07:40 AM
@karthik p posted a file.
04-27-2023 07:43 AM
@karthik p
04-27-2023 08:36 AM
@Nermin Yehia Try this:
with open("/Workspace/Shared/test.sql") as queryFile:
    queryText = queryFile.read()
display(queryText)
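If a file holds several statements, a common follow-up is to split the text before executing each one. This is a plain-Python sketch with a stand-in string in place of the file contents; the naive split below does not handle semicolons inside string literals or comments.

```python
# Stand-in for the text returned by queryFile.read() above (hypothetical).
query_text = "SELECT 1;\nSELECT 2;"

# Naive split on ';' -- fine for simple files, not for ';' inside literals.
statements = [s.strip() for s in query_text.split(";") if s.strip()]
print(statements)
```

Each entry in `statements` could then be passed to spark.sql() one at a time.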
04-27-2023 08:42 AM
@karthik p This is the way I read the files, and it is what causes the error.
04-27-2023 11:03 AM
@karthik p How can I copy SQL files with their extensions?
The workspace CLI documentation says extensions are excluded.
04-27-2023 11:24 AM
@Nermin Yehia Looks like it's a bug. Can you try upgrading the Databricks CLI to the newest version, if you haven't already, and see if that helps? That way you won't have to resort to manual export/import.
04-27-2023 08:46 AM
@Nermin Yehia Looks like the .sql extension is missing from your file (problem_list_history_weekly is a .sql file, right?)
04-27-2023 08:53 AM
@karthik p Yes. The extension shows up in /Users (where reads work fine) but not in /Shared (where the read operation causes the error). We use the databricks workspace import_dir command in CI/CD to copy notebooks and SQL files from /Users to /Shared.
When I tried to find the reason the extension disappears, I found this in the docs:
https://docs.databricks.com/dev-tools/cli/workspace-cli.html