Loading Keras model from ADLS

crimson
New Contributor II

I want to load a Keras model directly from ADLS using an abfss:// path. As far as I know, this is not possible, and the only way to do it is to copy the model from ADLS to DBFS first.

I am concerned about data governance, as everybody could potentially modify or delete the model in DBFS. What would be the correct way to do this?

 

2 REPLIES 2

Kaniz
Community Manager

Hi @crimson,

  • As you mentioned, the straightforward approach is to copy the model from ADLS to the Databricks File System (DBFS) using dbutils.fs.cp.
  • While this method works, it does raise data-governance concerns: anyone with access to DBFS could potentially modify or delete the model.
  • To mitigate this risk, set up proper access controls and permissions on the DBFS directories where the model is stored, so that only authorized users can reach it.
  • Use Access Control Lists (ACLs) or Shared Access Signatures (SAS) on the ADLS side to restrict access further.
  • Regularly audit access logs to monitor for any unauthorized activity.
  • Keras doesn’t natively support loading models from abfss:// URIs, but you can work around this limitation:
    • Step 1: Copy the model from ADLS to a DBFS path using dbutils.fs.cp.
    • Step 2: Load the model from the local DBFS path using Keras.
  • Serialize your Keras model (architecture and weights) to a file format that Keras understands (e.g., HDF5 format).
  • Store the serialized model in ADLS.
  • When needed, copy the serialized model from ADLS and load it into Databricks.
  • Prioritize data governance and access controls to safeguard your Keras models effectively. 🛡🔒
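The copy-then-load steps above can be sketched as a small helper. This is a minimal sketch, assuming a Databricks notebook where `dbutils` is available as a built-in and TensorFlow/Keras is installed on the cluster; the `abfss://` URI, storage account, and `dbfs:/tmp/model.h5` staging path are illustrative placeholders, not fixed names.

```python
# Sketch: copy a serialized Keras model (e.g., HDF5) from ADLS to DBFS,
# then load it with Keras. Assumes a Databricks notebook environment.

def dbfs_to_local(dbfs_path: str) -> str:
    """Translate a dbfs:/ URI into the /dbfs FUSE mount path that
    local file APIs (including Keras) can read directly."""
    if dbfs_path.startswith("dbfs:/"):
        return "/dbfs/" + dbfs_path[len("dbfs:/"):]
    return dbfs_path

def load_model_from_adls(abfss_path: str, dbfs_tmp: str = "dbfs:/tmp/model.h5"):
    """Step 1: copy the model file from ADLS to a DBFS staging path.
    Step 2: load it from the local FUSE path with Keras."""
    dbutils.fs.cp(abfss_path, dbfs_tmp)   # dbutils exists only inside Databricks
    from tensorflow import keras          # lazy import; provided by the cluster runtime
    return keras.models.load_model(dbfs_to_local(dbfs_tmp))

# Example call (all names below are placeholders):
# model = load_model_from_adls(
#     "abfss://models@mystorageaccount.dfs.core.windows.net/keras/model.h5"
# )
```

Restricting write access to the staging directory (rather than a world-writable `/tmp`) is what keeps the governance concern in check once the file lands in DBFS.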

crimson
New Contributor II

Hi @Kaniz,

Can you please elaborate on these or share related documentation?

  • Use Access Control Lists (ACLs) or Shared Access Signatures (SAS) to restrict access further.
    Is it possible to use ACLs to control DBFS access?
  • Serialize your Keras model (architecture and weights) to a file format that Keras understands (e.g., HDF5 format).
    Any example you can provide?
  • When needed, deserialize the model from ADLS and load it into Databricks.
    This would cause the same issue, wouldn't it?

Thank you