- 3408 Views
- 1 replies
- 1 kudos
Machine Learning Model Deployment on Databricks with Unity Catalog
Hi everyone! I found it would be helpful to document and share my experiences navigating model deployment on Databricks with the recent changes to deploy models to Unity Catalog instead of the Workspace Model Registry. https://medium.com/p/7d04e85395...
- 3408 Views
- 1 replies
- 1 kudos
- 1 kudos
Thanks for sharing this in-depth piece, @ac10 . Your walkthrough of model deployment via Unity Catalog is clear and practical—especially the insight about handling model signatures when working with Spark DataFrames. This will definitely help practit...
- 1 kudos
- 3395 Views
- 2 replies
- 0 kudos
Error when creating model env using 'virtualenv' with DBR 14.3
We were trying to inference from a logged model but had the following errorPreviously, we had been using `conda` as the environment manager, but that is no longer supported. I tried to update pyenv as some suggested but didn't get anywhere. Any insig...
- 3395 Views
- 2 replies
- 0 kudos
- 0 kudos
Hello @drjb1010 , This is a known issue with DBR 14.3 where the `virtualenv` environment manager fails because it depends on `pyenv` to install specific Python versions, but `pyenv` is either not installed or not properly configured in the runtime e...
- 0 kudos
- 3277 Views
- 1 replies
- 0 kudos
Patient Risk Score based on health history: Unable to create data folder for artifacts in S3 bucket
Hi All,we're using the below git project to build PoC on the concept of "Patient-Level Risk Scoring Based on Condition History": https://github.com/databricks-industry-solutions/hls-patient-riskI was able to import the solution into Databricks and ru...
- 3277 Views
- 1 replies
- 0 kudos
- 0 kudos
Greetings @SreeRam , here are some suggestions for you. Based on the error you're encountering with the hls-patient-risk solution accelerator, this is a common issue related to MLflow artifact access and storage configuration in Databricks. The probl...
- 0 kudos
- 3591 Views
- 1 replies
- 1 kudos
Table-Model Lineage for models without online Feature Lookups
Hi community,I am looking for the recommended way to achieve table-model lineage in Unity Catalog for models that don't use Feature Lookups but only offline features. When I use FeatureEngineeringClient.create_training_set with feature_lookups + mlfl...
- 3591 Views
- 1 replies
- 1 kudos
- 1 kudos
Hey @ssequ sorry this fell through the cracks but I have some ideas for you to consider. You can get Unity Catalog table→model lineage without Feature Lookups by logging the training datasets to MLflow and registering the model in Unity Catalog. ...
- 1 kudos
- 4214 Views
- 1 replies
- 0 kudos
AutoGluon MLflow integration
I am working on a personalized price package recommendation and implemented an AutoGluon code integrating it with MLflow.The code has been created in a modular fashion to be used by other team members. They just need to pass the data, target column a...
- 4214 Views
- 1 replies
- 0 kudos
- 0 kudos
Hi @cleversuresh Thanks for sharing the code and the context. Here are the core issues I see and how to fix them so MLflow logging works reliably on Databricks. What’s breaking MLflow logging in your code Your PyFunc wrapper loads the AutoGluon mod...
- 0 kudos
- 3494 Views
- 1 replies
- 0 kudos
AutoML master notebook failing
I have recently been able to run AutoML successfully on a certain dataset. But it has just failed on a second dataset of similar construction, before being able to produce any machine learning training runs or output. The Experiments page says```Mo...
- 3494 Views
- 1 replies
- 0 kudos
- 0 kudos
Hi @dkxxx-rc , Thanks for the detailed context. This error is almost certainly coming from AutoML’s internal handling of imbalanced data and sampling, not your dataset itself. The internal column _automl_sample_weight_0000 is created by AutoML when i...
- 0 kudos
- 1206 Views
- 7 replies
- 2 kudos
Resolved! What is the most efficient way of running sentence-transformers on a Spark DataFrame column?
We're trying to run the bundled sentence-transformers library from SBert in a notebook running Databricks ML 16.4 on an AWS g4dn.2xlarge [T4] instance.However, we're experiencing out of memory crashes and are wondering what the optimal to run sentenc...
- 1206 Views
- 7 replies
- 2 kudos
- 2 kudos
@excavator-matt I’d recommend a quick refresher on the Pandas API on Spark to understand the implementation details. This video breaks it down clearly: https://youtu.be/tdZDotqKtps?si=pcIzCUYs2s_TeQKx Hope this helps. — Louis
- 2 kudos
- 1522 Views
- 8 replies
- 1 kudos
Resolved! Importing sentence-transformers no longer works on Databricks runtime 17.2 ML
In Databricks Runtime 16.4 LTS for Machine Learning, I am used to be able to import sentence-transformers without any installation as it is part of the runtime with from sentence_transformers import SentenceTransformer.In this case I am running on a ...
- 1522 Views
- 8 replies
- 1 kudos
- 1 kudos
I now upgraded to the new 17.3 LTS ML and it now works. I didn't try 17.2 ML, but with 17.3 ML available, I don't see any reason to use it anymore.
- 1 kudos
- 4931 Views
- 6 replies
- 1 kudos
Resolved! Model Serving Endpoint Creation through API
Hello,I am trying to create a model serving endpoint via the API as explained here: https://docs.databricks.com/api/workspace/servingendpoints/createI created a trusted IAM role with access to DynamoDB for the feature store. I try to use this field,"...
- 4931 Views
- 6 replies
- 1 kudos
- 1 kudos
If you're using the databricks terraform provider, make sure the role's name matches the instance-profile name.If not, use the `iam_role_arn` attribute to explicitly set the role's arn when creating the databricks instance profileresource "databricks...
- 1 kudos
- 3665 Views
- 1 replies
- 0 kudos
AutoML "need to sample" not working as expected
tl; dr:When the AutoML run realizes it needs to do sampling because the driver / worker node memory is not enough to load / process the entire dataset, it fails. A sample weight column is NOT provided by me, but I believe somewhere in the process the...
- 3665 Views
- 1 replies
- 0 kudos
- 0 kudos
Hey @sangramraje , sorry for the late response. I wanted to check in to see if this is still an issue with the latest release? Please let me know. Cheers, Louis.
- 0 kudos
- 3777 Views
- 1 replies
- 1 kudos
AutoML Doesn't Work Due to Not being able to generate the EDA notebook
HiI'm trying run AutoML classification experiment with a dataset that I have made, and am experiencing this issue even after I have purposely downsampled my dataset before running it into the AutoML experiment. It appears that there is no way for me ...
- 3777 Views
- 1 replies
- 1 kudos
- 1 kudos
Hey @adoodsonruby , sorry this got lost in the shuffle. Have you tried again recently? I believe limits have been increased that would remove this impediment. Let us know, Louis.
- 1 kudos
- 3460 Views
- 1 replies
- 0 kudos
Error to create an endpoint of databricks with 2 primary keys online table
I have a delta table that has a primary key conformed by 2 fields (accountId,ruleModelVersionDesc) and I have also created an online table that has the same primary key, but when I create a feature spec to create an endpoint I get the following error...
- 3460 Views
- 1 replies
- 0 kudos
- 0 kudos
Hey @lchicoma , sorry for the delayed response. Thanks for sharing the error and context—this looks like a parsing issue in the feature specification rather than a problem with Delta or the runtime versions. What changed recently There was an inci...
- 0 kudos
- 997 Views
- 1 replies
- 0 kudos
🐞 Stuck on LightGBM Distributed Training in PySpark – Hanging After Socket Communication
My Setup:I'm trying to run distributed LightGBM training using synapseml.lightgbm.LightGBMRegressor in PySpark.Cluster Details:Spark version: 3.5.1 (compatible with PySpark 3.5.6)PySpark version: 3.5.6synapseml: v0.11.1 (latest)Spark Cluster: 3 Hetzn...
- 997 Views
- 1 replies
- 0 kudos
- 0 kudos
Hi @amanjethani , Thanks for laying out the setup and symptoms so clearly. The hang likely occurs because LightGBM’s distributed network either doesn’t fully form between executors or because the expected task count doesn’t match actual tasks, leadin...
- 0 kudos
- 3444 Views
- 1 replies
- 0 kudos
Can't query Legacy Serving Endpoint
Hi,I was able to deploy an endpoint using legacy serving (It's the only option we have to deploy endpoints in DB). Now I am having trouble querying the endpoint itself. When I try to query it I get the following error: Here is the code I am using ...
- 3444 Views
- 1 replies
- 0 kudos
- 0 kudos
Hey @semsim , sorry for the delayed response. Thanks for the screenshot—this pinpoints the problem. Root cause from the error Your model’s predict path is trying to create or write to /Workspace/Shared, and the serving container does not permit t...
- 0 kudos
- 4118 Views
- 1 replies
- 1 kudos
Multi-tenant recommendation system (Machine learning)
Hello,I am looking to build a multi-tenant machine learning recommender system in Azure Databricks. The idea is to have a single shared model, where each tenant can use the same model to train on their own unique dataset. Essentially, while the model...
- 4118 Views
- 1 replies
- 1 kudos
- 1 kudos
@Kasen , sorry for the delayed response. Here are some things to consider regarding your question. Azure Databricks is well-suited for a shared-architecture, tenant‑isolated recommender system. Below is a pragmatic blueprint, the isolation model o...
- 1 kudos
Join Us as a Local Community Builder!
Passionate about hosting events and connecting people? Help us grow a vibrant local community—sign up today to get started!
Sign Up Now-
Access control
3 -
Access Data
2 -
AccessKeyVault
1 -
ADB
2 -
Airflow
1 -
Amazon
2 -
Apache
1 -
Apache spark
3 -
APILimit
1 -
Artifacts
1 -
Audit
1 -
Autoloader
6 -
Autologging
2 -
Automation
2 -
Automl
39 -
Aws databricks
1 -
AWSSagemaker
1 -
Azure
32 -
Azure active directory
1 -
Azure blob storage
2 -
Azure data lake
1 -
Azure Data Lake Storage
3 -
Azure data lake store
1 -
Azure databricks
32 -
Azure event hub
1 -
Azure key vault
1 -
Azure sql database
1 -
Azure Storage
2 -
Azure synapse
1 -
Azure Unity Catalog
1 -
Azure vm
1 -
AzureML
2 -
Bar
1 -
Beta
1 -
Better Way
1 -
BI Integrations
1 -
BI Tool
1 -
Billing and Cost Management
1 -
Blob
1 -
Blog
1 -
Blog Post
1 -
Broadcast variable
1 -
Business Intelligence
1 -
CatalogDDL
1 -
Centralized Model Registry
1 -
Certification
2 -
Certification Badge
1 -
Change
1 -
Change Logs
1 -
Check
2 -
Classification Model
1 -
Cloud Storage
1 -
Cluster
10 -
Cluster policy
1 -
Cluster Start
1 -
Cluster Termination
2 -
Clustering
1 -
ClusterMemory
1 -
CNN HOF
1 -
Column names
1 -
Community Edition
1 -
Community Edition Password
1 -
Community Members
1 -
Company Email
1 -
Condition
1 -
Config
1 -
Configure
3 -
Confluent Cloud
1 -
Container
2 -
ContainerServices
1 -
Control Plane
1 -
ControlPlane
1 -
Copy
1 -
Copy into
2 -
CosmosDB
1 -
Courses
2 -
Csv files
1 -
Dashboards
1 -
Data
8 -
Data Engineer Associate
1 -
Data Engineer Certification
1 -
Data Explorer
1 -
Data Ingestion
2 -
Data Ingestion & connectivity
11 -
Data Quality
1 -
Data Quality Checks
1 -
Data Science & Engineering
2 -
databricks
5 -
Databricks Academy
3 -
Databricks Account
1 -
Databricks AutoML
9 -
Databricks Cluster
3 -
Databricks Community
5 -
Databricks community edition
4 -
Databricks connect
1 -
Databricks dbfs
1 -
Databricks Feature Store
1 -
Databricks Job
1 -
Databricks Lakehouse
1 -
Databricks Mlflow
4 -
Databricks Model
2 -
Databricks notebook
10 -
Databricks ODBC
1 -
Databricks Platform
1 -
Databricks Pyspark
1 -
Databricks Python Notebook
1 -
Databricks Runtime
9 -
Databricks SQL
8 -
Databricks SQL Permission Problems
1 -
Databricks Terraform
1 -
Databricks Training
2 -
Databricks Unity Catalog
1 -
Databricks V2
1 -
Databricks version
1 -
Databricks Workflow
2 -
Databricks Workflows
1 -
Databricks workspace
2 -
Databricks-connect
1 -
DatabricksContainer
1 -
DatabricksML
6 -
Dataframe
3 -
DataSharing
1 -
Datatype
1 -
DataVersioning
1 -
Date Column
1 -
Dateadd
1 -
DB Notebook
1 -
DB Runtime
1 -
DBFS
5 -
DBFS Rest Api
1 -
Dbt
1 -
Dbu
1 -
DDL
1 -
DDP
1 -
Dear Community
1 -
DecisionTree
1 -
Deep learning
4 -
Default Location
1 -
Delete
1 -
Delt Lake
4 -
Delta lake table
1 -
Delta Live
1 -
Delta Live Tables
6 -
Delta log
1 -
Delta Sharing
3 -
Delta-lake
1 -
Deploy
1 -
DESC
1 -
Details
1 -
Dev
1 -
Devops
1 -
Df
1 -
Different Notebook
1 -
Different Parameters
1 -
DimensionTables
1 -
Directory
3 -
Disable
1 -
Distribution
1 -
DLT
6 -
DLT Pipeline
3 -
Dolly
5 -
Dolly Demo
2 -
Download
2 -
EC2
1 -
Emr
2 -
Ensemble Models
1 -
Environment Variable
1 -
Epoch
1 -
Error handling
1 -
Error log
2 -
Eventhub
1 -
Example
1 -
Experiments
4 -
External Sources
1 -
Extract
1 -
Fact Tables
1 -
Failure
2 -
Feature Lookup
2 -
Feature Store
61 -
Feature Store API
2 -
Feature Store Table
1 -
Feature Table
6 -
Feature Tables
4 -
Features
2 -
FeatureStore
2 -
File Path
2 -
File Size
1 -
Fine Tune Spark Jobs
1 -
Forecasting
2 -
Forgot Password
2 -
Garbage Collection
1 -
Garbage Collection Optimization
1 -
Github
2 -
Github actions
2 -
Github Repo
2 -
Gitlab
1 -
GKE
1 -
Global Init Script
1 -
Global init scripts
4 -
Governance
1 -
Hi
1 -
Horovod
1 -
Html
1 -
Hyperopt
4 -
Hyperparameter Tuning
2 -
Iam
1 -
Image
3 -
Image Data
1 -
Inference Setup Error
1 -
INFORMATION
1 -
Input
1 -
Insert
1 -
Instance Profile
1 -
Int
2 -
Interactive cluster
1 -
Internal error
1 -
Invalid Type Code
1 -
IP
1 -
Ipython
1 -
Ipywidgets
1 -
JDBC Connections
1 -
Jira
1 -
Job
4 -
Job Parameters
1 -
Job Runs
1 -
Join
1 -
Jsonfile
1 -
Kafka consumer
1 -
Key Management
1 -
Kinesis
1 -
Lakehouse
1 -
Large Datasets
1 -
Latest Version
1 -
Learning
1 -
Limit
3 -
LLM
3 -
LLMs
2 -
Local computer
1 -
Local Machine
1 -
Log Model
2 -
Logging
1 -
Login
1 -
Logs
1 -
Long Time
2 -
Low Latency APIs
2 -
LTS ML
3 -
Machine
3 -
Machine Learning
24 -
Machine Learning Associate
1 -
Managed Table
1 -
Max Retries
1 -
Maximum Number
1 -
Medallion Architecture
1 -
Memory
3 -
Metadata
1 -
Metrics
3 -
Microsoft azure
1 -
ML Lifecycle
4 -
ML Model
4 -
ML Practioner
3 -
ML Runtime
1 -
MlFlow
75 -
MLflow API
5 -
MLflow Artifacts
2 -
MLflow Experiment
6 -
MLflow Experiments
3 -
Mlflow Model
10 -
Mlflow registry
3 -
Mlflow Run
1 -
Mlflow Server
5 -
MLFlow Tracking Server
3 -
MLModels
2 -
Model Deployment
4 -
Model Lifecycle
6 -
Model Loading
2 -
Model Monitoring
1 -
Model registry
5 -
Model Serving
14 -
Model Serving Cluster
2 -
Model Serving REST API
6 -
Model Training
2 -
Model Tuning
1 -
Models
8 -
Module
3 -
Modulenotfounderror
1 -
MongoDB
1 -
Mount Point
1 -
Mounts
1 -
Multi
1 -
Multiline
1 -
Multiple users
1 -
Nested
1 -
New Feature
1 -
New Features
1 -
New Workspace
1 -
Nlp
3 -
Note
1 -
Notebook
6 -
Notification
2 -
Object
3 -
Onboarding
1 -
Online Feature Store Table
1 -
OOM Error
1 -
Open Source MLflow
4 -
Optimization
2 -
Optimize Command
1 -
OSS
3 -
Overwatch
1 -
Overwrite
2 -
Packages
2 -
Pandas udf
4 -
Pandas_udf
1 -
Parallel
1 -
Parallel processing
1 -
Parallel Runs
1 -
Parallelism
1 -
Parameter
2 -
PARAMETER VALUE
2 -
Partner Academy
1 -
Pending State
2 -
Performance Tuning
1 -
Photon Engine
1 -
Pickle
1 -
Pickle Files
2 -
Pip
2 -
Points
1 -
Possible
1 -
Postgres
1 -
Pricing
2 -
Primary Key
1 -
Primary Key Constraint
1 -
Progress bar
2 -
Proven Practices
2 -
Public
2 -
Pymc3 Models
2 -
PyPI
1 -
Pyspark
6 -
Python
21 -
Python API
1 -
Python Code
1 -
Python Function
3 -
Python Libraries
1 -
Python Packages
1 -
Python Project
1 -
Pytorch
3 -
Reading-excel
2 -
Redis
2 -
Region
1 -
Remote RPC Client
1 -
RESTAPI
1 -
Result
1 -
Runtime update
1 -
Sagemaker
1 -
Salesforce
1 -
SAP
1 -
Scalability
1 -
Scalable Machine
2 -
Schema evolution
1 -
Script
1 -
Search
1 -
Security
2 -
Security Exception
1 -
Self Service Notebooks
1 -
Server
1 -
Serverless
1 -
Serving
1 -
Shap
2 -
Size
1 -
Sklearn
1 -
Slow
1 -
Small Scale Experimentation
1 -
Source Table
1 -
Spark config
1 -
Spark connector
1 -
Spark Error
1 -
Spark MLlib
2 -
Spark Pandas Api
1 -
Spark ui
1 -
Spark Version
2 -
Spark-submit
1 -
SparkML Models
2 -
Sparknlp
3 -
Spot
1 -
SQL
19 -
SQL Editor
1 -
SQL Queries
1 -
SQL Visualizations
1 -
Stage failure
2 -
Storage
3 -
Stream
2 -
Stream Data
1 -
Structtype
1 -
Structured streaming
2 -
Study Material
1 -
Summit23
2 -
Support
1 -
Support Team
1 -
Synapse
1 -
Synapse ML
1 -
Table
4 -
Table access control
1 -
Tableau
1 -
Task
1 -
Temporary View
1 -
Tensor flow
1 -
Test
1 -
Timeseries
1 -
Timestamps
1 -
TODAY
1 -
Training
6 -
Transaction Log
1 -
Trying
1 -
Tuning
2 -
UAT
1 -
Ui
1 -
Unexpected Error
1 -
Unity Catalog
12 -
Use Case
2 -
Use cases
1 -
Uuid
1 -
Validate ML Model
2 -
Values
1 -
Variable
1 -
Vector
1 -
Versioncontrol
1 -
Visualization
2 -
Web App Azure Databricks
1 -
Weekly Release Notes
2 -
Whl
1 -
Worker Nodes
1 -
Workflow
2 -
Workflow Jobs
1 -
Workspace
2 -
Write
1 -
Writing
1 -
Z-ordering
1 -
Zorder
1
- « Previous
- Next »
| User | Count |
|---|---|
| 90 | |
| 39 | |
| 38 | |
| 25 | |
| 25 |