Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

Forum Posts

AlexH
by Visitor
  • 25 Views
  • 2 replies
  • 0 kudos

Offline Feature Store in Databricks Serving

Hi, I am planning to deploy a model (pyfunc) with Databricks Serving. During inference, my model needs to retrieve some data from Delta tables. I could turn these tables into an offline feature store as well. Latency is not so important. It doesn't matt...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 0 kudos

There is a ready feature engineering function for that:
# on a non-ML runtime, please install databricks-feature-engineering>=0.13.0a3
from databricks.feature_engineering import FeatureEngineeringClient
fe = FeatureEngineeringClient()
from databrick...

1 More Replies
JoaoPigozzo
by New Contributor II
  • 53 Views
  • 1 replies
  • 0 kudos

Best practices for structuring databricks workspaces for CI/CD and ML workflows

Hi everyone, I’m designing the CI/CD process for our environment, which is focused on machine learning and data science projects, and I’d like to understand what the best practices are regarding workspace organization, especially when using Unity Cat...

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

When designing a CI/CD process for Databricks environments — especially for machine learning and data science projects using Unity Catalog — enterprise-scale workspace organization should balance isolation, governance, and collaboration. The recommen...

VivekWV
by New Contributor
  • 83 Views
  • 2 replies
  • 0 kudos

Safe Update Strategy for Online Feature Store Without Endpoint Disruption

Hi Team,
We are implementing Databricks Online Feature Store using Lakebase architecture and have run into some constraints during development:
Requirements:
Deploy an offline table as a synced online table and create a feature spec that queries from th...

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

The recommended way to safely update an online Databricks Feature Store without breaking the serving endpoint or causing downtime involves a version-controlled, atomic update pattern that preserves schema consistency and endpoint stability. Key Issue...

1 More Replies
jeremy98
by Honored Contributor
  • 53 Views
  • 2 replies
  • 0 kudos

how to speed up inference?

Hi guys, I'm new to this concept, but we have several ML models that share the same code structure. What I don’t fully understand is how to handle different types of models efficiently. Right now, I need to loop through my items to get the ...

Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

Hi @jeremy98, I have not tried this, but you could try using Python's multiprocessing library to assign the inference for different models to different CPU cores. Also, here's a useful blog: https://docs.datab...
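As a rough illustration of that suggestion, here is a minimal sketch that fans out one scoring task per model to a pool of worker processes. It uses the standard library's `concurrent.futures` (a thin layer over multiprocessing). The names `predict_one`, `predict_all`, `double`, and `add_one` are hypothetical and not from the thread, and the "models" are toy functions standing in for real ones.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def predict_one(task):
    """Score the rows that belong to a single model; runs inside a worker."""
    name, model, rows = task
    return name, [model(row) for row in rows]


def predict_all(models, rows_by_model, executor_cls=ProcessPoolExecutor, max_workers=None):
    """Submit one task per model and collect {model_name: predictions}."""
    tasks = [(name, model, rows_by_model[name]) for name, model in models.items()]
    with executor_cls(max_workers=max_workers) as pool:
        # pool.map preserves task order, so results line up with the input models.
        return dict(pool.map(predict_one, tasks))


# Hypothetical stand-in "models": top-level functions, so they can be pickled
# and shipped to worker processes.
def double(x):
    return 2 * x


def add_one(x):
    return x + 1
```

With the default `ProcessPoolExecutor`, the models and their inputs must be picklable, and the call to `predict_all` should sit under an `if __name__ == "__main__":` guard when run as a script. Passing `executor_cls=ThreadPoolExecutor` swaps in threads, which may be enough when the underlying library releases the GIL during inference.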

1 More Replies
spearitchmeta
by Contributor
  • 58 Views
  • 1 replies
  • 1 kudos

How does Databricks AutoML handle null imputation for categorical features by default?

Hi everyone, I’m using Databricks AutoML (classification workflow) on Databricks Runtime 10.4 LTS ML+, and I’d like to clarify how missing (null) values are handled for categorical (string) columns by default. From the AutoML documentation, I see that:...

Latest Reply
Louis_Frolio
Databricks Employee
  • 1 kudos

Hello @spearitchmeta, I looked internally to see if I could help with this, and I found some information that will shed light on your question. Here’s how missing (null) values in categorical (string) columns are handled in Databricks AutoML on Dat...

AlbertWang
by Valued Contributor
  • 2620 Views
  • 1 replies
  • 1 kudos

Can I Replicate Azure Document Intelligence's Custom Table Extraction in Databricks?

I am using Azure Document Intelligence to get data from a table in a PDF file. The table's headers do not visually align with the values. Therefore, the standard and pre-built models cannot correctly read the data. I have built a custom-trained Azure ...

Latest Reply
dkushari
Databricks Employee
  • 1 kudos

Hi @AlbertWang, you can easily achieve this using Agent Bricks - Information Extraction. Your PDFs will be converted to text using the ai_parse_document function and saved in a Databricks table. You can then create the agent using that text table to ge...

MightyMasdo
by New Contributor III
  • 3132 Views
  • 3 replies
  • 7 kudos

Spark context not implemented Error when using Databricks connect

I am developing an application using Databricks Connect, and when I try to use VectorAssembler I get the assertion error "sc is not None". Is there a workaround for this?

Latest Reply
pibe1
New Contributor II
  • 7 kudos

Ran into exactly the same issue as @Łukasz1. After some googling, I found this SO post explaining the issue: later versions of Databricks Connect no longer support the SparkContext API. Our code is failing because the underlying library is trying to f...

2 More Replies
tarunnagar
by New Contributor II
  • 210 Views
  • 1 replies
  • 1 kudos

Best Practices for Collaborative Notebook Development in Databricks

Hi everyone! I’m looking to learn more about effective strategies for collaborative development in Databricks notebooks. Since notebooks are often used by multiple data scientists, analysts, and engineers, managing collaboration efficiently is critic...

Latest Reply
AbhaySingh
New Contributor II
  • 1 kudos

For version control, use this approach.
Git Integration with Databricks Repos
Core Features:
  • Databricks Git Folders (Repos) provides native Git integration with visual UI and REST API access
  • Supports all major providers: GitHub, GitLab, Azure DevOps, Bi...

gg5
by New Contributor II
  • 2195 Views
  • 4 replies
  • 2 kudos

Resolved! Unable to Access Delta View from Azure Machine Learning via Delta Sharing – Is View Access Supported

Unable to Access Delta View from Azure Machine Learning via Delta Sharing – Is View Access Supported? I am able to access the tables, but when accessing the view I get the error below. Response from server: { 'details': [ { '@type': 'type.googleapis...

Latest Reply
ericwang52
New Contributor II
  • 2 kudos

View sharing is supported (launched GA) in Databricks. See https://docs.databricks.com/aws/en/delta-sharing/create-share#add-views-to-a-share. You likely need a workspace id override. Creating the recipient from a workspace with proper access and res...

3 More Replies
juandados
by New Contributor
  • 228 Views
  • 1 replies
  • 0 kudos

GenAI experiment tracing does not render markdown images

When traces include base64 encoded images in Markdown, they do not render properly. This makes the analysis of traces including images difficult. Just for context, the same trace in other tracing tools like LangSmith renders as expected. An example of...

Latest Reply
sarahbhord
Databricks Employee
  • 0 kudos

Thank you for the flag, juandados! I will ping my product team to get a timeline for you.

ostae911
by New Contributor
  • 758 Views
  • 1 replies
  • 1 kudos

AutoML Forecast fails when using feature_store_lookups with timestamp key

We are running AutoML Forecast on Databricks Runtime 15.4 ML LTS and 16.4 ML LTS, using a time series dataset with temporal covariates from the Feature Store (e.g. a corona_dummy feature). We use feature_store_lookups with lookup_key and timestamp_lo...

Latest Reply
jamesl
Databricks Employee
  • 1 kudos

Hi @ostae911, are you still facing this issue? It looks like your usage of the timestamp column is correct. It can be used as a primary key on the time series feature table. Is it possible that there are other duplicate columns between the training ...
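One quick way to check for such duplicates is to compare the column names on both sides while ignoring the lookup keys. The helper below is a hypothetical sketch, not part of AutoML or the Feature Store client. It assumes you have already collected the column names as plain lists (for example from `df.columns`), and it compares names case-insensitively because Spark resolves column names that way by default.

```python
def overlapping_columns(training_cols, feature_cols, key_cols):
    """Return training columns that also exist in the feature table,
    excluding the lookup/timestamp keys that are expected on both sides."""
    keys = {c.lower() for c in key_cols}
    feature_names = {c.lower() for c in feature_cols}
    return [
        c for c in training_cols
        if c.lower() in feature_names and c.lower() not in keys
    ]


# Illustrative column names only; "corona_dummy" is the feature mentioned in the question.
clashes = overlapping_columns(
    training_cols=["store_id", "ts", "sales", "corona_dummy"],
    feature_cols=["store_id", "ts", "corona_dummy"],
    key_cols=["store_id", "ts"],
)
print(clashes)
```

Any name this returns appears in both the training data and the feature table without being a key, which is the kind of overlap the reply is asking about.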

prashant_089
by New Contributor II
  • 1417 Views
  • 3 replies
  • 1 kudos

Resolved! Serving Endpoint Disappears After One Day

I'm encountering an issue where a serving endpoint I create disappears from the list of serving endpoints after a day. This has happened both when I created the endpoint from the Databricks UI and using the Databricks SDK.

Latest Reply
Louis_Frolio
Databricks Employee
  • 1 kudos

Hey @prashant_089, what you are experiencing should not happen on its own except in some extremely outlying circumstances. IF YOU ARE USING Databricks Free Edition you should ignore everything below. Here are some troubleshooting suggestions/tips: ...

2 More Replies
AmineM
by New Contributor II
  • 2211 Views
  • 3 replies
  • 0 kudos

Resolved! Problem loading a pyfunc model in job run

Hi, I'm currently working on an automated job to predict forecasts using a notebook that works just fine when I run it manually but keeps failing when scheduled. Here is my code:
import mlflow
# Load model as a PyFuncModel.
loaded_model = mlflow.pyf...

Latest Reply
sarahbhord
Databricks Employee
  • 0 kudos

Hey AmineM! If your MLflow model loads fine in a Databricks notebook but fails in a scheduled job on serverless compute with an error like:
TypeError: code() argument 13 must be str, not int
the root cause is almost always a mismatch between the ...

2 More Replies
excavator-matt
by New Contributor III
  • 1048 Views
  • 4 replies
  • 2 kudos

Resolved! What is the most efficient way of running sentence-transformers on a Spark DataFrame column?

We're trying to run the bundled sentence-transformers library from SBert in a notebook running Databricks ML 16.4 on an AWS g4dn.2xlarge [T4] instance. However, we're experiencing out-of-memory crashes and are wondering what the optimal way to run sentenc...

Machine Learning
memory issues
sentence-transformers
vector embeddings
Latest Reply
jamesl
Databricks Employee
  • 2 kudos

If you didn't get this to work with Pandas API on Spark, you might also try importing and instantiating the SentenceTransformer model inside the pandas UDF for proper distributed execution. Each executor runs code independently, and when Spark execut...
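The load-inside-the-function idea can be sketched without Spark. The example below is an assumption-laden illustration: `FakeEncoder` is a hypothetical stand-in for a sentence-embedding model, and `get_model` and `embed_batches` are made-up names. In a real job, the body of `embed_batches` would live inside an iterator-style pandas UDF and `load_model` would construct the actual model.

```python
from typing import Callable, Iterable, Iterator, List

_MODEL = None  # one cached model per worker process


def get_model(load_model: Callable[[], object]):
    """Instantiate the model on first use and reuse it for later batches."""
    global _MODEL
    if _MODEL is None:
        _MODEL = load_model()
    return _MODEL


def embed_batches(
    batches: Iterable[List[str]], load_model: Callable[[], object]
) -> Iterator[List[List[float]]]:
    """Encode each batch of texts; the model is created here, on the worker,
    rather than on the driver and shipped over."""
    model = get_model(load_model)
    for batch in batches:
        yield model.encode(batch)


class FakeEncoder:
    """Hypothetical stand-in: 'embeds' each text as a one-element vector."""

    instances = 0  # counts how often the model gets constructed

    def __init__(self):
        FakeEncoder.instances += 1

    def encode(self, texts):
        return [[float(len(t))] for t in texts]
```

Because the model is created lazily and cached per process, nothing heavy has to be serialized from the driver, and each worker pays the load cost once rather than once per batch.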

3 More Replies
salesbrj
by New Contributor
  • 263 Views
  • 1 replies
  • 0 kudos

Inference Tables Empty

Hello, I have been using Databricks Free Platform for a while. Everything seems to work well. However, I've been trying to generate the payload from the deployed endpoint and I always get an empty inference table. When I check the configuration, I got ...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @salesbrj, most probably this is related to a limitation in Free Edition. In the limitations section I can see the following entry: "No custom models on GPU or batch inference". https://docs.databricks.com/aws/en/getting-started/free-edition-limitations
