Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.
Data + AI Summit 2024 - Data Science & Machine Learning

Forum Posts

abd
by Contributor
  • 476 Views
  • 2 replies
  • 0 kudos

Error - Langchain to interact with a SQL database

I am using Databricks Community Edition to run LangChain against a SQL database in Databricks. I am following this link: Interact with SQL database - Databricks. But I am facing an issue on this line: db = SQLDatabase.from_databricks(catalog="samples", schema="ny...

Machine Learning
Connection
Database
langchain
sql
Latest Reply
KumaranT
New Contributor III
  • 0 kudos

Hi @abd, can you try upgrading the SQL driver?

  • 0 kudos
1 More Replies
espartaco
by New Contributor
  • 366 Views
  • 1 replies
  • 0 kudos

MLflow autologging is not registering my experiments

When training any ML model in a Databricks notebook, the model used to be saved automatically after calling model.fit(), but now I get this error: WARNING mlflow.utils.autologging_utils: Encountered unexpected error durin...

Latest Reply
KumaranT
New Contributor III
  • 0 kudos

Hi @espartaco, the error message shows that there's an issue with SSL certificate verification when trying to connect to the Azure storage endpoint. Check network and firewall configurations: You need to ensure that the network and firewall configuratio...

  • 0 kudos
fh
by New Contributor
  • 290 Views
  • 2 replies
  • 0 kudos

applyInPandas executed twice

Hi, I have a dataframe containing sales records over time for roughly 1000 different items, so each item has its own time series. The goal is to make predictions for each of these items. Since the behaviour of these items is very di...

Latest Reply
KumaranT
New Contributor III
  • 0 kudos

Hi @fh, to avoid this double execution, you can try using the concurrent.futures module in Python to parallelize the training of your models. This module provides a high-level interface for asynchronously executing callables.

  • 0 kudos
1 More Replies
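
A minimal sketch of the concurrent.futures suggestion above, assuming each item's history fits in driver memory. The function names, the sample records, and the mean-based "model" are hypothetical stand-ins for real per-item training, not code from the thread.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def train_item_model(item_id, sales):
    """Stand-in for real training: the 'forecast' is the mean of the history."""
    return item_id, mean(sales)


def train_all(records, max_workers=4):
    """Group (item_id, amount) records by item and train one model per item."""
    by_item = defaultdict(list)
    for item_id, amount in records:
        by_item[item_id].append(amount)

    # Submit one training task per item and collect the results.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(train_item_model, item_id, sales)
            for item_id, sales in by_item.items()
        ]
        return dict(future.result() for future in futures)


# Hypothetical sample data: two items with short sales histories.
records = [("a", 10.0), ("a", 14.0), ("b", 3.0)]
forecasts = train_all(records)
```

Threads are used here only to keep the sketch simple. For CPU-bound training, ProcessPoolExecutor from the same module may be a better fit, and either way the work runs on the driver rather than being distributed across the cluster.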
acdello
by New Contributor
  • 300 Views
  • 2 replies
  • 0 kudos

Databricks documentation for training a local LLM

I'm in the process of training a chatbot for my team to use to learn about Databricks and relevant tools quickly. Is there a place where I can easily (and legally) grab learning material in PDF or text?

Latest Reply
KumaranT
New Contributor III
  • 0 kudos

Hi @acdello, could you check whether this doc helps in the meantime?

  • 0 kudos
1 More Replies
chagoo
by New Contributor
  • 232 Views
  • 1 replies
  • 0 kudos

Error running BTYD model

I ran the model in April and it was OK, but today I need to run the model again and I get an error, so I cannot continue. I changed the penalizer_coef but it made no difference.
# fit a model with a larger penalizer coefficient
bgf_engagement = BetaGeoFitter(penalizer_coef=100...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @chagoo, to fix this, try lowering the penalizer coefficient, checking the data quality for anomalies, scaling the data, increasing the number of iterations, or experimenting with different initial parameters. These steps should help resolve the co...

  • 0 kudos
EijayK
by New Contributor
  • 360 Views
  • 1 replies
  • 0 kudos

Debugging using vscode & databricks connect

Hi all, I'm facing some difficulties when I use Databricks Connect to debug my ML solution. Long story short, I want to investigate a few variables after I've conducted training. With the debugger at hand, I can simply place a breakpoint on the line ...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @EijayK, Ensure that the package is installed on the cluster itself, which you can verify through the cluster's library installation logs. Additionally, make sure your cluster meets all Databricks Connect requirements, including proper configurati...

  • 0 kudos
Kjetil
by New Contributor III
  • 253 Views
  • 2 replies
  • 2 kudos

Feature Store - lookback_window does not work with primary keys of "date" type

I just discovered what I believe is a bug in Feature Store. The expected value (of the "value" column) is 'NULL' but the actual value is "a". If I instead change the format of the "date" column to timestamp (i.e. remove the .date() in the generation...

Latest Reply
Kjetil
New Contributor III
  • 2 kudos

Thank you for answering. Yes, that is also what I figured out. In other words the lookback_window argument only works when using timestamp format for the primary key. I cannot see that this behavior is described in the documentation.

  • 2 kudos
1 More Replies
yorabhir
by New Contributor III
  • 538 Views
  • 3 replies
  • 2 kudos

Resolved! How to search the run id of an experiment run created in another notebook?

Hello, I have created an experiment using with mlflow.start_run(run_name='experment_1'): in a notebook, say 'notebook_1'. In the 'Experiments' tab, if I click on 'notebook_1', I am able to see 'experiment_1'. Now I am trying to search the experiment in ...

Latest Reply
yorabhir
New Contributor III
  • 2 kudos

Thank you @atmcqueen , the solution is working.

  • 2 kudos
2 More Replies
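
One possible way to look up a run created in another notebook, sketched under assumptions (the accepted answer is not shown in this listing, so this is not necessarily the solution from the thread). It relies on mlflow.search_runs accepting a filter_string and on the run name being stored in the mlflow.runName tag. The helper names and the experiment path are hypothetical; notebook experiments are typically named after the notebook's workspace path.

```python
def run_name_filter(run_name: str) -> str:
    """Build an MLflow search filter that matches runs by their run name tag."""
    return f"tags.mlflow.runName = '{run_name}'"


def find_run_ids(experiment_path: str, run_name: str) -> list:
    """Return the IDs of all runs in the experiment that carry the given name."""
    import mlflow  # imported lazily so the filter helper works without MLflow

    runs = mlflow.search_runs(
        experiment_names=[experiment_path],
        filter_string=run_name_filter(run_name),
    )
    # By default search_runs returns a pandas DataFrame with a run_id column.
    return runs["run_id"].tolist()


# Hypothetical usage from a second notebook:
# run_ids = find_run_ids("/Users/someone@example.com/notebook_1", "experment_1")
```

Run names are not unique, so the result is a list; if several runs share a name, sorting by start time and taking the latest is a common choice.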
TSchmidt
by New Contributor
  • 338 Views
  • 1 replies
  • 0 kudos

large scale yolo inference

I have 50 million images sitting on S3. I have a YOLOv8 model trained with Ultralytics and want to run inference on those images. I suspect I should be running inference using MLflow, but I am confused about how. I don't need to track experiments/traini...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @TSchmidt, To efficiently run inference on your 50 million images stored in S3 using a trained YOLOv8 model from Ultralytics, start by downloading your model from S3 and loading it locally. Use the `boto3` library to list images in your S3 bucket ...

  • 0 kudos
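
A rough sketch of the listing step mentioned in the reply above, assuming boto3 is installed and credentials are configured. The function names, bucket, prefix, and batch size are hypothetical, and the model call itself is left out because the reply is truncated here.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def batched(keys: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most batch_size keys."""
    it = iter(keys)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def iter_image_keys(bucket: str, prefix: str = "") -> Iterator[str]:
    """Lazily list image object keys under an S3 prefix, page by page."""
    import boto3  # imported lazily so batched() can be used without boto3

    paginator = boto3.client("s3").get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if key.lower().endswith((".jpg", ".jpeg", ".png")):
                yield key


# Hypothetical usage: feed fixed-size batches of keys to the inference step.
# for batch in batched(iter_image_keys("my-bucket", "images/"), 256):
#     ...  # download the batch and run the model on it
```

Listing lazily and batching the keys avoids holding all 50 million keys in memory at once. On a cluster, the batches could instead be distributed across workers rather than processed in a single loop.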
tiho
by New Contributor
  • 1810 Views
  • 5 replies
  • 2 kudos

Vector Search Index Sync fails in Initializing

Vector Search Index Sync fails in Initializing. This index table was already up and running, and when I tried to sync it, it failed in Initializing. See the attached.  

tiho_0-1709733181256.png
Latest Reply
jnkthms
New Contributor III
  • 2 kudos

The issue for us was most likely that we used CPU compute for the deployed embedding model; switching to GPU (small) solved the issue.

  • 2 kudos
4 More Replies
jnkthms
by New Contributor III
  • 557 Views
  • 3 replies
  • 0 kudos

Resolved! Initializing Vector Search index Sync fails with Failed to resolve flow: '__online_index_view'

When setting up a vector search in Databricks using the bge_m3 (Version 1) embedding model available in the system.ai schema, the setup runs for 20 minutes or so and then fails. Querying the served embedding models from the browser works perfectly fine. ...

Latest Reply
jnkthms
New Contributor III
  • 0 kudos

The issue was most likely the use of CPU compute for the deployed model; switching to GPU (small) solved it.

  • 0 kudos
2 More Replies
ledsouza
by New Contributor
  • 243 Views
  • 1 replies
  • 0 kudos

Community Edition workspace not found

I was suddenly logged out of my account in the Community Edition. When I tried to log in again, I received this error message: "We were not able to find a Community Edition workspace with this email. Please login to accounts.cloud.databricks.com to find t...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @ledsouza, Thank you for contacting Databricks Community Discussion Forum.   Please note that for any issues related to the Databricks Community Edition product, you can find helpful resources here. If you encounter any difficulties beyond what's ...

  • 0 kudos
NaeemS
by New Contributor III
  • 2270 Views
  • 9 replies
  • 0 kudos

Feature Store Model Serving endpoint

Hi, I am trying to deploy my model, which was logged by featureStoreEngineering client, as a serving endpoint in Databricks. But I am facing the following error: The Databricks Lookup client from databricks-feature-lookup and Databricks Feature Store clie...

Latest Reply
robbe
New Contributor III
  • 0 kudos

Hi @damselfly20, unfortunately I can't help much with that as I've never worked with RAGs. Are you sure it's the same error though? @NaeemS's and my errors seem to be Java related and yours MLflow related.

  • 0 kudos
8 More Replies
RobinK
by Contributor
  • 501 Views
  • 2 replies
  • 1 kudos

Resolved! Vectorsearch ConnectionResetError Max retries exceeded

Hi, we are serving a Unity Catalog LangChain model with Databricks Model Serving. When I run the predict() function on the model in a notebook, I get the expected output. But when I query the served model, errors occur in the service logs: Error messag...

Latest Reply
RobinK
Contributor
  • 1 kudos

Downgrading langchain-community to version 0.2.4 solved my problem.

  • 1 kudos
1 More Replies
Kash
by Contributor III
  • 1273 Views
  • 2 replies
  • 1 kudos

Building a Data Quality pipeline with alerting

Hi there, my question is: how do we set up a data-quality pipeline with alerting? Background: We would like to set up a data-quality pipeline to ensure the data we collect each day is consistent and complete. We will use key metrics found in our bronze JS...

Latest Reply
joarobles
New Contributor III
  • 1 kudos

Hi Kash! I know it might be too late, but if you managed to create this by yourself and you are struggling to scale the solution, you could take a look at Rudol Data Quality. It covers pretty much everything you mentioned with a focus on enabling no...

  • 1 kudos
1 More Replies
