Machine Learning

Forum Posts

Anand_Ladda
by Honored Contributor II
  • 1036 Views
  • 2 replies
  • 0 kudos

Resolved! How do I use the COPY INTO command to copy data into a Delta table? Looking for examples with a pre-defined schema

I've reviewed the COPY INTO docs here - https://docs.databricks.com/spark/latest/spark-sql/language-manual/delta-copy-into.html#examples but there's only one simple example. Looking for some additional examples that show loading data from CSV - with ...

Latest Reply
Anand_Ladda
Honored Contributor II
  • 0 kudos

Here's an example for a predefined schema. Using COPY INTO with a predefined table schema: the trick here is to CAST the CSV dataset into your desired schema in the SELECT statement of COPY INTO. Example below: %sql CREATE OR REPLACE TABLE copy_into_bronze_te...

1 More Replies
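The truncated reply above can be sketched as a complete statement. This is a minimal illustration, not the poster's actual code: the table name, storage path, and two-column schema below are all hypothetical.

```sql
-- Hypothetical table, path, and schema; adjust to your environment.
CREATE OR REPLACE TABLE copy_into_bronze_example (
  id INT,
  event_ts TIMESTAMP
);

COPY INTO copy_into_bronze_example
FROM (
  -- CAST each CSV column to the target type inside the SELECT,
  -- so the loaded data matches the predefined table schema.
  SELECT CAST(id AS INT) AS id,
         CAST(event_ts AS TIMESTAMP) AS event_ts
  FROM 'abfss://landing@storageaccount.dfs.core.windows.net/events/'
)
FILEFORMAT = CSV
FORMAT_OPTIONS ('header' = 'true');
```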
Abdurrahman
by New Contributor
  • 108 Views
  • 2 replies
  • 0 kudos

How to download a PyTorch model created via a notebook and saved in a folder?

I have created a PyTorch model using Databricks notebooks and saved it in a folder in the workspace. MLflow is not used. When I try to download the files from the folder, it exceeds the download limit. Is there a way to download the model locally into my s...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @Abdurrahman, if you know the direct URL of the pretrained PyTorch model, you can use wget or a Python script to download it directly to your local system. For example, if you want to download the pretrained ResNet-18 model, you can use the follow...

1 More Replies
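The reply above suggests wget or a Python script for a direct download. A minimal Python sketch of that approach is below; the streaming loop avoids holding a large checkpoint in memory. Note this only covers models reachable via a direct URL (the torchvision URL in the comment is illustrative); files saved in a Databricks workspace folder would need to be exposed via DBFS or the workspace Files API first, which this sketch does not cover.

```python
import urllib.request

def download_file(url: str, dest: str, chunk_size: int = 1 << 20) -> int:
    """Stream a file from `url` to `dest` in chunks; return bytes written.

    Chunked reads keep memory flat even for multi-hundred-MB checkpoints.
    """
    written = 0
    with urllib.request.urlopen(url) as resp, open(dest, "wb") as out:
        while True:
            chunk = resp.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)
            written += len(chunk)
    return written

# Example (URL assumed, not verified here):
# download_file("https://download.pytorch.org/models/resnet18-f37072fd.pth",
#               "resnet18.pth")
```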
marcelo2108
by Contributor
  • 2350 Views
  • 23 replies
  • 0 kudos

Problem when serving a langchain model on Databricks

I'm trying to serve an LLM LangChain model and every time it fails with this message: [6b6448zjll] [2024-02-06 14:09:55 +0000] [1146] [INFO] Booting worker with pid: 1146 [6b6448zjll] An error occurred while loading the model. You haven't confi...

Latest Reply
BigNaN
Visitor
  • 0 kudos

I followed the example in dbdemos 02-Deploy-RAG-Chatbot to deploy a simple joke-generating chain, no RAG or anything. Querying the endpoint produced the error "You haven't configured the CLI yet!..." (screenshot 1). The solution was to add 2 environmen...

22 More Replies
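The fix described above (adding environment variables to the serving endpoint so the model's Databricks client is configured inside the container) can be sketched as an endpoint config payload. The model name, version, and exact payload schema below are assumptions; check the current Databricks serving API docs before relying on the field names.

```python
import os

def serving_endpoint_config(model_name: str, model_version: str) -> dict:
    """Build a served-model config dict that forwards workspace credentials
    to the model container via environment variables (as the reply suggests)."""
    return {
        "served_models": [
            {
                "model_name": model_name,
                "model_version": model_version,
                "workload_size": "Small",
                "scale_to_zero_enabled": True,
                # The two variables the reply adds so the LangChain model can
                # reach the workspace from inside the serving container:
                "environment_vars": {
                    "DATABRICKS_HOST": os.environ.get("DATABRICKS_HOST", ""),
                    "DATABRICKS_TOKEN": os.environ.get("DATABRICKS_TOKEN", ""),
                },
            }
        ]
    }
```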
BogdanV
by New Contributor III
  • 180 Views
  • 3 replies
  • 0 kudos

Resolved! Query ML Endpoint with R and Curl

I am trying to get a prediction by querying the ML endpoint on Azure Databricks with R. I'm not sure what the format of the expected data is. Is there any other problem with this code? Thanks!

(Attachment: R Code.png)
Latest Reply
BogdanV
New Contributor III
  • 0 kudos

Hi Kaniz, I was able to find the solution. You should post this in the examples shown when you click "Query Endpoint": you only have code for Browser, Curl, Python, and SQL, and should add a tab for R. Here is the solution:
library(httr)
url <- "https://adb-********...

2 More Replies
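For readers hitting the same "what format does the endpoint expect" question in other languages, here is a hedged Python sketch of the `dataframe_split` request shape commonly accepted by Databricks model-serving endpoints. The column names and the commented request call are illustrative; the actual schema depends on your model's signature.

```python
import json

def build_scoring_payload(columns, rows) -> str:
    """Serialize tabular input in the `dataframe_split` format."""
    return json.dumps({
        "dataframe_split": {
            "columns": list(columns),
            "data": [list(r) for r in rows],
        }
    })

# Sending it would look roughly like this (URL and token are placeholders):
# requests.post(url,
#               headers={"Authorization": f"Bearer {token}",
#                        "Content-Type": "application/json"},
#               data=build_scoring_payload(["f1", "f2"], [[1.0, 2.0]]))
```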
VJ3
by New Contributor III
  • 56 Views
  • 2 replies
  • 0 kudos

Security Controls to implement on Machine Learning Persona

Hello, hope everyone is doing well. You may be aware that we are using a Table ACL-enabled cluster to ensure adequate security controls on Databricks. You may also be aware that we cannot use a Table ACL-enabled cluster with the Machine Learning persona. ...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @VJ3, Databricks is a powerful platform that combines data engineering, machine learning, and business intelligence. When deploying Databricks in an enterprise environment, it’s crucial to establish robust security practices. Let’s focus on best ...

1 More Replies
G-M
by Contributor
  • 80 Views
  • 1 replies
  • 0 kudos

MLflow Experiments in Unity Catalog

Will MLflow Experiments be incorporated into Unity Catalog similar to models and feature tables? I feel like this is the final piece missing in a comprehensive Unity Catalog backed MLOps workflow. Currently it seems they can only be stored in a dbfs ...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @G-M,  While Models in Unity Catalog cover model registration and management, MLflow Experiments focus on experiment tracking, versioning, and metrics.Currently, MLflow Experiments are stored in a DBFS-backed location (Databricks File System), whi...

larsr
by New Contributor
  • 80 Views
  • 1 replies
  • 0 kudos

DBR CLI v0.216.0 failed to pass bundle variable for notebook task

After installing the new version of the CLI (v0.216.0), the bundle variable for the notebook task is not parsed correctly; see code below:
tasks:
  - task_key: notebook_task
    job_cluster_key: job_cluster
    notebook_task:
      ...

Machine Learning
asset bundles
Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @larsr,  Ensure that the variable ${var.notebook_path} is correctly defined and accessible within the context of your bundle configuration. Sometimes, scoping issues can lead to variable references not being resolved properly.

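A hedged sketch of how the `${var.notebook_path}` variable referenced in this thread might be declared and used in `databricks.yml`. The paths, job name, and target name are illustrative; check the asset-bundles documentation for the exact syntax supported by your CLI version.

```yaml
# Illustrative databricks.yml fragment (names and paths are hypothetical)
variables:
  notebook_path:
    description: Path to the task notebook
    default: /Workspace/Users/someone@example.com/my_notebook

targets:
  dev:
    resources:
      jobs:
        my_job:
          tasks:
            - task_key: notebook_task
              job_cluster_key: job_cluster
              notebook_task:
                notebook_path: ${var.notebook_path}
```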
johnp
by New Contributor II
  • 442 Views
  • 2 replies
  • 0 kudos

pdb debugger on Databricks

I am new to Databricks and am trying to debug my Python application with the variable explorer by following the instructions from https://www.databricks.com/blog/new-debugging-features-databricks-notebooks-variable-explorer. I added "import pdb" in the fi...

Latest Reply
johnp
New Contributor II
  • 0 kudos

I tested with some simple applications and it works as you described. However, the application I am debugging uses PySpark Structured Streaming, which runs continuously. After inserting pdb.set_trace(), the application paused at the breakpoint, but t...

1 More Replies
Sujitha
by Community Manager
  • 2854 Views
  • 15 replies
  • 10 kudos

Featured Member Interview - March 2023

Featured Member Interview - March 2023. Name: Ajay Pandey. Community nickname: @Ajay Pandey. Pronouns: He/Him. Company: Celebal Technologies PVT LTD, Jaipur, Rajasthan, India. Job Title: Associate Consultant - Data Engineer. Databricks Certifications: Databri...

Latest Reply
Priyag1
Honored Contributor II
  • 10 kudos

Congrats Ajay

14 More Replies
Octavian1
by Contributor
  • 92 Views
  • 2 replies
  • 0 kudos

port undefined error in SQLDatabase.from_databricks (langchain.sql_database)

The following assignment:
from langchain.sql_database import SQLDatabase
dbase = SQLDatabase.from_databricks(catalog=catalog, schema=db, host=host, api_token=token,)
fails with ValueError: invalid literal for int() with base 10: '' because of cls._assert_p...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @Octavian1, Ensure that the port parameter you’re passing to SQLDatabase.from_databricks is a valid integer. If it’s empty or contains non-numeric characters, that could be the root cause. In a Stack Overflow post, someone faced a similar issue wh...

1 More Replies
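The root cause above, `int('')` raising ValueError when the host string carries no port, can be reproduced and worked around in plain Python. The helper below is illustrative only and is not part of langchain; it shows the defaulting behavior a caller can apply before passing a host/port pair onward.

```python
def split_host_port(host: str, default_port: int = 443) -> tuple:
    """Split 'hostname:port' into (hostname, port), defaulting the port.

    An empty port segment ('hostname:') or no segment at all would otherwise
    reach int('') and raise: ValueError: invalid literal for int() with base 10: ''
    """
    name, sep, port = host.partition(":")
    if not sep or not port:
        return name, default_port
    return name, int(port)
```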
kng88
by New Contributor II
  • 1169 Views
  • 6 replies
  • 7 kudos

How to save a model produced by distributed training?

I am trying to save a model after distributed training via the following code:
import sys
from spark_tensorflow_distributor import MirroredStrategyRunner
import mlflow.keras
mlflow.keras.autolog()
mlflow.log_param("learning_rate", 0.001)
import...

Latest Reply
Xiaowei
New Contributor III
  • 7 kudos

I think I finally worked this out. Here is the extra code to save out the model only once, from the first node:
context = pyspark.BarrierTaskContext.get()
if context.partitionId() == 0:
    mlflow.keras.log_model(model, "mymodel")

5 More Replies
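The pattern in the reply, acting only on partition 0 so the model is logged exactly once rather than once per worker, can be sketched generically. The barrier-context call is replaced here by a plain partition-id argument for illustration; `save_once` is a hypothetical helper, not a Spark or MLflow API.

```python
def save_once(partition_id: int, save_fn) -> bool:
    """Run `save_fn` only on the first worker, mirroring the
    BarrierTaskContext.partitionId() == 0 check in the reply.
    Returns True when the save actually ran."""
    if partition_id == 0:
        save_fn()
        return True
    return False
```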
yorabhir
by New Contributor II
  • 153 Views
  • 1 replies
  • 1 kudos

Resolved! 'error_code': 'INVALID_PARAMETER_VALUE', 'message': 'Too many sources. It cannot be more than 100'

I am getting the following error while saving a Delta table in the Feature Store: WARNING databricks.feature_store._catalog_client_helper: Failed to record data sources in the catalog. Exception: {'error_code': 'INVALID_PARAMETER_VALUE', 'message': 'To...

Latest Reply
Kaniz
Community Manager
  • 1 kudos

Hi @yorabhir,  Verify how many sources you’re trying to record in the catalog. If it exceeds 100, you’ll need to reduce the number of sources.Ensure that the feature table creation process is correctly configured. In your code snippet, you’re creatin...

MaKarenina
by New Contributor
  • 254 Views
  • 1 replies
  • 0 kudos

Legacy MLflow Model Serving until January 2024

Hi! When I was creating a new endpoint I got this alert: CREATE A MODEL SERVING ENDPOINT TO SERVE YOUR MODEL BEHIND A REST API INTERFACE. YOU CAN STILL USE LEGACY MLFLOW MODEL SERVING UNTIL JANUARY 2024. I don't understand if my Legacy MLflow Model ...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @MaKarenina, The alert you received states that you can continue using Legacy MLflow Model Serving until January 2024. However, there are a few important points to consider: Support: After January 2024, Legacy MLflow Model Serving will no lon...

Alessandro
by New Contributor
  • 204 Views
  • 1 replies
  • 0 kudos

Using the OpenAI API in Databricks without iterating over rows

Hi everyone, I have a Delta table with a column 'comment'. I would like to add a new column 'sentiment', calculated using the OpenAI API. I already know how to create a Databricks endpoint to an external model and how to use it (us...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @Alessandro, Your question is clear, and I appreciate your curiosity about optimizing the process. Let’s explore a couple of approaches: UDF (User-Defined Function): You can create a UDF in Databricks that invokes the OpenAI API for sentiment...

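Complementing the UDF suggestion above, the per-row vs. batched trade-off can be sketched without Spark or the real OpenAI client. `classify_batch` below is a stand-in for a single API call that scores many comments at once; the real client, model name, and endpoint are assumptions outside this sketch.

```python
from typing import Callable, List

def add_sentiment(comments: List[str],
                  classify_batch: Callable[[List[str]], List[str]],
                  batch_size: int = 16) -> List[str]:
    """Score comments in batches so each API round-trip covers many rows,
    instead of issuing one request per row."""
    out: List[str] = []
    for i in range(0, len(comments), batch_size):
        out.extend(classify_batch(comments[i:i + batch_size]))
    return out
```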