Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

Forum Posts

isaac_gritz
by Databricks Employee
  • 7710 Views
  • 1 replies
  • 3 kudos

Resolved! Pricing on Databricks

How Pricing Works on Databricks: I highly recommend checking out this blog post on how Databricks pricing works from my colleague @MENDELSOHN CHAN. Databricks has a consumption-based pricing model, so you pay only for the compute you use. For interactive...

Latest Reply
Meag
New Contributor III
  • 3 kudos

I read the blog you shared; it helps. Thanks for sharing.

Santhanalakshmi
by New Contributor II
  • 5796 Views
  • 3 replies
  • 0 kudos

Throwing IndexoutofBound Exception in Pyspark

Hello All, I am trying to read the data and group it so that I can pass it to the predict function via the @F.pandas_udf method.
# Loading model
pkl_model = pickle.load(open(filepath, 'rb'))
# build schema for output labels
filter_schema = [] ...

Latest Reply
Vindhya
New Contributor II
  • 0 kudos

@Santhanalakshmi Manoharan Was this issue resolved? I am also getting the same error; any guidance would be of great help. Appreciate your help.

2 More Replies
its-kumar
by Databricks Partner
  • 12161 Views
  • 2 replies
  • 0 kudos

MLFlow Remote model registry connection is not working in Databricks

Dear community, I have multiple Databricks workspaces in my Azure subscription, and I have one central workspace. I want to use the central workspace for the model registry and experiment tracking from the other workspaces. So, if I am train...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Kumar Shanu: The error you are seeing (API request to endpoint /api/2.0/mlflow/runs/create failed with error code 404 != 200) suggests that the API endpoint you are trying to access was not found. This could be due to several reasons, such as incorr...
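For the setup described in this thread (several workspaces logging to one central workspace), MLflow accepts a URI of the form databricks://<scope>:<prefix> for both tracking and the model registry. The sketch below assumes a secret scope and key prefix that hold the central workspace's host and token; the names "central-ws" and "mlops" are hypothetical. It is not presented as the fix for the 404 above, since the reply is truncated before it lists the causes.

```python
def remote_workspace_uri(scope: str, prefix: str) -> str:
    """Build the 'databricks://<scope>:<prefix>' URI that MLflow accepts for a remote workspace."""
    if not scope or not prefix:
        raise ValueError("both scope and prefix are required")
    return f"databricks://{scope}:{prefix}"


def use_central_workspace(scope: str, prefix: str) -> str:
    """Point both experiment tracking and the model registry at the central workspace."""
    import mlflow  # imported lazily so the URI helper works without MLflow installed

    uri = remote_workspace_uri(scope, prefix)
    mlflow.set_tracking_uri(uri)  # experiments and runs
    mlflow.set_registry_uri(uri)  # registered models
    return uri


# Hypothetical scope and prefix, for illustration only.
print(remote_workspace_uri("central-ws", "mlops"))
```

Calling use_central_workspace(...) once at the top of a training notebook, before any run is started, keeps the tracking and registry targets consistent.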

1 More Replies
karthik_p
by Databricks Partner
  • 5173 Views
  • 6 replies
  • 2 kudos

when we are trying to create folder/file or list file using dbutils we are getting forbidden error in aws

Hi Team, we have created a new premium workspace with a customer-managed VPC, and the workspace deployed successfully in AWS. When we try to create a folder in DBFS, we get the error below. We have compared cross account custom managed role (Customer-managed VP...

Latest Reply
karthik_p
Databricks Partner
  • 2 kudos

@Debayan Mukherjee Issue resolved. It looks like the cloud team had not applied the required security groups that were shared with them; after revisiting them, we found the missing security groups and added them.

5 More Replies
ammarchalifah
by New Contributor
  • 4460 Views
  • 1 replies
  • 0 kudos

DeltaFileNotFoundException in a multi cluster conflict

I have several parallel data pipelines running in different Airflow DAGs. All of these pipelines execute two dbt selectors in a dedicated Databricks cluster: one of them is a common selector executed in all DAGs. This selector includes a test that is d...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Ammar Ammar: The error message you're seeing suggests that the Delta Lake transaction log for the common model's test table has been truncated or deleted, either manually or due to the retention policies set in your cluster. This can happen if the ...
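If retention is indeed the cause, as this reply suggests, one thing to inspect is the table's retention properties. The sketch below only builds the SQL statement; the table name is hypothetical, and the defaults shown (30 days of log history, 7 days for deleted files) are assumed starting values rather than a recommendation from the thread.

```python
def retention_ddl(table: str, log_days: int = 30, deleted_file_days: int = 7) -> str:
    """Return an ALTER TABLE statement that sets Delta retention table properties."""
    if log_days < 1 or deleted_file_days < 1:
        raise ValueError("retention periods must be at least one day")
    return (
        f"ALTER TABLE {table} SET TBLPROPERTIES ("
        f"'delta.logRetentionDuration' = 'interval {log_days} days', "
        f"'delta.deletedFileRetentionDuration' = 'interval {deleted_file_days} days')"
    )


# Hypothetical table name, for illustration only.
statement = retention_ddl("analytics.common_model_test", log_days=60)
print(statement)
```

On a cluster, the resulting string could be passed to spark.sql(statement). Longer retention keeps more log and data files around, so it trades storage for tolerance of slow or concurrent readers.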

DK
by New Contributor II
  • 2820 Views
  • 1 replies
  • 1 kudos

Unable to call logged ML model from a different notebook when using Spark ML

Hi, I am an R user and I am experimenting with building an ML model with R and Spark-flavoured algorithms in Databricks. However, I am struggling to call a model that is logged as part of the experiment from a different notebook when I use spark flavo...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

@Dip Kundu: It seems like the error you are facing is related to sparklyr, which is used to interact with Apache Spark from R, and not directly related to MLflow. The error message suggests that an object could not be found, but it's not clear which...

Anonymous
by Not applicable
  • 2555 Views
  • 1 replies
  • 1 kudos

Hive Catalog DDL, describe extended returns "... n more fields" when detailing a many column array<struct<

I am using the Hackolade data modelling tool to reverse engineer (using a cluster connection) deployed databases and their table and view definitions. Some of our tables contain large multi-column structs, and these can only be partially described as a char...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Yes, it is possible to configure the Hive Catalog in Databricks to return full descriptions of tables with large multi-column structs. One way to achieve this is to increase the value of the Hive configuration property "hive.metastore.client.record.ma...

thomasm
by New Contributor III
  • 6511 Views
  • 3 replies
  • 1 kudos

Resolved! Online Feature Store MLflow serving problem

When I try to serve a model stored with FeatureStoreClient().log_model using the feature-store-online-example-cosmosdb tutorial Notebook, I get errors suggesting that the primary key schema is not configured properly. However, if I look in the Featur...

Latest Reply
NandiniN
Databricks Employee
  • 1 kudos

Hello @Thomas Michielsen, this error seems to occur when you have created the table yourself. You must use publish_table() to create the table in the online store. Do not manually create a database or container inside Cosmos DB. publish_table()...

2 More Replies
lurban
by Databricks Partner
  • 4646 Views
  • 1 replies
  • 0 kudos

CloudFilesIllegalStateException: Found mismatched event: key old_file_path doesn't have the prefix: new_file_path

My team currently uses Autoloader and Delta Live Tables to process incremental data from ADLS storage. We need to keep the same table and history, but switch the filepath to a different location in storage. When I test a filepath change, I rec...

Latest Reply
DD_Sharma
Databricks Employee
  • 0 kudos

Autoloader doesn't support changing the source path of a running job, so if you change the source path, the stream fails. However, if you really want to change the path, you can do so by using a new checkpoint ...
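The idea in this reply, pairing the new source path with a fresh checkpoint location, can be sketched for a plain Structured Streaming job as follows. This is a minimal sketch under stated assumptions: the function names, paths, and table name are hypothetical, the schema location is placed under the checkpoint directory by choice, and it does not cover Delta Live Tables, where checkpoints are managed by the pipeline.

```python
def autoloader_options(file_format: str, schema_location: str) -> dict:
    """Options for the cloudFiles (Auto Loader) source."""
    return {
        "cloudFiles.format": file_format,
        "cloudFiles.schemaLocation": schema_location,
    }


def start_stream(spark, source_path, checkpoint_path, target_table, file_format="json"):
    """Start an Auto Loader stream that reads source_path and appends to target_table.

    When source_path changes, pass a checkpoint_path that has never been used,
    so the stream does not try to reconcile state recorded for the old path.
    """
    options = autoloader_options(file_format, f"{checkpoint_path}/schema")
    return (
        spark.readStream.format("cloudFiles")
        .options(**options)
        .load(source_path)
        .writeStream
        .option("checkpointLocation", checkpoint_path)
        .toTable(target_table)
    )


# Hypothetical paths, for illustration only.
print(autoloader_options("json", "/mnt/checkpoints/orders_v2/schema"))
```

Because the target table stays the same, its history is preserved; only the stream's bookkeeping starts over, which can mean files already present under the new path are ingested from scratch.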

ryojikn
by New Contributor III
  • 8275 Views
  • 2 replies
  • 0 kudos

How to use spark-submit python task with the usage of --archives parameter passing a .tar.gz conda env?

We've been trying to launch a spark-submit Python task using the "archives" parameter, similar to the one used in YARN. However, we have not been able to make it work in Databricks. We know that for our OnPrem installation we can use som...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Ryoji Kuwae Neto: To use the --archives parameter with a conda environment in Databricks, you can follow these steps:
1) Create a conda environment for your project and export it as a .tar.gz file:
conda create --name myenv
conda activate myenv
conda...

1 More Replies
Vish1
by New Contributor II
  • 11261 Views
  • 3 replies
  • 1 kudos

pyspark: Stage failure due to One hot encoding

I am facing the error below while fitting my model. I am trying to run a model with cross validation with a pipeline inside of it. Below is the code snippet for data transformation:
qd = QuantileDiscretizer(relativeError=0.01, handleInvalid="error", n...

Latest Reply
shyam_9
Databricks Employee
  • 1 kudos

Hi @Vishnu P, could you please share the full stack trace? Also, please check how the workers' memory is being utilized.

2 More Replies
Cristianmarja
by New Contributor
  • 1263 Views
  • 1 replies
  • 0 kudos

Hi everyone, please note that I am stuck on exercise 2.0 Train and Validate ML Model because when I run the code, a NameError appears with the following message...

Hi everyone, please note that I am stuck on exercise 2.0 Train and Validate ML Model because when I run the code, a NameError appears with the following message: name 'DoubleType' is not defined. I put the code below for your reference. I would like any help ab...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Cristian Martinez: The error you are seeing occurs because the DoubleType class has not been imported. To fix this, add the following line to the top of your code to import DoubleType:
from pyspark.sql.types import DoubleType
This should resolv...
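To spot this kind of missing import before running a notebook cell, a rough static check can list names that are used but never imported or assigned. The helper below is a hypothetical sketch built on Python's standard ast module; it ignores scoping rules, so treat its output as a hint rather than a verdict. The schema snippet it checks is made up for illustration and is not the exercise's code.

```python
import ast
import builtins
from typing import Set


def undefined_names(source: str) -> Set[str]:
    """Return names that are read somewhere in source but never imported, assigned, or defined."""
    defined = set(dir(builtins))
    used = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # "import a.b" binds "a"; "import a as x" binds "x"
                defined.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            defined.add(node.name)
        elif isinstance(node, ast.arg):
            defined.add(node.arg)
        elif isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Load):
                used.add(node.id)
            else:
                defined.add(node.id)
    return used - defined


# Made-up snippet that forgets to import DoubleType.
snippet = (
    "from pyspark.sql.types import StructType, StructField\n"
    "schema = StructType([StructField('price', DoubleType())])\n"
)
print(sorted(undefined_names(snippet)))  # ['DoubleType']
```

The snippet is only parsed, never executed, so the check runs even on a machine without pyspark installed.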

invalidargument
by New Contributor III
  • 1657 Views
  • 1 replies
  • 0 kudos

Model storage requirements management

Hi. We have around 30 models in model storage that we use for batch scoring. These were created at different times by different people and on different cluster runtimes. Now we have run into the problem that we can't de-serialize the models and use them for in...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Jonas Lindberg: To address the issues you are facing with model serialization and versioning, I would recommend the following approach: Use MLflow to manage the lifecycle of your models, including versioning, deployment, and monitoring. MLflow is an...

Cristianmarja
by New Contributor
  • 1755 Views
  • 1 replies
  • 0 kudos

2.0 Train and Validate ML Model - Exercise / Double Type is not defined

Hi everyone, please note that I am stuck on exercise 2.0 Train and Validate ML Model because when I run the code, a NameError appears with the following message: name 'DoubleType' is not defined. I would appreciate any help with this.

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Cristian Martinez: In Databricks, you need to import the necessary classes from the pyspark.sql.types module in order to use them in your code. To fix the NameError you're encountering with the label "name 'DoubleType' is not defined" in Exercise 2...

Orianh
by Valued Contributor II
  • 4174 Views
  • 1 replies
  • 2 kudos

MLflow log pytorch distributed training

Hey guys, I have a few questions that I hope you can help me with. I started training a PyTorch model with distributed training using Petastorm + Horovod, as Databricks suggests in the docs. Q1: I can see that each worker is training the model, but when the epochs are done...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

@orian hindi: Regarding your questions: Q1: The error message you are seeing is likely related to a segmentation fault, which can occur due to various reasons such as memory access violations or stack overflows. It could be caused by several factors,...
