Machine Learning

Forum Posts

Santhanalakshmi
by New Contributor II
  • 1458 Views
  • 3 replies
  • 0 kudos

Throwing IndexoutofBound Exception in Pyspark

Hello all, I am trying to read data and group it so that I can pass it to a predict function via the @F.pandas_udf method. #Loading Model pkl_model = pickle.load(open(filepath,'rb')) # build schema for output labels filter_schema=[] ...

Latest Reply
Vindhya
New Contributor II
  • 0 kudos

@Santhanalakshmi Manoharan Was this issue resolved? I am getting the same error, and any guidance would be a great help. I appreciate your help.

its-kumar
by New Contributor III
  • 2846 Views
  • 2 replies
  • 0 kudos

MLFlow Remote model registry connection is not working in Databricks

Dear community, I have multiple Databricks workspaces in my Azure subscription, plus one central workspace. I want to use the central workspace for the model registry and experiment tracking from the other workspaces. So, if I am train...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Kumar Shanu: The error you are seeing (API request to endpoint /api/2.0/mlflow/runs/create failed with error code 404 != 200) suggests that the API endpoint you are trying to access was not found. This could be due to several reasons, such as incorr...

Spencer_Kent
by New Contributor III
  • 1167 Views
  • 2 replies
  • 1 kudos

Resolved! Lacking support for column-level select grants or attribute-based access control

In the Unity Catalog launch and its accompanying blog post, one of the primary selling points was a set of granular access control features that would at least partially eliminate the need to create a multitude of separate table views and the attenda...

Latest Reply
Spencer_Kent
New Contributor III
  • 1 kudos

Simply amazing that, 2 years on from the initial announcement, this feature is still not available. You released Unity Catalog missing one of its most-hyped features.

karthik_p
by Esteemed Contributor
  • 1688 Views
  • 6 replies
  • 2 kudos

when we are trying to create folder/file or list file using dbutils we are getting forbidden error in aws

Hi team, we created a new premium workspace with a custom managed VPC, and the workspace deployed successfully in AWS. When we try to create a folder in DBFS, we get the error below. We have compared the cross-account custom managed role (Customer-managed VP...

Latest Reply
karthik_p
Esteemed Contributor
  • 2 kudos

@Debayan Mukherjee Issue resolved. It looks like the cloud team had not applied the required security groups that were shared with them. After revisiting them, we found the missing security groups and added them.

ammarchalifah
by New Contributor
  • 2055 Views
  • 1 reply
  • 0 kudos

DeltaFileNotFoundException in a multi cluster conflict

I have several parallel data pipelines running in different Airflow DAGs. All of these pipelines execute two dbt selectors in a dedicated Databricks cluster; one of them is a common selector executed in all DAGs. This selector includes a test that is d...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Ammar Ammar: The error message you're seeing suggests that the Delta Lake transaction log for the common model's test table has been truncated or deleted, either manually or due to the retention policies set in your cluster. This can happen if the ...

DK
by New Contributor II
  • 885 Views
  • 1 reply
  • 1 kudos

Unable to call logged ML model from a different notebook when using Spark ML

Hi, I am an R user and I am experimenting with building an ML model in R with Spark-flavoured algorithms in Databricks. However, I am struggling to call a model that is logged as part of the experiment from a different notebook when I use spark flavo...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

@Dip Kundu: It seems that the error you are facing is related to sparklyr, which is used to interact with Apache Spark from R, and is not directly related to MLflow. The error message suggests that an object could not be found, but it's not clear which...

Anonymous
by Not applicable
  • 852 Views
  • 1 reply
  • 1 kudos

Hive Catalog DDL, describe extended returns "... n more fields" when detailing a many column array<struct<

I am using the Hackolade data modelling tool to reverse engineer (using a cluster connection) deployed databases and their table and view definitions. Some of our tables contain large multi-column structs, and these can only be partially described as a char...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Yes, it is possible to configure the Hive Catalog in Databricks to return full descriptions of tables with large multi-column structs. One way to achieve this is to increase the value of the Hive configuration property "hive.metastore.client.record.ma...

thomasm
by New Contributor II
  • 1675 Views
  • 3 replies
  • 1 kudos

Resolved! Online Feature Store MLflow serving problem

When I try to serve a model stored with FeatureStoreClient().log_model using the feature-store-online-example-cosmosdb tutorial Notebook, I get errors suggesting that the primary key schema is not configured properly. However, if I look in the Featur...

Latest Reply
NandiniN
Valued Contributor II
  • 1 kudos

Hello @Thomas Michielsen, this error can occur if you created the table yourself. You must use publish_table() to create the table in the online store. Do not manually create a database or container inside Cosmos DB. publish_table()...

lurban
by New Contributor
  • 629 Views
  • 1 reply
  • 0 kudos

CloudFilesIllegalStateException: Found mismatched event: key old_file_path doesn't have the prefix: new_file_path

My team currently uses Autoloader and Delta Live Tables to process incremental data from ADLS storage. We need to keep the same table and history, but switch the filepath to a different location in storage. When I test a filepath change, I rec...

Latest Reply
DD_Sharma
New Contributor III
  • 0 kudos

Autoloader doesn't support changing the source path of a running job, so if you change the source path, your stream fails. However, if you really want to change the path, you can do so by using the new checkpoint ...

ryojikn
by New Contributor III
  • 2763 Views
  • 2 replies
  • 0 kudos

How to use spark-submit python task with the usage of --archives parameter passing a .tar.gz conda env?

We've been trying to launch a spark-submit Python task using the "archives" parameter, similar to the one used in YARN. However, we have not been able to make it work in Databricks. We know that for our OnPrem installation we can use som...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Ryoji Kuwae Neto: To use the --archives parameter with a conda environment in Databricks, you can follow these steps: 1) Create a conda environment for your project and export it as a .tar.gz file: conda create --name myenv conda activate myenv conda...

Vish1
by New Contributor II
  • 3245 Views
  • 3 replies
  • 1 kudos

pyspark: Stage failure due to One hot encoding

I am getting the error below while fitting my model. I am trying to run a model with cross-validation with a pipeline inside it. Below is the code snippet for data transformation: qd = QuantileDiscretizer(relativeError=0.01, handleInvalid="error", n...

Latest Reply
shyam_9
Valued Contributor
  • 1 kudos

Hi @Vishnu P, could you please share the full stack trace? Also, check how the workers' memory is being utilized.

Cristianmarja
by New Contributor
  • 340 Views
  • 1 reply
  • 0 kudos

Hi everyone, Please note that I am stuck on exercise 2.0 Train and Validate ML Model because when I run the code, a NameError appears with the following label...

Hi everyone, Please note that I am stuck on exercise 2.0 Train and Validate ML Model because when I run the code, a NameError appears with the following label: name 'DoubleType' is not defined. I have put the code below for your reference. I would like any help ab...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Cristian Martinez: The error you are seeing occurs because the DoubleType class has not been imported. To fix this, add the following line to the top of your code to import DoubleType: from pyspark.sql.types import DoubleType This should resolv...

invalidargument
by New Contributor II
  • 473 Views
  • 1 reply
  • 0 kudos

Model storage requirements management

Hi. We have around 30 models in model storage that we use for batch scoring. These were created at different times by different people and on different cluster runtimes. Now we have run into problems where we can't de-serialize the models and use them for in...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Jonas Lindberg: To address the issues you are facing with model serialization and versioning, I would recommend the following approach: Use MLflow to manage the lifecycle of your models, including versioning, deployment, and monitoring. MLflow is an...
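
To illustrate the underlying idea of tying a serialized model to the environment that produced it, here is a rough stdlib-only sketch. It is not MLflow and not what the reply prescribes. The helper names (`environment_snapshot`, `save_model`, `load_model`), the file names, and the tracked package list are all assumptions made for the example.

```python
import json
import pickle
import platform
import tempfile
from importlib import metadata
from pathlib import Path


def environment_snapshot(packages):
    """Return the Python version and the installed version of each named package."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed here
    return {"python": platform.python_version(), "packages": versions}


def save_model(model, directory, packages):
    """Pickle the model and write the environment snapshot next to it."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    (directory / "model.pkl").write_bytes(pickle.dumps(model))
    snapshot = environment_snapshot(packages)
    (directory / "environment.json").write_text(json.dumps(snapshot, indent=2))


def load_model(directory):
    """Compare the recorded environment with the current one, then unpickle."""
    directory = Path(directory)
    saved = json.loads((directory / "environment.json").read_text())
    current = environment_snapshot(saved["packages"])
    mismatches = {}
    if saved["python"] != current["python"]:
        mismatches["python"] = (saved["python"], current["python"])
    for name, version in saved["packages"].items():
        if version != current["packages"][name]:
            mismatches[name] = (version, current["packages"][name])
    model = pickle.loads((directory / "model.pkl").read_bytes())
    return model, mismatches


# Stand-in "model": any picklable object works for the demonstration.
with tempfile.TemporaryDirectory() as tmp:
    save_model({"weights": [0.1, 0.2]}, tmp, packages=["scikit-learn"])
    restored, mismatches = load_model(tmp)

print(restored, mismatches)
```

Saved and loaded in the same environment, `mismatches` comes back empty. On a different cluster runtime it would list each version that changed, which gives a starting point when a model fails to de-serialize.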

Cristianmarja
by New Contributor
  • 437 Views
  • 1 reply
  • 0 kudos

2.0 Train and Validate ML Model - Exercise / Double Type is not defined

Hi everyone, Please note that I am stuck on exercise 2.0 Train and Validate ML Model because when I run the code, a NameError appears with the following label: name 'DoubleType' is not defined. I would like any help with this subject.

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Cristian Martinez: In Databricks, you need to import the necessary classes from the pyspark.sql.types module in order to use them in your code. To fix the NameError you're encountering with the label "name 'DoubleType' is not defined" in Exercise 2...

Orianh
by Valued Contributor II
  • 1345 Views
  • 1 reply
  • 2 kudos

MLflow log pytorch distributed training

Hey guys, I have a few questions that I hope you can help me with. I started training a PyTorch model with distributed training using petastorm + Horovod, as Databricks suggests in the docs. Q1: I can see that each worker is training the model, but when epochs are done...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

@orian hindi: Regarding your questions: Q1: The error message you are seeing is likely related to a segmentation fault, which can occur due to various reasons such as memory access violations or stack overflows. It could be caused by several factors,...
