Machine Learning

by qwerty1 • Contributor

04-26-2023 7:31:00 PM

570 Views
1 replies
1 kudos

Resolved! What is the disadvantage of using multiple Z-Order columns?

The documentation states: "You can specify multiple columns for ZORDER BY as a comma-separated list. However, the effectiveness of the locality drops with each extra column." What does it mean for "effectiveness of the locality to drop" with each extra co...

Latest Reply

Anonymous
Not applicable

04-28-2023 10:58:19 AM

1 kudos

@Ashwin Bhaskar: Z-ordering is a technique to improve the performance of queries that involve filtering and grouping on specific columns in a large distributed database. When a table is z-ordered on a certain column or set of columns, the data is so...
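
To make the "locality drops with each extra column" statement concrete, here is a minimal toy sketch in plain Python. It is not Delta Lake's implementation: the bit-interleaving key, the 8x8x8 grid of rows, the 64-row "files", and the min/max skipping check are all assumptions chosen for illustration. It counts how many files a filter on the first column has to read when the data is ordered on one, two, or three columns.

```python
from itertools import product


def interleave_bits(values, bits):
    """Build a Z-order (Morton) key by interleaving the bits of each value."""
    key = 0
    for bit in range(bits - 1, -1, -1):
        for v in values:
            key = (key << 1) | ((v >> bit) & 1)
    return key


def files_touched(rows, sort_cols, filter_col, filter_val, rows_per_file, bits):
    """Sort rows by the Z-order key of sort_cols, cut them into fixed-size
    'files', and count files whose min/max range on filter_col contains
    filter_val (a stand-in for min/max data skipping)."""
    ordered = sorted(
        rows, key=lambda r: interleave_bits([r[c] for c in sort_cols], bits)
    )
    files = [
        ordered[i:i + rows_per_file] for i in range(0, len(ordered), rows_per_file)
    ]
    touched = 0
    for f in files:
        vals = [r[filter_col] for r in f]
        if min(vals) <= filter_val <= max(vals):
            touched += 1
    return touched


BITS = 3                                          # each column holds values 0..7
rows = list(product(range(2 ** BITS), repeat=3))  # 512 hypothetical rows, 3 columns
ROWS_PER_FILE = 64                                # 8 hypothetical files

# Filter "column 0 == 5" under three different orderings.
one_col = files_touched(rows, [0], 0, 5, ROWS_PER_FILE, BITS)
two_cols = files_touched(rows, [0, 1], 0, 5, ROWS_PER_FILE, BITS)
three_cols = files_touched(rows, [0, 1, 2], 0, 5, ROWS_PER_FILE, BITS)

print(one_col, two_cols, three_cols)
```

In this toy setup the same filter reads 1 file when ordering on one column, 2 files with two columns, and 4 files with three: each extra column takes a share of the sort key, so every file covers a wider range of values in any single column.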

by khh2023 • New Contributor

01-25-2023 1:47:52 PM

725 Views
1 replies
0 kudos

Optimize operation with big increase in numRemovedFiles/numRemovedBytes/numAddedFiles/numAddedBytes

Hello, I have a daily loading process for a Delta table that has an ‘optimize table’ step at the end. The optimize operation used to take about 5 minutes, but now takes about 3.5 hours. One thing I noticed from 'describe history' is the operationMetric...

Latest Reply

mathan_pillai
Valued Contributor

04-27-2023 2:58:38 PM

0 kudos

This is most likely because more files became eligible for compaction (optimize). By default there is a limit of 50 files or so per partition, below which the partition doesn't qualify for optimize. Only if there are 50+ files within a partition the...

by Anonymous • Not applicable

04-20-2023 1:49:35 AM

977 Views
3 replies
1 kudos

www.dbdemos.ai

Hurray!! The Dolly demo is live now. Build your chatbot with Dolly now. Experiment and let us know how you feel about it: https://www.dbdemos.ai/demo.html?demoName=llm-dolly-chatbot

Latest Reply

David_K93
Contributor

04-26-2023 12:32:20 PM

1 kudos

Hello, I've been working through the demo. I keep running into an error saying 'chromadb is not defined' when trying to run Chroma functions. See the example below. Seems to be an embedded object name? Thanks!

by vittal • New Contributor

01-24-2023 10:35:44 PM

649 Views
1 replies
0 kudos

Getting errors in DLT Pipeline while using ML Model

I am getting the following error when I try to run ML models in a Delta Live Tables pipeline: File "/local_disk0/.ephemeral_nfs/repl_tmp_data/ReplId-55c61-9b898-2c4b6-d/mlflow/envs/virtualenv_envs/mlflow-888f8c9b966409e6bddca3894244b4df9d1f94c1/lib/pyth...

Latest Reply

shan_chandra
Honored Contributor III

04-27-2023 9:17:21 AM

0 kudos

@Vittal Pai - In general, please follow the steps below for the mlflow CLI error. Step 1: set up an API token and create secrets as described in this document: https://docs.databricks.com/machine-learning/manage-model-lifecycle/multiple-workspaces.h...

by Vaadee • New Contributor

04-21-2023 8:08:01 PM

627 Views
1 replies
0 kudos

How to include additional feature columns in Databricks AutoML Forecast?

I'm using Databricks AutoML for time series forecasting, and I would like to include additional feature columns in my model to improve its performance. The available parameters in the databricks.automl.forecast() function primarily focus on the targ...

Latest Reply

shyam_9
Valued Contributor

04-26-2023 2:23:56 PM

0 kudos

Hi @Vaadeendra Kumar Burra, I am checking internally, will update you on this.

by prem_raj • New Contributor II

04-21-2023 3:34:17 AM

1260 Views
2 replies
0 kudos

AutoMl Forecasting - Query via REST (Issue with input date field)

Hi, I used an AutoML forecasting model with sample data, and the model trained successfully. But when I serve the model over a REST endpoint, I get an error while querying via the built-in browser and Postman. (Error seems to be with the dat...

Latest Reply

Anonymous
Not applicable

04-25-2023 10:23:29 PM

0 kudos

@prem raj: Based on the error message, it seems that the input date format is not compatible with the model for inference. The error message suggests that the input date format is timezone-aware, while the model expects a timezone-naive format. To fi...
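
As a rough illustration of the timezone-aware versus timezone-naive distinction described in this reply, here is a minimal sketch using only the Python standard library. The helper name `to_naive_utc` and the choice to normalize to UTC before dropping the offset are assumptions for illustration; the reply's actual fix is cut off above, and the model may expect a different convention (for example, local wall-clock time).

```python
from datetime import datetime, timezone


def to_naive_utc(ts: str) -> str:
    """Parse an ISO 8601 timestamp and return it without timezone info.

    Hypothetical helper: timezone-aware inputs are converted to UTC first
    (an assumption), then the offset is dropped. Naive inputs pass through.
    """
    parsed = datetime.fromisoformat(ts)
    if parsed.tzinfo is not None:
        parsed = parsed.astimezone(timezone.utc).replace(tzinfo=None)
    return parsed.isoformat()


aware_example = "2023-04-21T09:04:17+05:30"  # timezone-aware input
naive_example = "2023-04-21T03:34:17"        # already timezone-naive

print(to_naive_utc(aware_example))
print(to_naive_utc(naive_example))
```

Applying a conversion like this to the date field before building the request body would hand the endpoint naive timestamps; whether UTC is the right reference depends on how the training data was prepared.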

by notsure • New Contributor

02-20-2023 3:46:12 AM

1527 Views
3 replies
2 kudos

Error with calling a machine learning serving endpoint

Hi! I have registered a Spark model and generated a serving endpoint based on it. I am calling the endpoint with the relevant dataframe, but I get the errors below. Could anyone show me how to tackle it, please? "Exception: Request failed with status...

Latest Reply

Anonymous
Not applicable

04-21-2023 10:08:26 PM

2 kudos

Hi @mavis chen, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers yo...

by Saeid_H • Contributor

02-17-2023 8:02:08 AM

1699 Views
2 replies
0 kudos

Logging spark pipeline model using mlflow spark , leads to PythonSecurityException

Hello, I am currently using a simple PySpark pipeline to transform my training data, fit a model, and log the model using mlflow.spark. But I get the following error (with mlflow.sklearn it works perfectly fine, but due to the size of my data I need to use p...

Latest Reply

Anonymous
Not applicable

04-21-2023 2:01:44 AM

0 kudos

Hi @Saeid Hedayati, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answer...

by ACP • New Contributor III

12-08-2022 2:45:21 PM

526 Views
2 replies
0 kudos

Didn't receive badges / points upon courses completion

Hi @Juliet Wu, I have completed a few courses but didn't receive any badges or points. I also did an accreditation but didn't receive anything for it either.

Latest Reply

Anonymous
Not applicable

04-21-2023 12:21:29 AM

0 kudos

Hi @Juliet Wu Thank you for reaching out! Please submit a ticket to our Training Team here: https://help.databricks.com/s/contact-us?ReqType=training and our team will get back to you shortly.

by sridhar0109 • New Contributor

02-15-2023 2:55:16 AM

466 Views
2 replies
0 kudos

Tracking changes in data distribution by using pyspark

Hi All, I'm working on creating a data quality dashboard. I've created a few rules, like checking for nulls in a column, checking the data type of the column, removing duplicates, etc. We follow the medallion architecture and are applying these data quality check...

Latest Reply

Anonymous
Not applicable

04-20-2023 9:50:24 PM

0 kudos

Hi @Sridhar Varanasi, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. T...

by hulma • New Contributor II

02-15-2023 2:16:46 AM

412 Views
2 replies
1 kudos

dbfs file reference in pyfunc model for serverless inference

Hi, I was trying to migrate model serving from classic to serverless real-time inference. My model is currently logged as a pyfunc model, and part of the model script reads a DBFS file for inference. Now, with serverless, I have an error in which it is not abl...

Latest Reply

Anonymous
Not applicable

04-20-2023 9:47:03 PM

1 kudos

Hi @Hulma Abdul Rahman, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best an...

by Gilg • Contributor II

04-17-2023 12:29:23 PM

4416 Views
1 replies
0 kudos

Failed to add 1 container to the cluster. will attempt retry: false. reason: bootstrap timeout

Hi Team, when creating a new cluster in a workspace within a VNET, I receive this error: "Failed to add 1 container to the cluster. will attempt retry: false. reason: bootstrap timeout". Cluster terminated. Reason: Bootstrap Timeout. Cheers, Gil

Latest Reply

Anonymous
Not applicable

04-20-2023 7:39:58 PM

0 kudos

@Gil Gonong: The error message you are receiving suggests that the creation of the new cluster has failed due to a bootstrap timeout. The bootstrap process is responsible for setting up the initial configuration of the cluster, and if it takes too l...

by isaac_gritz • Valued Contributor II

08-23-2022 1:17:59 AM

1976 Views
1 replies
3 kudos

Resolved! Pricing on Databricks

How Pricing Works on Databricks. I highly recommend checking out this blog post on how Databricks pricing works from my colleague @MENDELSOHN CHAN. Databricks has a consumption-based pricing model, so you pay only for the compute you use. For interactive...

Latest Reply

Meag
New Contributor III

04-20-2023 3:13:14 AM

3 kudos

I read the blog you shared. It helps, thanks for sharing.

by Santhanalakshmi • New Contributor II

07-13-2022 10:25:17 PM

1411 Views
3 replies
0 kudos

Throwing IndexoutofBound Exception in Pyspark

Hello All, I am trying to read the data and group it in order to pass it to the predict function via the @F.pandas_udf method.
# Loading model
pkl_model = pickle.load(open(filepath,'rb'))
# build schema for output labels
filter_schema=[] ...

Latest Reply

Vindhya
New Contributor II

04-18-2023 1:30:14 PM

0 kudos

@Santhanalakshmi Manoharan Was this issue resolved? I am also getting the same error; any guidance would be of great help. Appreciate your help.

by its-kumar • New Contributor III

04-14-2023 12:46:55 AM

2808 Views
2 replies
0 kudos

MLFlow Remote model registry connection is not working in Databricks

Dear community, I have multiple Databricks workspaces in my Azure subscription, and I have one central workspace. I want to use the central workspace for the model registry and experiment tracking from the other workspaces. So, if I am train...

Latest Reply

Anonymous
Not applicable

04-18-2023 2:22:05 AM

0 kudos

@Kumar Shanu: The error you are seeing (API request to endpoint /api/2.0/mlflow/runs/create failed with error code 404 != 200) suggests that the API endpoint you are trying to access is not found. This could be due to several reasons, such as incorr...

Databricks

Forum Posts

pdb debugger on databricks

import ml.dmlc.xgboost4j.scala.spark.{XGBoostEstim...

Query ML Endpoint with R and Curl

'error_code': 'INVALID_PARAMETER_VALUE', 'message'...

AutoMl Dataset too large