Machine Learning

Forum Posts

qwerty1
by Contributor
  • 570 Views
  • 1 replies
  • 1 kudos

Resolved! What is the disadvantage of using multiple Z-Order columns?

The documentation states: "You can specify multiple columns for ZORDER BY as a comma-separated list. However, the effectiveness of the locality drops with each extra column." What does it mean for "effectiveness of the locality to drop" with each extra co...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

@Ashwin Bhaskar: Z-ordering is a technique to improve the performance of queries that involve filtering and grouping on specific columns in a large distributed database. When a table is z-ordered on a certain column or set of columns, the data is so...
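
To make the "locality drops with each extra column" point concrete, here is a toy sketch in plain Python. It is not how Delta Lake lays out files internally; it only assumes that rows are sorted by a key built by interleaving the bits of the Z-order columns and then cut into equal-sized files. The names interleave_bits and files_touched are made up for this illustration.

```python
from itertools import product


def interleave_bits(coords, bits):
    """Build a Z-order (Morton) key by interleaving the bits of each column value."""
    key = 0
    for bit in range(bits - 1, -1, -1):  # most significant bit first
        for c in coords:
            key = (key << 1) | ((c >> bit) & 1)
    return key


def files_touched(num_cols, bits=4, num_files=8, value=5):
    """Count files that contain at least one row whose first column equals `value`.

    The toy table holds every combination of values 0..2**bits - 1 across
    `num_cols` columns, sorted by Z-order key and split into `num_files` files.
    """
    rows = sorted(
        product(range(2 ** bits), repeat=num_cols),
        key=lambda r: interleave_bits(r, bits),
    )
    rows_per_file = len(rows) // num_files
    return len({i // rows_per_file for i, r in enumerate(rows) if r[0] == value})


for n in (1, 2, 3):
    print(n, "Z-order column(s):", files_touched(n), "of 8 files touched")
```

In this toy setup a filter on the first column alone touches 1, 2, and 4 of the 8 files as the number of Z-order columns grows from one to three, so less data can be skipped per column.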

khh2023
by New Contributor
  • 725 Views
  • 1 replies
  • 0 kudos

Optimize operation with big increase in numRemovedFiles/numRemovedBytes/numAddedFiles/numAddedBytes

Hello, I have a daily loading process for a Delta table that has an ‘optimize table’ step at the end. The optimize operation used to take about 5 minutes, but now takes about 3.5 hours. One thing I noticed from 'describe history' is the operationMetric...

Latest Reply
mathan_pillai
Valued Contributor
  • 0 kudos

This is most likely because more files became eligible for compaction (optimize). By default there is a limit of 50 files or so per partition, below which the partition doesn't qualify for optimize. Only if there are 50+ files within a partition the...

Anonymous
by Not applicable
  • 977 Views
  • 3 replies
  • 1 kudos

www.dbdemos.ai

Hurray! The Dolly demo is live now. Build your chatbot with Dolly, experiment, and let us know how you feel about it: https://www.dbdemos.ai/demo.html?demoName=llm-dolly-chatbot

Latest Reply
David_K93
Contributor
  • 1 kudos

Hello, I've been working through the demo. I keep running into an error saying 'chromadb is not defined' when trying to run Chroma functions. See the example below. Seems to be an embedded object name? Thanks!

vittal
by New Contributor
  • 649 Views
  • 1 replies
  • 0 kudos

Getting errors in DLT Pipeline while using ML Model

I am getting the following error when I try to run ML models in a Delta Live Tables pipeline: File "/local_disk0/.ephemeral_nfs/repl_tmp_data/ReplId-55c61-9b898-2c4b6-d/mlflow/envs/virtualenv_envs/mlflow-888f8c9b966409e6bddca3894244b4df9d1f94c1/lib/pyth...

Latest Reply
shan_chandra
Honored Contributor III
  • 0 kudos

@Vittal Pai - In general, please follow the steps below for the mlflow CLI error. Step 1: set up an API token and create secrets as described in the document below: https://docs.databricks.com/machine-learning/manage-model-lifecycle/multiple-workspaces.h...

Vaadee
by New Contributor
  • 627 Views
  • 1 replies
  • 0 kudos

How to include additional feature columns in Databricks AutoML Forecast?

I'm using Databricks AutoML for time series forecasting, and I would like to include additional feature columns in my model to improve its performance. The available parameters in the databricks.automl.forecast() function primarily focus on the targ...

Latest Reply
shyam_9
Valued Contributor
  • 0 kudos

Hi @Vaadeendra Kumar Burra, I am checking internally and will update you on this.

prem_raj
by New Contributor II
  • 1260 Views
  • 2 replies
  • 0 kudos

AutoML Forecasting - Query via REST (Issue with input date field)

Hi, I used an AutoML forecasting model with sample data, and the model trained successfully. But when I tried to serve the model over a REST endpoint, I got an error while querying via the built-in browser and Postman. (Error seems to be with the dat...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@prem raj: Based on the error message, it seems that the input date format is not compatible with the model for inference. The error message suggests that the input date format is timezone-aware, while the model expects a timezone-naive format. To fi...
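
As a rough illustration of the timezone-aware versus timezone-naive distinction mentioned in this reply, the sketch below uses only the Python standard library to turn an ISO-8601 timestamp with a UTC offset into a naive one. It is an assumption that normalizing to UTC before dropping the offset is what the poster's model needs; the helper name to_naive_utc is hypothetical and is not part of AutoML or the serving API.

```python
from datetime import datetime, timezone


def to_naive_utc(ts: str) -> str:
    """Return `ts` as a timezone-naive ISO-8601 string (normalized to UTC if aware)."""
    parsed = datetime.fromisoformat(ts)
    if parsed.tzinfo is not None:
        # Aware timestamp: shift to UTC, then drop the offset information.
        parsed = parsed.astimezone(timezone.utc).replace(tzinfo=None)
    return parsed.isoformat()


print(to_naive_utc("2023-05-01T02:00:00+02:00"))  # aware input
print(to_naive_utc("2023-05-01T00:00:00"))        # already naive, returned unchanged
```

Both calls print 2023-05-01T00:00:00. Whether the date field in a request payload should be sent in this form depends on how the model was trained, so treat this as a starting point for debugging rather than a confirmed fix.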

notsure
by New Contributor
  • 1527 Views
  • 3 replies
  • 2 kudos

Error with calling a machine learning serving endpoint

Hi! I have registered a Spark model and generated a serving endpoint based on it. I am calling the endpoint with the relevant dataframe, but I get the errors below. Could anyone show me how to tackle this, please? "Exception: Request failed with status...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @mavis chen, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers yo...

Saeid_H
by Contributor
  • 1699 Views
  • 2 replies
  • 0 kudos

Logging a Spark pipeline model using mlflow.spark leads to PythonSecurityException

Hello, I am currently using a simple PySpark pipeline to transform my training data, fit a model, and log the model using mlflow.spark. But I get the following error (with mlflow.sklearn it works perfectly fine, but due to the size of my data I need to use p...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Saeid Hedayati, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answer...

ACP
by New Contributor III
  • 526 Views
  • 2 replies
  • 0 kudos

Didn't receive badges / points upon courses completion

Hi @Juliet Wu, I have completed a few courses but didn't receive any badges or points. I also did an accreditation but didn't receive anything for that either.

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Juliet Wu, thank you for reaching out! Please submit a ticket to our Training Team here: https://help.databricks.com/s/contact-us?ReqType=training and our team will get back to you shortly.

sridhar0109
by New Contributor
  • 466 Views
  • 2 replies
  • 0 kudos

Tracking changes in data distribution by using pyspark

Hi all, I'm working on creating a data quality dashboard. I've created a few rules, like checking for nulls in a column, checking the data type of the column, removing duplicates, etc. We follow the medallion architecture and are applying these data quality check...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Sridhar Varanasi, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. T...

hulma
by New Contributor II
  • 412 Views
  • 2 replies
  • 1 kudos

DBFS file reference in pyfunc model for serverless inference

Hi, I was trying to migrate model serving from classic to serverless real-time inference. My model is currently logged as a pyfunc model, and part of the model script reads a DBFS file for inference. Now, with serverless, I have an error where it is not abl...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Hulma Abdul Rahman, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best an...

Gilg
by Contributor II
  • 4416 Views
  • 1 replies
  • 0 kudos

Failed to add 1 container to the cluster. will attempt retry: false. reason: bootstrap timeout

Hi Team, When creating a new cluster in a workspace within a VNET, I receive this error: Failed to add 1 container to the cluster. will attempt retry: false. reason: bootstrap timeout. Cluster terminated. Reason: Bootstrap Timeout. Cheers, Gil

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Gil Gonong: The error message you are receiving suggests that the creation of the new cluster has failed due to a bootstrap timeout. The bootstrap process is responsible for setting up the initial configuration of the cluster, and if it takes too l...

isaac_gritz
by Valued Contributor II
  • 1976 Views
  • 1 replies
  • 3 kudos

Resolved! Pricing on Databricks

How Pricing Works on Databricks: I highly recommend checking out this blog post on how Databricks pricing works from my colleague @MENDELSOHN CHAN. Databricks has a consumption-based pricing model, so you pay only for the compute you use. For interactive...

Latest Reply
Meag
New Contributor III
  • 3 kudos

I read the blog you shared; it helps. Thanks for sharing.

Santhanalakshmi
by New Contributor II
  • 1411 Views
  • 3 replies
  • 0 kudos

Throwing IndexoutofBound Exception in Pyspark

Hello all, I am trying to read the data and group it in order to pass it to a predict function via the @F.pandas_udf method. #Loading Model pkl_model = pickle.load(open(filepath,'rb'))   # build schema for output labels filter_schema=[] ...

Latest Reply
Vindhya
New Contributor II
  • 0 kudos

@Santhanalakshmi Manoharan Was this issue resolved? I am also getting the same error; any guidance would be of great help. Appreciate your help.

its-kumar
by New Contributor III
  • 2808 Views
  • 2 replies
  • 0 kudos

MLflow remote model registry connection is not working in Databricks

Dear community, I have multiple Databricks workspaces in my Azure subscription, and I have one central workspace. I want to use the central workspace for the model registry and experiment tracking from the other workspaces. So, if I am train...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Kumar Shanu: The error you are seeing (API request to endpoint /api/2.0/mlflow/runs/create failed with error code 404 != 200) suggests that the API endpoint you are trying to access is not found. This could be due to several reasons, such as incorr...
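
One thing worth ruling out with a remote workspace is a mis-built URI or a missing secret. The sketch below assumes the setup described in the Databricks multiple-workspaces guide, where the remote workspace is addressed as databricks://<scope>:<prefix> and the secret scope holds <prefix>-host, <prefix>-token, and <prefix>-workspace-id. The scope and prefix values ("central-ws", "registry") and both helper names are hypothetical, and this may not be the cause of the poster's 404.

```python
def remote_workspace_uri(scope: str, prefix: str) -> str:
    """Build the URI that points MLflow at a remote Databricks workspace."""
    return f"databricks://{scope}:{prefix}"


def expected_secret_keys(prefix: str) -> list:
    """Secret keys that should exist in the scope for this prefix."""
    return [f"{prefix}-host", f"{prefix}-token", f"{prefix}-workspace-id"]


uri = remote_workspace_uri("central-ws", "registry")  # hypothetical scope and prefix
print(uri)
print(expected_secret_keys("registry"))

# In a notebook with mlflow installed, the URI would then be passed on, e.g.:
#   import mlflow
#   mlflow.set_registry_uri(uri)   # use the central workspace's model registry
#   mlflow.set_tracking_uri(uri)   # optionally log runs there as well
```

Checking that all three secrets exist in the scope, and that the host secret holds the central workspace URL, is a quick first step before looking at network or permission issues.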
