Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

Forum Posts

User16826994223
by Honored Contributor III
  • 2247 Views
  • 1 replies
  • 0 kudos

Delta Lake MERGE INTO statement error

I'm trying to run a Delta Lake merge:
MERGE INTO source USING updates ON source.d = updates.sessionId WHEN MATCHED THEN UPDATE * WHEN NOT MATCHED THEN INSERT *
I'm getting an SQL error: ParseException: mismatched input 'MERGE' expecting {'(', 'SELECT', 'FR...

Latest Reply
User16826994223
Honored Contributor III
  • 0 kudos

MERGE SQL support was added in Delta Lake 0.7.0. You also need to upgrade Apache Spark to 3.0.0 and enable the integration with Apache Spark DataSourceV2 and C...
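A minimal sketch of what enabling that integration can look like from PySpark. The reply above is cut off, so the two settings below are an assumption based on the Delta Lake 0.7.0 setup documentation, not a quote from the answer. The helper name `build_delta_session` and the app name are hypothetical, and the sketch assumes the Delta Lake package is already on the cluster classpath.

```python
# Spark settings that register Delta Lake's SQL parser extension and catalog.
# Assumption: these are the settings the truncated reply refers to.
DELTA_SQL_CONF = {
    "spark.sql.extensions": "io.delta.sql.DeltaSparkSessionExtension",
    "spark.sql.catalog.spark_catalog": "org.apache.spark.sql.delta.catalog.DeltaCatalog",
}


def build_delta_session(app_name="delta-merge-demo"):
    """Return a SparkSession configured with the Delta SQL settings above."""
    # Imported lazily so the settings can be inspected without PySpark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in DELTA_SQL_CONF.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

With a session built this way on Spark 3.0.0 or later, the MERGE keyword should be recognized by the parser instead of raising the ParseException from the question.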

User16765131552
by Contributor III
  • 2806 Views
  • 1 replies
  • 0 kudos

Resolved! Set up a model serving REST endpoint?

I am trying to set up a demo with a really simple Spark ML model and I see this error repeated over and over in the logs in the serving UI: /databricks/chauffeur/model-runner/lib/python3.6/site-packages/urllib3/connectionpool.py:1020: InsecureRequestW...

Latest Reply
User16765131552
Contributor III
  • 0 kudos

Not sure how the containers for each model version work on the endpoints, but it looks like model serving endpoints use a 7.x runtime, so those would be Spark 3.0, not Spark 3.1.

User16826994223
by Honored Contributor III
  • 1705 Views
  • 1 replies
  • 0 kudos

Using VACUUM with a dry run in Python for Delta Lake

I can see an example of how to call the vacuum function for a Delta Lake table in Python here. How can I do the equivalent of this SQL in Python? %sql VACUUM delta.`dbfs:/mnt/<myfolder>` DRY RUN

Latest Reply
User16826994223
Honored Contributor III
  • 0 kudos

Dry run for non-SQL code is not yet available in Delta Lake 0.8. There is an open bug for this in the Delta Lake open-source repository on GitHub; hopefully it gets resolved soon.
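One possible workaround, sketched here as an assumption rather than something the reply confirms, is to issue the same SQL statement from Python through `spark.sql`. The helper names `vacuum_dry_run_sql` and `vacuum_dry_run` are hypothetical, and `spark` is assumed to be an existing SparkSession with Delta Lake SQL support enabled.

```python
def vacuum_dry_run_sql(table_path, retain_hours=None):
    """Build a VACUUM ... DRY RUN statement for a path-based Delta table."""
    statement = f"VACUUM delta.`{table_path}`"
    if retain_hours is not None:
        # Optional retention window, in hours.
        statement += f" RETAIN {retain_hours} HOURS"
    return statement + " DRY RUN"


def vacuum_dry_run(spark, table_path, retain_hours=None):
    """Run the dry run through the SQL interface and return the resulting DataFrame."""
    return spark.sql(vacuum_dry_run_sql(table_path, retain_hours))
```

In a notebook this would be called as `vacuum_dry_run(spark, "dbfs:/mnt/<myfolder>")`, with the placeholder path replaced by the real table location.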

User16826992666
by Valued Contributor
  • 1647 Views
  • 0 replies
  • 0 kudos

MLflow not logging metrics

I have run a few MLflow experiments and I can see them in the experiment history, but none of the metrics have been logged along with them. I thought this was supposed to be automatically included. Any idea why they wouldn't be showing up?

User16826994223
by Honored Contributor III
  • 1707 Views
  • 1 replies
  • 0 kudos

Resolved! Where can I find the logs of Spark job runs in Azure storage?

Hi, I want to find the storage bucket where all my runs' logs are stored. I want to do analytics on the logs. Can you please help me find which bucket or path I should look in?

Latest Reply
User16826994223
Honored Contributor III
  • 0 kudos

The root bucket where the logs are stored is not directly accessible outside Databricks, so you need to read the logs from a Databricks notebook.

User16826994223
by Honored Contributor III
  • 2281 Views
  • 1 replies
  • 0 kudos

Resolved! Exception: Run with UUID l567845ae5a7cf04a40902ae789076093c is already active.

I'm trying to create a new experiment on MLflow but I have this problem: Exception: Run with UUID l142ae5a7cf04a40902ae9ed7326093c is already active. Snippet: mlflow.set_experiment("New experiment 2") mlflow.set_tracking_uri('http://mlflow:5000') ...

Latest Reply
User16826994223
Honored Contributor III
  • 0 kudos

You have to call mlflow.end_run() to end the active run from the first experiment. Then you can start another.
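The advice above can be sketched as a small helper that ends any run that is still active before starting a new one. The helper name `start_fresh_run` and its `tracking` parameter are hypothetical; the parameter exists only so the logic can be exercised without an MLflow server, and by default the real `mlflow` module is used.

```python
def start_fresh_run(tracking=None, **run_kwargs):
    """End the currently active run, if any, then start and return a new run."""
    if tracking is None:
        # Default to the real MLflow module; imported lazily.
        import mlflow as tracking

    if tracking.active_run() is not None:
        # A previous run was left open, which is what triggers
        # "Run with UUID ... is already active".
        tracking.end_run()
    return tracking.start_run(**run_kwargs)
```

Wrapping the training code in `with mlflow.start_run():` is another way to avoid the error, since the run is ended automatically when the block exits.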

User16826994223
by Honored Contributor III
  • 1003 Views
  • 1 replies
  • 0 kudos

What is the preview feature for AutoML?

What is the preview feature for AutoML?

Latest Reply
User16826994223
Honored Contributor III
  • 0 kudos

AutoML public preview features: The Databricks AutoML Public Preview parallelizes training over sklearn and xgboost models for classification (binary and multiclass) and regression problems. We support datasets with numerical, categorical and times...

brickster_2018
by Databricks Employee
  • 2100 Views
  • 1 replies
  • 0 kudos
Latest Reply
brickster_2018
Databricks Employee
  • 0 kudos

The impact will be only on the files touched by the MERGE operation. The newly created files will not be optimized and data co-locality is not ensured. However, the files which are not touched by the MERGE operation will continue to show the improvem...

User16789201666
by Databricks Employee
  • 1093 Views
  • 0 replies
  • 0 kudos

What's a best practice for Hyperopt workflow?

1. Choose what hyperparameters are reasonable to optimize
2. Define broad ranges for each of the hyperparameters (including the default where applicable)
3. Run a small number of trials
4. Observe the results in an MLflow parallel coordinate plot and select the run...

User16789201666
by Databricks Employee
  • 4566 Views
  • 0 replies
  • 0 kudos

When to use uniform vs log-uniform in Hyperopt?

Hyperopt offers hp.uniform and hp.loguniform, both of which produce real values in a min/max range. hp.loguniform is more suitable when one might choose a geometric series of values to try (0.001, 0.01, 0.1) rather than arithmetic (0.1, 0.2, 0.3). Wh...
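To make the difference concrete, here is a minimal standard-library sketch that mimics the two sampling schemes; it does not call Hyperopt itself. The range 0.001 to 0.1 comes from the example above, while the function names and sample count are illustrative assumptions. Note that Hyperopt's `hp.loguniform(label, low, high)` expects its bounds in log space, so the equivalent call would be along the lines of `hp.loguniform("lr", math.log(0.001), math.log(0.1))`, where `"lr"` is a hypothetical label.

```python
import math
import random


def sample_uniform(rng, low, high):
    """Draw a value spread evenly across [low, high]."""
    return rng.uniform(low, high)


def sample_loguniform(rng, low, high):
    """Draw a value whose logarithm is spread evenly across [log(low), log(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def share_below(values, threshold):
    """Fraction of the values that fall below the threshold."""
    return sum(v < threshold for v in values) / len(values)


rng = random.Random(0)
uniform_draws = [sample_uniform(rng, 0.001, 0.1) for _ in range(10_000)]
log_draws = [sample_loguniform(rng, 0.001, 0.1) for _ in range(10_000)]

# Compare how much of each sample lands in the lower decade (0.001 to 0.01).
print("uniform share below 0.01:    ", share_below(uniform_draws, 0.01))
print("log-uniform share below 0.01:", share_below(log_draws, 0.01))
```

The uniform draws rarely land in the lower decade, while the log-uniform draws split roughly evenly between the two decades, which is why log-uniform suits parameters one would otherwise try as 0.001, 0.01, 0.1.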

User16826994223
by Honored Contributor III
  • 2514 Views
  • 1 replies
  • 0 kudos

Which file size is better for the target: 1 GB, or 128 MB or less?

Which file size is better for the target: 1 GB, or 128 MB or less? I am interested in understanding the concept too.

Latest Reply
sajith_appukutt
Honored Contributor II
  • 0 kudos

If data is primarily appended to the Delta table and the read ratio is higher than the write ratio, larger file sizes (1 GB) would be ideal. However, if your Delta table undergoes frequent upserts/merges, having smaller files than the default 1GB ...
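As a rough sketch of how a smaller target could be requested for a merge-heavy table: the reply is truncated and does not name a setting, so the `delta.targetFileSize` table property used below is an assumption based on the Databricks file-size tuning documentation. The helper name and the table name in the usage line are hypothetical.

```python
def set_target_file_size_sql(table_name, target_bytes):
    """Build an ALTER TABLE statement that sets a target data file size in bytes."""
    if target_bytes <= 0:
        raise ValueError("target_bytes must be a positive number of bytes")
    # Assumption: 'delta.targetFileSize' is the property that controls this on Databricks.
    return (
        f"ALTER TABLE {table_name} "
        f"SET TBLPROPERTIES ('delta.targetFileSize' = '{target_bytes}')"
    )


# 128 MB expressed in bytes, matching the smaller size discussed in the question.
TARGET_128_MB = 128 * 1024 * 1024
```

In a notebook this might be applied with `spark.sql(set_target_file_size_sql("my_table", TARGET_128_MB))`, after checking that the property is supported on the runtime in use.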

