Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.
Data + AI Summit 2024 - Data Science & Machine Learning

Forum Posts

Anonymous
by Not applicable
  • 1061 Views
  • 1 reply
  • 0 kudos
Latest Reply
KevinC_
New Contributor II
  • 0 kudos

I think I need a little more context here on what you're trying to achieve. If you're generally interested in schema evolution, this post talks about feature store: https://databricks.com/blog/2021/05/27/databricks-announces-the-first-feature-store-i...

eyalwir
by New Contributor
  • 813 Views
  • 0 replies
  • 0 kudos

Deep Learning on Spark within AWS EMR

I'd like to use Deep Learning on Spark within AWS EMR. Is there a best practice or 'recommended' DL framework to run on Spark? It looks like Databricks' spark-deep-learning has been replaced by Horovod. Should this be the first option to consider? If th...

User16790091296
by Contributor II
  • 1179 Views
  • 1 reply
  • 0 kudos
Latest Reply
amr
Databricks Employee
  • 0 kudos

I am not aware of any special requirements for this migration. My suggestion is to try it on a small scale (one notebook) and observe the results in the tracking server; if everything looks OK, migrate the rest.

User16826994223
by Honored Contributor III
  • 2896 Views
  • 1 reply
  • 1 kudos
Latest Reply
User16826994223
Honored Contributor III
  • 1 kudos

If you have configured your Structured Streaming query to use RocksDB as the state store, you can now get better visibility into the performance of RocksDB, with detailed metrics on get/put latencies, compaction latencies, cache hits, and so on. Thes...

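A minimal sketch of how this is typically wired up, assuming a Databricks runtime where the RocksDB provider class is available; the streaming DataFrame df and the query name below are placeholders:

# Switch the state store provider to RocksDB for this session
spark.conf.set(
    "spark.sql.streaming.stateStore.providerClass",
    "com.databricks.sql.streaming.state.RocksDBStateStoreProvider")

# df is assumed to be an existing streaming DataFrame with a stateful operator
query = (df.writeStream
           .format("console")
           .queryName("rocksdb_metrics_demo")
           .outputMode("update")
           .start())

# After a few micro-batches, RocksDB metrics (get/put latencies, compaction,
# cache hits, ...) show up under stateOperators -> customMetrics
progress = query.lastProgress or {}
for op in progress.get("stateOperators", []):
    print(op.get("customMetrics", {}))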
User16826994223
by Honored Contributor III
  • 620 Views
  • 0 replies
  • 1 kudos

docs.databricks.com

Advantage of using Photon Engine

The following summarizes the advantages of Photon:
  • Supports SQL and equivalent DataFrame operations against Delta and Parquet tables.
  • Expected to accelerate queries that process a significant amount of data (100GB+) and ...

User16826994223
by Honored Contributor III
  • 2105 Views
  • 1 reply
  • 2 kudos
Latest Reply
User16826994223
Honored Contributor III
  • 2 kudos

To check whether your workspace has the IP access list feature enabled, call the get feature status API (GET /workspace-conf) and pass keys=enableIpAccessLists as an argument to the request. In the response, the enableIpAccessLists field specifies either true o...

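A minimal sketch of that call using the Python requests library; the workspace URL and token below are placeholders you must supply:

import requests

host = "https://<databricks-instance>"   # e.g. your workspace URL
token = "<personal-access-token>"

resp = requests.get(
    f"{host}/api/2.0/workspace-conf",
    headers={"Authorization": f"Bearer {token}"},
    params={"keys": "enableIpAccessLists"})
resp.raise_for_status()

# Expected response shape: {"enableIpAccessLists": "true"} or "false"
print(resp.json())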
User16826992666
by Valued Contributor
  • 2462 Views
  • 1 reply
  • 0 kudos

Can multiple users collaborate on MLflow experiments?

Wondering about best practices for how to handle collaboration between multiple ML practitioners working on a single experiment. Do we have to share the same notebook between people or is it possible to have individual notebooks going but still work ...

Latest Reply
sajith_appukutt
Honored Contributor II
  • 0 kudos

Yes, multiple users could work on individual notebooks and still use the same experiment via mlflow.set_experiment(). You could also assign different permission levels to experiments from a governance point of view.

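A minimal sketch of that pattern; the shared experiment path here is just an example, and each user would run the same call from their own notebook:

import mlflow

# Both users point their runs at the same workspace experiment
mlflow.set_experiment("/Shared/experiments/churn-model")

with mlflow.start_run(run_name="user_a_baseline"):
    mlflow.log_param("alpha", 0.5)
    mlflow.log_metric("rmse", 0.87)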
User16826992666
by Valued Contributor
  • 2332 Views
  • 1 reply
  • 0 kudos

Resolved! Can I save MLflow artifacts to locations other than the dbfs?

The default location for MLflow artifacts is on DBFS, but I would like to save my models to an alternative location. Is this supported, and if it is, how can I accomplish it?

Latest Reply
sajith_appukutt
Honored Contributor II
  • 0 kudos

You could mount an S3 bucket in the workspace and save your model using the mount's DBFS path. For example:
modelpath = "/dbfs/my-s3-bucket/model-%f-%f" % (alpha, l1_ratio)
mlflow.sklearn.save_model(lr, modelpath)

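Filling in the mount step, a minimal sketch; the bucket name, mount point, and model variables (lr, alpha, l1_ratio) are assumptions carried over from the snippet above, and the mount presumes the cluster already has access to the bucket (e.g. via an instance profile):

import mlflow.sklearn

# One-time mount of the bucket into DBFS
dbutils.fs.mount("s3a://my-s3-bucket", "/mnt/my-s3-bucket")

# Save the model through the FUSE path of the mount
modelpath = "/dbfs/mnt/my-s3-bucket/model-%f-%f" % (alpha, l1_ratio)
mlflow.sklearn.save_model(lr, modelpath)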
MoJaMa
by Databricks Employee
  • 977 Views
  • 1 reply
  • 0 kudos
Latest Reply
MoJaMa
Databricks Employee
  • 0 kudos

Data is stored in the data plane, as Delta tables in the customer's cloud storage. Metadata (e.g., feature table descriptions, column types, etc.) is stored in the control plane. The location where the Delta table is stored is determined by the database location. The customer could call CREATE DATA...

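A minimal sketch of controlling that location from a notebook; the database name and storage path below are hypothetical:

# Create the database that will hold the feature tables at an explicit location,
# so the underlying Delta tables are written to your own cloud storage
spark.sql("""
    CREATE DATABASE IF NOT EXISTS feature_store_db
    LOCATION 's3://my-bucket/feature-store/'
""")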
User16826990884
by New Contributor III
  • 1344 Views
  • 0 replies
  • 1 kudos

Dev and Prod environments

Do we have general guidance around how other customers manage Dev and Prod environments in Databricks? Is it recommended to have separate workspaces for them? What are the pros and cons of using the same workspace with folder or repo level isolation?

User16826994223
by Honored Contributor III
  • 2002 Views
  • 1 reply
  • 0 kudos

Delta Lake MERGE INTO statement error

I'm trying to run a Delta Lake merge:
MERGE INTO source USING updates ON source.d = updates.sessionId WHEN MATCHED THEN UPDATE * WHEN NOT MATCHED THEN INSERT *
I'm getting an SQL error:
ParseException: mismatched input 'MERGE' expecting {'(', 'SELECT', 'FR...

Latest Reply
User16826994223
Honored Contributor III
  • 0 kudos

MERGE SQL support was added in Delta Lake 0.7.0. You also need to upgrade Apache Spark to 3.0.0 and enable the integration with the Apache Spark DataSourceV2 and Catalog APIs.

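For reference, a minimal sketch of the Spark 3.0 session settings that Delta Lake 0.7.0 documents for enabling its SQL commands (on Databricks runtimes this is already configured for you):

from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("delta-merge")
         .config("spark.sql.extensions",
                 "io.delta.sql.DeltaSparkSessionExtension")
         .config("spark.sql.catalog.spark_catalog",
                 "org.apache.spark.sql.delta.catalog.DeltaCatalog")
         .getOrCreate())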
User16765131552
by Contributor III
  • 1540 Views
  • 1 reply
  • 0 kudos

Resolved! Set up a model serving REST endpoint?

I am trying to set up a demo with a really simple Spark ML model, and I see this error repeated over and over in the logs in the serving UI: /databricks/chauffeur/model-runner/lib/python3.6/site-packages/urllib3/connectionpool.py:1020: InsecureRequestW...

Latest Reply
User16765131552
Contributor III
  • 0 kudos

I'm not sure how the containers for each model version work on the endpoints, but it looks like model serving endpoints use a 7.x runtime, so those would be Spark 3.0, not Spark 3.1.

User16826994223
by Honored Contributor III
  • 1520 Views
  • 1 reply
  • 0 kudos

Using vacuum with a dry run in Python for a Delta Lake

I can see an example of how to call the vacuum function for a Delta Lake here: %sql VACUUM delta.`dbfs:/mnt/<myfolder>` DRY RUN. How can I use the same in Python?

Latest Reply
User16826994223
Honored Contributor III
  • 0 kudos

A dry run for non-SQL code is not yet available in Delta version 0.8. There is an open issue for this in the Delta Lake open-source repository on GitHub; hopefully it gets resolved soon.

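A minimal sketch of both options from Python, consistent with the reply above; the table path is the same placeholder used in the question:

from delta.tables import DeltaTable

# The DRY RUN option is currently SQL-only, so from Python it can be issued via spark.sql()
spark.sql("VACUUM delta.`dbfs:/mnt/<myfolder>` DRY RUN")

# The Python API performs the actual vacuum; it has no dry-run flag (as of Delta 0.8)
DeltaTable.forPath(spark, "dbfs:/mnt/<myfolder>").vacuum(168)  # retention in hours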

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group