Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

Forum Posts

michael_wm
by New Contributor II
  • 1544 Views
  • 0 replies
  • 1 kudos

Can display() plots be controlled programmatically?

Is it possible to pass parameters to the display() function to tell it to create a specific type of plot? There are scenarios, such as running notebooks in Databricks Jobs, where it is not possible to use the display() UI to select the plot type and...

thib
by New Contributor III
  • 5624 Views
  • 4 replies
  • 2 kudos

Resolved! Feature Store: for sklearn-flavored models, are timestamps fully supported?

I have created a feature table (Databricks runtime ML 10.2) that includes a timestamp column as a primary key, which is not used as a feature but as a column to join on. I have then created a model that trains from this feature table and some additiona...

Latest Reply
thib
New Contributor III
  • 2 kudos

Hi, it did not, but at least I know they are not fully supported, so a workaround is to avoid timestamps. I suppose you can mark this as resolved.

3 More Replies
njjameson
by New Contributor
  • 4007 Views
  • 1 replies
  • 0 kudos

Resolved! Cluster terminated in Databricks Community Edition

I've tried to start a single cluster 4 times on Databricks Community Edition today (13 March 2022). It's failed every time. Here's the first part of the output summary. Time: 2022-03-13 13:59:14 EDT. Message: Cluster terminated. Reason: Unexpected launch f...

Latest Reply
User16753724663
Databricks Employee
  • 0 kudos

Hi @Noel Jameson, we had some internal service interruptions that caused this issue. Our engineering team has applied the fix, and cluster startup now works as expected. Sincere apologies for the inconvenience caused here. Regards, Darshan

Hubert-Dudek
by Databricks MVP
  • 5424 Views
  • 1 replies
  • 32 kudos

Databricks Roadmap Azure: There are a lot of exciting new features coming in 2022. I tried to put them all on one list: Unity catalog (seems that it ...

Databricks Roadmap Azure. There are a lot of exciting new features coming in 2022. I tried to put them all on one list: Unity Catalog (it seems that it will exist next to the Hive metastore and that it will be possible to migrate); Control metastore, unity creatio...

Verisk
by New Contributor
  • 3042 Views
  • 2 replies
  • 2 kudos

Resolved! DBFS for AutoML

Hi, for AutoML, I see that the data has to reside in DBFS so that AutoML can read it and run on top of it. In my environment, DBFS is locked for security reasons. Is there a workaround or another way to access the data, perhaps from an S3 bucket?

Latest Reply
Atanu
Databricks Employee
  • 2 kudos

@Silky Sharad Shah, please look into this doc, it might help you: https://docs.databricks.com/data/data-sources/aws/amazon-s3.html?&_ga=2.228395418.684786035.1646666830-480220406.1638459894#access-s3-buckets-directly

1 More Replies
trkrishnan
by New Contributor III
  • 3781 Views
  • 2 replies
  • 6 kudos

Resolved! Spark NLP on Databricks - looking for known issues/best practices

I'm currently looking for information on whether Spark NLP runs well on the Databricks platform. Can someone please share:
- known issues/bugs encountered
- any fixes or config settings required in the environment
- best practices to follow

Latest Reply
trkrishnan
New Contributor III
  • 6 kudos

Thanks a lot for the quick response

1 More Replies
Jack_Watson
by Contributor
  • 12419 Views
  • 4 replies
  • 0 kudos

Resolved! I am saving a new feature table to the Databricks feature store, and it won't write the data sources of the tables used to create the feature table, because they are Hive tables that point to Azure Data Lake Storage Gen1 Delta tables

My notebook pulls in Hive tables from DBFS that point to ADLS Gen1 file locations for their data (Delta tables), creates the feature table as a data frame within the notebook, then calls on the feature store client to save down the feature t...

Latest Reply
Atanu
Databricks Employee
  • 0 kudos

@Jack Watson, could you please confirm that the write is succeeding? If so, my understanding is that this is a warning from the validation that saves the data source, and we will likely remove that validation shortly. Thanks.

3 More Replies
User16826988699
by Databricks Employee
  • 31911 Views
  • 2 replies
  • 4 kudos

Resolved! Problem with spinning up a cluster on a new workspace

Error: Please check network connectivity from the data plane to the control plane. { "reason": { "code": "BOOTSTRAP_TIMEOUT", "parameters": { "databricks_error_message": "[id: InstanceId(i-0457092c), status: INSTANCE_INITIALIZING, workerEnvId:...

Latest Reply
User16725394280
Databricks Employee
  • 4 kudos

Can you please get the system logs from the AWS EC2 console as soon as the cluster fails? System logs for the failed instance will be accessible from the AWS console for up to an hour after the shutdown. The AWS console clears the references of terminated clusters ...

1 More Replies
thib
by New Contributor III
  • 3564 Views
  • 3 replies
  • 4 kudos

Resolved! Feature store: Can create_training_set() be implemented to execute an inner join?

For timeseries feature tables, an inner join is made at the creation of the feature table. For the other type of feature tables, a left join is made, so NaN values can show up in the training set. Can the inner join in create_training_set() method be...
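To illustrate the difference the question describes, here is a minimal plain-Python sketch of a left join versus an inner join between a label set and a feature lookup. It does not use the Feature Store API; the names `labels`, `features`, `left_join`, and `inner_join` are hypothetical, and missing values are shown as `None` rather than NaN.

```python
# Hypothetical label rows and feature lookup; id 2 has no features.
labels = [
    {"id": 1, "label": 0},
    {"id": 2, "label": 1},
    {"id": 3, "label": 0},
]
features = {1: {"f": 0.5}, 3: {"f": 0.9}}


def left_join(rows, lookup, key, feature_names):
    """Keep every row; fill features with None when the key has no match."""
    joined = []
    for row in rows:
        match = lookup.get(row[key])
        out = dict(row)
        for name in feature_names:
            out[name] = match[name] if match is not None else None
        joined.append(out)
    return joined


def inner_join(rows, lookup, key, feature_names):
    """Keep only rows whose key has a match in the lookup."""
    matched = [row for row in rows if row[key] in lookup]
    return left_join(matched, lookup, key, feature_names)


left = left_join(labels, features, "id", ["f"])
inner = inner_join(labels, features, "id", ["f"])

print(len(left), len(inner))  # 3 2
print(left[1])                # {'id': 2, 'label': 1, 'f': None}
```

With the left join, the row for id 2 survives with an empty feature value, which is how missing values end up in a training set; the inner join drops that row instead.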

Latest Reply
thib
New Contributor III
  • 4 kudos

Thank you Hubert, that's a good alternative, I just thought I'd stick to the api as much as possible, but this solves it.

2 More Replies
SeanB
by New Contributor II
  • 5337 Views
  • 4 replies
  • 0 kudos

Can you deploy models that can be queried/called/inferred outside your organization?

It looks like you can via MLflow, but I wanted to check before diving deeper. Also, it seems that if it is possible, it's just for small-scale experimentation? Thank you!

Latest Reply
SeanB
New Contributor II
  • 0 kudos

Yes, the question is whether somebody outside Databricks can query/use a model built in Databricks. I assume the answer must be yes?

3 More Replies
Joseph_B
by Databricks Employee
  • 2562 Views
  • 1 replies
  • 0 kudos

What can I do to reduce the number of MLflow API calls I make?

I'm fitting multiple models in parallel. For each one, I'm logging lots of params and metrics to MLflow. I'm hitting rate limits, causing problems in my jobs.

Latest Reply
Joseph_B
Databricks Employee
  • 0 kudos

The first thing to try is to log in batches. If you are logging each param and metric separately, you're making 1 API call per param and 1 per metric. Instead, you should use the batch logging APIs; e.g. use "log_params" instead of "log_param" http...
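A minimal sketch of the batch-logging idea from the reply, assuming the `mlflow` package is installed where `log_run_batched` is called. `mlflow.log_params` and `mlflow.log_metrics` take dictionaries; the helper names, parameter values, and the call-counting function are hypothetical and only illustrate the arithmetic (the client may still split very large dictionaries into several requests).

```python
def log_run_batched(params, metrics):
    """Log all params and all metrics for one run with two logging calls."""
    import mlflow  # imported lazily so the rest of the sketch runs without it

    with mlflow.start_run():
        mlflow.log_params(params)    # instead of one log_param call per key
        mlflow.log_metrics(metrics)  # instead of one log_metric call per key


def count_logging_calls(params, metrics, batched):
    """Count client-side logging calls for one run (illustrative only)."""
    if batched:
        return int(bool(params)) + int(bool(metrics))
    return len(params) + len(metrics)


# Hypothetical values for a single model fit.
params = {"max_depth": 5, "learning_rate": 0.1, "n_estimators": 200}
metrics = {"rmse": 0.42, "mae": 0.31}

print(count_logging_calls(params, metrics, batched=False))  # 5
print(count_logging_calls(params, metrics, batched=True))   # 2
```

When many models are fitted in parallel, the saving multiplies across runs, which is why batching is the first thing to try against rate limits.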

self-employed
by Contributor
  • 3832 Views
  • 1 replies
  • 3 kudos

Resolved! Is the machine learning part of "Apache Spark™ Tutorial: Getting Started with Apache Spark on Databricks" missing or no longer available?

I am following the Apache Spark™ Tutorial. When I finished the data set part and wanted to continue to the machine learning part, I found that the page is empty. The next section after machine learning is fine, so I guess there must be a URL mismatch. The url ...

Latest Reply
self-employed
Contributor
  • 3 kudos

I cleared the cookies and then the link recovered. So it was a cookie issue.

Edmondo
by New Contributor III
  • 3062 Views
  • 0 replies
  • 0 kudos

MLflow and Feature Store: mlflow.spark.autolog, using feature store on Databricks, FeatureStoreClient.log_model()?

As I take my first steps in the Databricks Machine Learning Workspace, I am getting confused by some features that, according to the documentation, seem to overlap. Does autolog for spark on mlflow provide different tracking than using a training set crea...

Saeed
by New Contributor II
  • 8973 Views
  • 2 replies
  • 1 kudos

Resolved! MLflow search runs getting HTTP 429 error

I am facing an issue loading an ML artifact for a specific run when I search the experiment runs to get a specific run_id, as follows: https://www.mlflow.org/docs/latest/rest-api.html#search-runs API request to https://eastus-c3.azuredatabricks.net/api/2....

Latest Reply
sean_owen
Databricks Employee
  • 1 kudos

Yes, you will hit rate limits if you query the API that fast in parallel. Do you just want to manipulate the run data in an experiment with Spark? You can simply load all that data in a DataFrame with spark.read.format("mlflow-experiment").load(...
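A small sketch of the approach in the reply: read the run data once through the Spark data source instead of issuing many search requests in parallel. The reply's `load(` argument is cut off above; passing an experiment ID string is an assumption here, and `load_experiment_runs` and the placeholder ID are hypothetical names.

```python
def load_experiment_runs(spark, experiment_id):
    """Load the runs of one MLflow experiment as a Spark DataFrame.

    `spark` is the SparkSession available in a Databricks notebook.
    `experiment_id` is assumed to be the experiment ID as a string.
    """
    return spark.read.format("mlflow-experiment").load(experiment_id)


# Usage in a Databricks notebook (placeholder ID, not a real experiment):
# runs_df = load_experiment_runs(spark, "<experiment-id>")
# runs_df.printSchema()
```

Filtering for the run of interest then happens on the DataFrame rather than through repeated API calls.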

1 More Replies
Joseph_B
by Databricks Employee
  • 3187 Views
  • 1 replies
  • 0 kudos

For tuning hyperparameters with Apache Spark ML / MLlib, when should I use Spark ML's built-in tuning algorithms vs. Hyperopt?

When should I use Spark ML's CrossValidator or TrainValidationSplit, vs. a separate tuning tool such as Hyperopt?

Latest Reply
Joseph_B
Databricks Employee
  • 0 kudos

Both are valid choices. By default, I'd recommend using Hyperopt nowadays. Here's the rationale, as pros & cons of each.

Spark ML's built-in tools
Pros: These fit the Spark ML Pipeline framework, so you can keep using the same type of APIs.
Cons: Thes...
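For readers who have not used Hyperopt, here is a minimal sketch of what a tuning loop looks like, assuming the `hyperopt` package is installed where `tune` is called. The toy `objective` is hypothetical and stands in for training a model and returning its validation loss; the search space and `max_evals` value are arbitrary.

```python
def objective(params):
    """Toy loss with its minimum at x = 3; a stand-in for a validation loss."""
    x = params["x"]
    return (x - 3.0) ** 2


def tune(max_evals=50):
    """Search for the x that minimizes `objective` using Hyperopt's TPE."""
    from hyperopt import Trials, fmin, hp, tpe  # imported lazily

    space = {"x": hp.uniform("x", -10.0, 10.0)}
    trials = Trials()  # records every evaluated point and its loss
    best = fmin(
        fn=objective,
        space=space,
        algo=tpe.suggest,
        max_evals=max_evals,
        trials=trials,
    )
    return best  # a dict such as {"x": <value close to 3>}


print(objective({"x": 5.0}))  # 4.0
```

In a real job, `objective` would fit the Spark ML or single-node model with the sampled hyperparameters and return the metric to minimize.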
