- 1544 Views
- 0 replies
- 1 kudos
Is it possible to pass parameters to the display() function to tell it to create a specific type of plot? There are scenarios, such as running notebooks in Databricks Jobs, where it is not possible to use the display() UI to select the plot type and...
by
thib
• New Contributor III
- 5624 Views
- 4 replies
- 2 kudos
I have created a feature table (Databricks runtime ML 10.2) that includes a timestamp column as a primary key, that is not used as a feature but as a column to join on. I have then created a model that trains from this feature table and some additiona...
Latest Reply
Hi, it did not, but at least I know they are not fully supported, so a workaround is to avoid timestamps. I suppose you can mark this as resolved.
3 More Replies
- 4007 Views
- 1 replies
- 0 kudos
I've tried to start a single cluster 4 times on Databricks Community Edition today (13 March 2022). It's failed every time. Here's the first part of the output summary. Time: 2022-03-13 13:59:14 EDT. Message: Cluster terminated. Reason: Unexpected launch f...
Latest Reply
Hi @Noel Jameson, we had some internal service interruptions, which caused this issue. Our engineering team has applied the fix, and cluster startup now works as expected. Sincere apologies for the inconvenience caused. Regards, Darshan
- 5424 Views
- 1 replies
- 32 kudos
Databricks Roadmap Azure: There are a lot of exciting new features coming in 2022. I tried to put them all on one list: Unity Catalog (it seems that it will exist next to the Hive metastore and it will be possible to migrate); control metastore, unity creatio...
- 3040 Views
- 2 replies
- 2 kudos
Hi, for AutoML, I see that the data has to reside in DBFS for AutoML to read it and run on top of it. In my environment, DBFS is locked for security reasons. Is there a workaround or another way to access the data, for example from an S3 bucket?
Latest Reply
Atanu
Databricks Employee
@Silky Sharad Shah Please look into this doc, which might help you: https://docs.databricks.com/data/data-sources/aws/amazon-s3.html?&_ga=2.228395418.684786035.1646666830-480220406.1638459894#access-s3-buckets-directly
1 More Replies
- 3780 Views
- 2 replies
- 6 kudos
I'm currently looking for information on whether Spark NLP runs well on the Databricks platform. Can someone please share: known issues/bugs encountered; any fixes or config settings required in the environment; best practices to follow?
- 12417 Views
- 4 replies
- 0 kudos
My notebook is pulling in Hive tables from DBFS, that point to ADLS Gen1 file locations for their data (Delta tables), creating the feature table as a data frame within the notebook, then calling on the feature store client to save down the feature t...
Latest Reply
Atanu
Databricks Employee
@Jack Watson Could you please confirm the write is succeeding? If yes, as per my understanding, this is a warning from some validation that we will be removing shortly. We'll likely remove the validation that saves the data source. Thanks.
3 More Replies
- 31911 Views
- 2 replies
- 4 kudos
Error: Please check network connectivity from the data plane to the control plane. { "reason": { "code": "BOOTSTRAP_TIMEOUT", "parameters": { "databricks_error_message": "[id: InstanceId(i-0457092c), status: INSTANCE_INITIALIZING, workerEnvId:...
Latest Reply
Can you please get the system logs from the AWS EC2 console as soon as the cluster fails? System logs for the failed instance are accessible from the AWS console for up to an hour after the shutdown. The AWS console clears the references of terminated clusters ...
1 More Replies
by
thib
• New Contributor III
- 3564 Views
- 3 replies
- 4 kudos
For timeseries feature tables, an inner join is made at the creation of the feature table. For the other type of feature tables, a left join is made, so NaN values can show up in the training set. Can the inner join in create_training_set() method be...
Latest Reply
Thank you Hubert, that's a good alternative, I just thought I'd stick to the api as much as possible, but this solves it.
2 More Replies
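To illustrate the difference the question describes, here is a minimal sketch in plain Python, not the Feature Store API. The helper `join_features`, the `id` key, and the sample rows are all hypothetical; `None` stands in for NaN. It only shows why a left join keeps label rows with missing features while an inner join drops them.

```python
def join_features(labels, features, how="left"):
    """Join label rows to feature rows on the hypothetical key 'id'.

    labels:   list of dicts, each with an 'id' key
    features: dict mapping id -> dict of feature values
    how:      'left' keeps unmatched label rows, 'inner' drops them
    """
    feature_names = sorted({name for row in features.values() for name in row})
    joined = []
    for label in labels:
        match = features.get(label["id"])
        if match is None:
            if how == "inner":
                continue  # inner join: no matching feature row, so drop the label row
            match = {}    # left join: keep the row, features will be missing
        row = dict(label)
        for name in feature_names:
            row[name] = match.get(name)  # None plays the role of NaN here
        joined.append(row)
    return joined


# Made-up sample data: id 2 has no feature row.
labels = [{"id": 1, "y": 0}, {"id": 2, "y": 1}, {"id": 3, "y": 0}]
features = {1: {"f": 0.5}, 3: {"f": 1.5}}

left_rows = join_features(labels, features, how="left")
inner_rows = join_features(labels, features, how="inner")
```

In this sketch, filtering the left-joined rows for missing features gives the same rows as the inner join, which is one way to emulate inner-join behavior after the fact.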
by
SeanB
• New Contributor II
- 5337 Views
- 4 replies
- 0 kudos
It looks like you can via MLflow, but I wanted to check before diving deeper. Also, it seems that if it is possible, it's just for small-scale experimentation? Thank you!
Latest Reply
Yes, if the question is whether somebody outside Databricks can query/use a model built in Databricks. I assume the answer must be yes?
3 More Replies
- 2561 Views
- 1 replies
- 0 kudos
I'm fitting multiple models in parallel. For each one, I'm logging lots of params and metrics to MLflow. I'm hitting rate limits, causing problems in my jobs.
Latest Reply
The first thing to try is to log in batches. If you are logging each param and metric separately, you're making 1 API call per param and 1 per metric. Instead, you should use the batch logging APIs; e.g. use "log_params" instead of "log_param" http...
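A minimal sketch of the batching idea from the reply above. It assumes MLflow's fluent `mlflow.log_params(dict)` and `mlflow.log_metrics(dict)` calls and an active run; the helper `log_run` and the `RecordingTracker` stand-in are hypothetical names added for illustration, so the call count can be checked without a tracking server.

```python
def log_run(params, metrics, tracker=None):
    """Log all params and all metrics with one batch call each.

    Logging key by key would cost len(params) + len(metrics) calls;
    this costs two calls from the caller's side.
    """
    if tracker is None:
        # Assumption: MLflow is installed and a run is already active.
        import mlflow
        tracker = mlflow
    tracker.log_params(params)    # one batched call for every param
    tracker.log_metrics(metrics)  # one batched call for every metric


class RecordingTracker:
    """Hypothetical dry-run stand-in that records calls instead of sending them."""

    def __init__(self):
        self.calls = []

    def log_params(self, params):
        self.calls.append(("log_params", dict(params)))

    def log_metrics(self, metrics):
        self.calls.append(("log_metrics", dict(metrics)))


# Made-up values for illustration only.
params = {"max_depth": 5, "learning_rate": 0.1, "n_estimators": 200}
metrics = {"rmse": 0.42, "mae": 0.31}

tracker = RecordingTracker()
log_run(params, metrics, tracker=tracker)
```

In a real job you would call `log_run(params, metrics)` inside `with mlflow.start_run():` and leave `tracker` unset.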
- 3828 Views
- 1 replies
- 3 kudos
I am following the Apache Spark™ Tutorial. When I finished the data set part and wanted to continue to the machine learning part, I found the page was empty. The next section after machine learning is fine, so I guess there must be a URL mismatch. The url ...
Latest Reply
I cleared the cookies and then the link recovered, so it was a cookie issue.
- 3062 Views
- 0 replies
- 0 kudos
As I am taking my first steps within the Databricks Machine Learning Workspace, I am getting confused by some features that, according to the documentation, seem to overlap. Does autolog for Spark on MLflow provide different tracking than using a training set crea...
by
Saeed
• New Contributor II
- 8972 Views
- 2 replies
- 1 kudos
I am facing an issue loading an ML artifact for a specific run when searching the experiment runs to get a specific run_id, as follows: https://www.mlflow.org/docs/latest/rest-api.html#search-runs API request to https://eastus-c3.azuredatabricks.net/api/2....
Latest Reply
Yes, you will hit rate limits if you try to query the API that fast in parallel. Do you just want to manipulate the run data in an experiment with Spark? You can simply load all that data in a DataFrame with spark.read.format("mlflow-experiment").load(...
1 More Replies
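If the API does have to be called repeatedly, one common way to cope with the rate limits mentioned in the reply is to retry with exponential backoff. This is a generic sketch, not something from the thread: `RateLimitError`, `call_with_backoff`, and `flaky_search` are hypothetical names, and the delays are arbitrary defaults.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for an HTTP 429 response from the tracking API."""


def call_with_backoff(fn, max_retries=5, base_delay=0.5, sleep=time.sleep):
    """Call fn(), retrying on RateLimitError with exponential backoff plus jitter."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries:
                raise  # out of retries: surface the error to the caller
            # Delay doubles each attempt; jitter spreads out parallel workers.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)


# Demo with a fake request that is rate limited twice, then succeeds.
attempts = {"count": 0}


def flaky_search():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RateLimitError("429 Too Many Requests")
    return ["run-a", "run-b"]  # made-up run IDs


recorded_delays = []
# Passing a recorder as `sleep` keeps the demo instant; omit it to really wait.
result = call_with_backoff(flaky_search, sleep=recorded_delays.append)
```

Loading the experiment data into a DataFrame, as the reply suggests, avoids the per-run API calls altogether, so backoff is only a fallback.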
- 3187 Views
- 1 replies
- 0 kudos
When should I use Spark ML's CrossValidator or TrainValidationSplit, vs. a separate tuning tool such as Hyperopt?
Latest Reply
Both are valid choices. By default, I'd recommend using Hyperopt nowadays. Here's the rationale, as pros & cons of each. Spark ML's built-in tools. Pros: These fit the Spark ML Pipeline framework, so you can keep using the same type of APIs. Cons: Thes...