Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.
Data + AI Summit 2024 - Data Science & Machine Learning

Forum Posts

Spencer_Kent
by New Contributor III
  • 2407 Views
  • 2 replies
  • 1 kudos

Resolved! Lacking support for column-level select grants or attribute-based access control

In the Unity Catalog launch and its accompanying blog post, one of the primary selling points was a set of granular access control features that would at least partially eliminate the need to create a multitude of separate table views and the attenda...

Latest Reply
Spencer_Kent
New Contributor III
  • 1 kudos

Simply amazing that two years on from the initial announcement, this feature is not available. You released Unity Catalog missing one of its most-hyped features.

1 More Replies
karthik_p
by Esteemed Contributor
  • 3448 Views
  • 6 replies
  • 2 kudos

Getting a forbidden error in AWS when trying to create a folder/file or list files using dbutils

Hi Team, we have created a new premium workspace with a customer-managed VPC, and the workspace deployed successfully in AWS. When we try to create a folder in DBFS, we get the below error. We have compared the cross-account custom managed role (Customer-managed VP...

Latest Reply
karthik_p
Esteemed Contributor
  • 2 kudos

@Debayan Mukherjee Issue resolved. It looks like the cloud team had not updated the required security groups that were shared; after revisiting them, we found the missing security groups and added them.

5 More Replies
ammarchalifah
by New Contributor
  • 3177 Views
  • 1 reply
  • 0 kudos

DeltaFileNotFoundException in a multi-cluster conflict

I have several parallel data pipelines running in different Airflow DAGs. All of these pipelines execute two dbt selectors in a dedicated Databricks cluster; one of them is a common selector executed in all DAGs. This selector includes a test that is d...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Ammar Ammar: The error message you're seeing suggests that the Delta Lake transaction log for the common model's test table has been truncated or deleted, either manually or due to the retention policies set in your cluster. This can happen if the ...
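
Where retention turns out to be the culprit, the relevant knobs are the Delta table properties that control log and tombstone retention. A minimal sketch, assuming a Databricks notebook where `spark` is available; the table name and intervals below are placeholders, not values from this thread:

```python
# Hypothetical example: lengthen Delta log / deleted-file retention so that
# concurrent pipelines reading older table versions are less likely to hit
# missing transaction-log files. Table name and intervals are illustrative only.
spark.sql("""
    ALTER TABLE my_catalog.my_schema.common_model_test
    SET TBLPROPERTIES (
        'delta.logRetentionDuration' = 'interval 30 days',
        'delta.deletedFileRetentionDuration' = 'interval 14 days'
    )
""")
```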

DK
by New Contributor II
  • 1678 Views
  • 1 reply
  • 1 kudos

Unable to call logged ML model from a different notebook when using Spark ML

Hi, I am an R user and I am experimenting with building an ML model in R using Spark-flavoured algorithms in Databricks. However, I am struggling to call a model that is logged as part of the experiment from a different notebook when I use spark flavo...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

@Dip Kundu: It seems like the error you are facing is related to sparklyr, which is used to interact with Apache Spark from R, and not directly related to mlflow. The error message suggests that an object could not be found, but it's not clear which...

Anonymous
by Not applicable
  • 1459 Views
  • 1 reply
  • 1 kudos

Hive Catalog DDL: DESCRIBE EXTENDED returns "... n more fields" when detailing a many-column array<struct<

I am using the Hackolade data modelling tool to reverse engineer (using a cluster connection) deployed databases and their table and view definitions. Some of our tables contain large multi-column structs, and these can only be partially described as a char...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Yes, it is possible to configure the Hive Catalog in Databricks to return full descriptions of tables with large multi-column structs. One way to achieve this is to increase the value of the Hive configuration property "hive.metastore.client.record.ma...
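
As an alternative to tuning what DESCRIBE prints, the complete nested schema can also be pulled programmatically from the DataFrame API. A minimal sketch, assuming a Databricks notebook where `spark` is available; the table name is a placeholder:

```python
# Hypothetical example: read the full nested schema (no "... n more fields"
# truncation) directly from the DataFrame rather than via DESCRIBE EXTENDED.
# The table name is a placeholder.
df = spark.table("my_db.my_wide_struct_table")

# Human-readable tree of every field, including deeply nested struct members
df.printSchema()

# Machine-readable form, e.g. for feeding into a modelling tool
schema_json = df.schema.json()
print(schema_json)
```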

thomasm
by New Contributor II
  • 3314 Views
  • 3 replies
  • 1 kudos

Resolved! Online Feature Store MLflow serving problem

When I try to serve a model stored with FeatureStoreClient().log_model using the feature-store-online-example-cosmosdb tutorial Notebook, I get errors suggesting that the primary key schema is not configured properly. However, if I look in the Featur...

Latest Reply
NandiniN
Databricks Employee
  • 1 kudos

Hello @Thomas Michielsen, this error typically occurs when you have created the table yourself. You must use publish_table() to create the table in the online store. Do not manually create a database or container inside Cosmos DB. publish_table()...
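
For reference, a minimal sketch of publishing a feature table to the Cosmos DB online store with publish_table(), following the pattern in the tutorial notebook; all names, URIs, and secret prefixes below are placeholders:

```python
# Hypothetical example: publish an offline feature table to an online Cosmos DB
# store with publish_table(), so the container and primary-key schema are created
# by the Feature Store client rather than by hand. All names are placeholders.
from databricks.feature_store import FeatureStoreClient
from databricks.feature_store.online_store_spec import AzureCosmosDBSpec

fs = FeatureStoreClient()

online_store = AzureCosmosDBSpec(
    account_uri="https://my-account.documents.azure.com:443/",
    write_secret_prefix="my-scope/cosmosdb-write",
    read_secret_prefix="my-scope/cosmosdb-read",
)

fs.publish_table(
    name="feature_store_db.user_features",  # existing offline feature table
    online_store=online_store,
    mode="merge",
)
```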

2 More Replies
lurban
by New Contributor
  • 1245 Views
  • 1 reply
  • 0 kudos

CloudFilesIllegalStateException: Found mismatched event: key old_file_path doesn't have the prefix: new_file_path

My team currently uses Autoloader and Delta Live Tables to process incremental data from ADLS storage. We need to keep the same table and history, but switch the file path to a different location in storage. When I test a file path change, I rec...

Latest Reply
DD_Sharma
New Contributor III
  • 0 kudos

Autoloader doesn't support changing the source path for a running job, so if you change your source path the stream fails because the source path has changed. However, if you really want to change the path, you can do so by using a new checkpoint ...
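
A minimal sketch of restarting the stream against the new location with a fresh checkpoint (paths, file format, and table name are placeholders; note this re-discovers files under the new prefix rather than carrying over the old Autoloader ingestion state):

```python
# Hypothetical example: point Autoloader at the new storage path with a NEW
# checkpoint directory while writing into the same existing Delta table.
# Paths, format and table name are placeholders.
new_source_path = "abfss://container@account.dfs.core.windows.net/new/landing/path"
new_checkpoint = "abfss://container@account.dfs.core.windows.net/checkpoints/my_table_v2"

df = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", new_checkpoint)
    .load(new_source_path)
)

(
    df.writeStream
    .option("checkpointLocation", new_checkpoint)
    .trigger(availableNow=True)
    .toTable("my_catalog.my_schema.my_table")  # existing table and history are kept
)
```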

ryojikn
by New Contributor III
  • 4331 Views
  • 2 replies
  • 0 kudos

How to use spark-submit python task with the usage of --archives parameter passing a .tar.gz conda env?

We've been trying to launch a spark-submit Python task using the parameter "archives", similar to the one used in YARN. However, we've not been able to successfully make it work in Databricks. We know that for our on-prem installation we can use som...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Ryoji Kuwae Neto: To use the --archives parameter with a conda environment in Databricks, you can follow these steps: 1) Create a conda environment for your project and export it as a .tar.gz file: conda create --name myenv, conda activate myenv, conda...

1 More Replies
Vish1
by New Contributor II
  • 9158 Views
  • 3 replies
  • 1 kudos

pyspark: Stage failure due to one-hot encoding

I am facing the below error while fitting my model. I am trying to run a model with cross-validation with a pipeline inside of it. Below is the code snippet for data transformation: qd = QuantileDiscretizer(relativeError=0.01, handleInvalid="error", n...
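
For context, a minimal sketch of the kind of discretize, one-hot-encode, and cross-validate pipeline described above; the column names, bucket count, and the estimator are assumptions, not taken from the post:

```python
# Hypothetical sketch of a QuantileDiscretizer + OneHotEncoder pipeline wrapped
# in cross-validation. Column names, parameters and estimator are placeholders.
from pyspark.ml import Pipeline
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.evaluation import BinaryClassificationEvaluator
from pyspark.ml.feature import OneHotEncoder, QuantileDiscretizer, VectorAssembler
from pyspark.ml.tuning import CrossValidator, ParamGridBuilder

qd = QuantileDiscretizer(
    inputCol="amount", outputCol="amount_bucket",
    numBuckets=10, relativeError=0.01, handleInvalid="error",
)
ohe = OneHotEncoder(inputCols=["amount_bucket"], outputCols=["amount_ohe"])
assembler = VectorAssembler(inputCols=["amount_ohe"], outputCol="features")
lr = LogisticRegression(labelCol="label", featuresCol="features")

pipeline = Pipeline(stages=[qd, ohe, assembler, lr])
grid = ParamGridBuilder().addGrid(lr.regParam, [0.01, 0.1]).build()
cv = CrossValidator(
    estimator=pipeline,
    estimatorParamMaps=grid,
    evaluator=BinaryClassificationEvaluator(labelCol="label"),
    numFolds=3,
)

# cv_model = cv.fit(train_df)  # train_df is a placeholder training DataFrame
```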

Latest Reply
shyam_9
Databricks Employee
  • 1 kudos

Hi @Vishnu P, could you please share the full stack trace? Also, could you check how the workers' memory is being utilized?

2 More Replies
Cristianmarja
by New Contributor
  • 669 Views
  • 1 reply
  • 0 kudos

Hi everyone, please note that I am stuck on exercise 2.0 Train and Validate ML Model because when I run the code a NameError appears with the following label...

Hi everyone, please note that I am stuck on exercise 2.0 Train and Validate ML Model because when I run the code a NameError appears with the following label: name 'DoubleType' is not defined. I put the code below for your reference. I would like any help ab...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Cristian Martinez: The error you are seeing occurs because the DoubleType class has not been imported. To fix this, add the following line to the top of your code to import DoubleType: from pyspark.sql.types import DoubleType. This should resolv...
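
That is, assuming the exercise uses DoubleType in a cast or schema, something like the following; the column name is illustrative, not from the exercise:

```python
# The missing import: DoubleType lives in pyspark.sql.types.
from pyspark.sql.types import DoubleType

# Illustrative use only; df and the column name are placeholders.
df = df.withColumn("prediction", df["prediction"].cast(DoubleType()))
```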

invalidargument
by New Contributor III
  • 924 Views
  • 1 reply
  • 0 kudos

Model storage requirements management

Hi. We have around 30 models in model storage that we use for batch scoring. These were created at different times, by different people, and on different cluster runtimes. Now we have run into problems where we can't de-serialize the models and use them for in...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Jonas Lindberg: To address the issues you are facing with model serialization and versioning, I would recommend the following approach: use MLflow to manage the lifecycle of your models, including versioning, deployment, and monitoring. MLflow is an...
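
A minimal sketch of the kind of MLflow logging this implies, with dependencies pinned at log time so that later deserialization doesn't depend on whichever runtime the model happened to be trained on; the model, names, and versions are placeholders:

```python
# Hypothetical example: log and register a model with pinned dependencies so it
# can be reloaded consistently for batch scoring. Names/versions are placeholders.
import mlflow
import mlflow.sklearn
from sklearn.linear_model import LogisticRegression

model = LogisticRegression().fit([[0.0], [1.0]], [0, 1])  # toy stand-in model

with mlflow.start_run():
    mlflow.sklearn.log_model(
        sk_model=model,
        artifact_path="model",
        registered_model_name="batch_scoring_model",
        pip_requirements=["scikit-learn==1.3.2", "mlflow==2.9.2"],
    )

# Later, load a specific registered version for batch scoring
loaded = mlflow.pyfunc.load_model("models:/batch_scoring_model/1")
```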

Cristianmarja
by New Contributor
  • 934 Views
  • 1 reply
  • 0 kudos

2.0 Train and Validate ML Model - Exercise / Double Type is not defined

Hi everyone, please note that I am stuck on exercise 2.0 Train and Validate ML Model because when I run the code a NameError appears with the following label: name 'DoubleType' is not defined. I would like any help about this subject.

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Cristian Martinez: In Databricks, you need to import the necessary classes from the pyspark.sql.types module in order to use them in your code. To fix the NameError you're encountering with the label "name 'DoubleType' is not defined" in Exercise 2...

Orianh
by Valued Contributor II
  • 2449 Views
  • 1 reply
  • 2 kudos

MLflow log pytorch distributed training

Hey guys, I have a few questions that I hope you can help me with. I started to train a PyTorch model with distributed training using petastorm + Horovod, as Databricks suggests in the docs. Q1: I can see that each worker is training the model, but when the epochs are done...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

@orian hindi: Regarding your questions: Q1: The error message you are seeing is likely related to a segmentation fault, which can occur due to various reasons such as memory access violations or stack overflows. It could be caused by several factors,...
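
On the MLflow side of the question, a common pattern is to log only from rank 0 so that the workers don't race on the same run. A minimal sketch, assuming a Horovod training function; the metric name and training loop are placeholders:

```python
# Hypothetical sketch: inside a Horovod training function, only the rank-0
# worker talks to MLflow so a single run collects the metrics.
import horovod.torch as hvd
import mlflow


def train_one_epoch(epoch):
    # ... distributed PyTorch training step; placeholder for the real loop ...
    return 0.123  # placeholder loss value


def training_fn(num_epochs=3):
    hvd.init()
    for epoch in range(num_epochs):
        loss = train_one_epoch(epoch)
        if hvd.rank() == 0:  # only the chief worker logs to MLflow
            mlflow.log_metric("train_loss", loss, step=epoch)
```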

Anonymous
by Not applicable
  • 1206 Views
  • 2 replies
  • 3 kudos

www.databricks.com

Hello Dolly: Democratizing the magic of ChatGPT with open models. Databricks has just released a groundbreaking new blog post introducing Dolly, an open-source language model with the potential to transform the way we interact with technology. From cha...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Let's get candid! Let me know your initial thoughts about LLMs, ChatGPT, and Dolly.

1 More Replies
Idan
by New Contributor II
  • 2416 Views
  • 2 replies
  • 1 kudos

Using code_path in mlflow.pyfunc models on Databricks

We are using Databricks over AWS infra, registering models on mlflow. We write our in-project imports as from src.(module location) import (objects). Following examples online, I expected that when I use mlflow.pyfunc.log_model(...code_path=['PROJECT_...
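
For reference, a minimal sketch of how code_path is typically passed so the project package is bundled with the pyfunc model; the paths, wrapper class, and src.scoring import are placeholders, not the poster's actual project layout:

```python
# Hypothetical sketch: bundling in-project code with a pyfunc model via code_path
# so "from src...." imports resolve at load/serving time. Paths are placeholders.
import mlflow
import mlflow.pyfunc


class MyWrapper(mlflow.pyfunc.PythonModel):
    def predict(self, context, model_input):
        # Placeholder: delegate to whatever lives under src/
        from src.scoring import score  # resolves because src/ is in code_path
        return score(model_input)


with mlflow.start_run():
    mlflow.pyfunc.log_model(
        artifact_path="model",
        python_model=MyWrapper(),
        code_path=["src"],  # copies the local src/ package into the model artifact
    )
```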

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Idan Reshef Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers y...

1 More Replies

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.
