cancel
Showing results for 
Search instead for 
Did you mean: 
Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.
cancel
Showing results for 
Search instead for 
Did you mean: 
Data + AI Summit 2024 - Data Science & Machine Learning

Forum Posts

Siebert_Looije
by Contributor
  • 859 Views
  • 1 replies
  • 1 kudos

What is the best way to deal with pymc3 in MLFLOW models in databricks?

Last week, we started with using mlflow within databricks. The bayesian models that we are using right now are the pymc3 models (https://docs.pymc.io/en/v3/index.html).We could use the experiment feature of databricks/mlflow to save the models as an ...

  • 859 Views
  • 1 replies
  • 1 kudos
Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Siebert Looije​ Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question first. Or else bricksters will get back to you soon. Thanks.

  • 1 kudos
jonathan-dufaul
by Valued Contributor
  • 774 Views
  • 0 replies
  • 1 kudos

is it possible to change the boilerplate code on a logged/saved pyfunc mlflow model?

When I log a pyfunc mlflow model, it generates a page that has this helpful code for using the model in production. Make Predictions Predict on a Spark DataFrame: import mlflow from pyspark.sql.functions import struct, col logged_model = 'runs:/1d......

  • 774 Views
  • 0 replies
  • 1 kudos
Slalom_Tobias
by New Contributor III
  • 2253 Views
  • 3 replies
  • 3 kudos

Resolved! ML Practioner | ML 11 - XGBoost notebook | cannot import keras.applications.resnet50

the following code...from sparkdl.xgboost import XgboostRegressorfrom pyspark.ml import Pipelineparams = {"n_estimators": 100, "learning_rate": 0.1, "max_depth": 4, "random_state": 42, "missing": 0}xgboost = XgboostRegressor(**params)pipeline = Pipel...

  • 2253 Views
  • 3 replies
  • 3 kudos
Latest Reply
Prabakar
Esteemed Contributor III
  • 3 kudos

You need to choose the runtime for ML instead of the standard.

  • 3 kudos
2 More Replies
elgeo
by Valued Contributor II
  • 1228 Views
  • 1 replies
  • 1 kudos

Pass parameter from python to SQL - Null result

Hello. Could someone please explain at the below example, why having the prefix "da" at the parameter name allows us to select the parameter value but not having it returns to a null value?Correct valueNull value Thank you in advance

UDF_OK UDF_NOT_OK
  • 1228 Views
  • 1 replies
  • 1 kudos
Latest Reply
elgeo
Valued Contributor II
  • 1 kudos

Any insight on this? Thank you!

  • 1 kudos
anthonylavado
by New Contributor III
  • 1605 Views
  • 3 replies
  • 7 kudos

Can't Add Cluster-scoped Init Script to Model Serving Cluster

Similar to this other question: https://community.databricks.com/s/question/0D58Y00008hahwuSAA/cant-edit-the-cluster-created-by-mlflow-model-servingWe're using Azure Databricks, and have a model that requires a WHL to be downloaded from a private add...

  • 1605 Views
  • 3 replies
  • 7 kudos
Latest Reply
939772
New Contributor III
  • 7 kudos

Has anyone had success with this? Trying to solve a resolve issue.

  • 7 kudos
2 More Replies
ianchenmu
by New Contributor II
  • 3026 Views
  • 5 replies
  • 7 kudos

Parallelization in training machine learning models using MLFlow

I'm training a ML model (e.g., XGboost) and I have a large combination of 5 hyperparameters, say each parameter has 5 candidates, it will be 5^5 = 3,125 combos.Now I want to do parallelization for the grid search on all the hyperparameter combos for ...

  • 3026 Views
  • 5 replies
  • 7 kudos
Latest Reply
Anonymous
Not applicable
  • 7 kudos

Hi @Chen Mu​ Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you.Thanks!

  • 7 kudos
4 More Replies
Mado
by Valued Contributor II
  • 9248 Views
  • 5 replies
  • 17 kudos

Resolved! Error when reading Excel file: "java.lang.NoClassDefFoundError: shadeio/poi/schemas/vmldrawing/XmlDocument"

Hi, I want to read an Excel file by:filepath_xlsx = "dbfs:/FileStore/data.xlsx"       sampleDF = (spark.read.format("com.crealytics.spark.excel")   .option("Header", "true")   .option("inferSchema", "false")   .option("treatEmptyValuesAsNulls", ...

  • 9248 Views
  • 5 replies
  • 17 kudos
Latest Reply
Mado
Valued Contributor II
  • 17 kudos

For this dataset, I also tried binary file reading as below: xldf_xlsx = ( spark.read.format("binaryFile") .option("pathGlobFilter", "*.xls*") .load(filepath_xlsx) )   excel_content = xldf_xlsx.head(1)[0].content file_like_obj = io.BytesIO(excel...

  • 17 kudos
4 More Replies
Mado
by Valued Contributor II
  • 3105 Views
  • 1 replies
  • 4 kudos

Resolved! Error when reading Excel file: "org.apache.poi.ooxml.POIXMLException: Strict OOXML isn't currently supported, please see bug #57699"

Hi,I want to read an Excel "xlsx" file. The excel file has several sheets and multi-row header. The original file format was "xlsm" and I changed the extension to "xlsx". I try the following code:filepath_xlsx = "dbfs:/FileStore/Sample_Excel/data.xl...

  • 3105 Views
  • 1 replies
  • 4 kudos
Latest Reply
Kaniz_Fatma
Community Manager
  • 4 kudos

Hi @Mohammad Saber​, The error says, Don't save your spreadsheet in "strict OOXML" format.For example, in Excel use.Save As --> "Excel Workbook (.xlsx)" instead ofSave As --> "Strict Open XML Spreadsheet (.xlsx)"

  • 4 kudos
THIAM_HUATTAN
by Valued Contributor
  • 1062 Views
  • 3 replies
  • 1 kudos

medium.datadriveninvestor.com

say, I want to download 2 files from this directory (dbfs:/databricks-datasets/abc-quality/") to my local filesystem, how do I do it?I understand that if those files are inside FileStore directory, it is much straightforward, which someone posts some...

  • 1062 Views
  • 3 replies
  • 1 kudos
Latest Reply
Pat
Honored Contributor III
  • 1 kudos

Hi @THIAM HUAT TAN​ ,isn't this dbfs://databricks-datasets Databricks owned s3:// mounted to the workspace? You got an error - 403 access denied to PUT files into the s3 bucket: https://databricks-datasets-oregon.s3.us-west-2.amazonaws.com you shoul...

  • 1 kudos
2 More Replies
NSRBX
by Contributor
  • 1866 Views
  • 2 replies
  • 4 kudos

Feature Store - Feature Lookup Engine with join on partial key and Filter

Hello ,I am working with lookupEngine functions. However, we have some feature tables with granularity level most detailled of dataframe input.Please find an example :table A with unique keys on two features : numero_p, numero_s So while performing F...

  • 1866 Views
  • 2 replies
  • 4 kudos
Latest Reply
Debayan
Esteemed Contributor III
  • 4 kudos

Hi @SERET Nathalie​ , I can check internally on the ask here. In the meantime please let us know if this helps: https://docs.databricks.com/machine-learning/feature-store/feature-tables.htmlhttps://docs.databricks.com/machine-learning/feature-store/i...

  • 4 kudos
1 More Replies
Leodatabricks
by Contributor
  • 12678 Views
  • 2 replies
  • 2 kudos

Reason: Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages.

I have been getting this error sporadically. I'm loading a dataset and training a model using the dataset in notebook. Sometimes it works and sometimes it doesn't. I have seen similar posts and tried all solutions mentioned, log output size limit, sp...

  • 12678 Views
  • 2 replies
  • 2 kudos
Latest Reply
karthik_p
Esteemed Contributor
  • 2 kudos

@Leo Bao​ Are you seeing this issue whenever you are getting different sizes of data sets, or your data set size is same. if issue you are seeing is due to larger dataset, please check below link and try to increase partition size Databricks Spark Py...

  • 2 kudos
1 More Replies
Raymond_Garcia
by Contributor II
  • 1979 Views
  • 1 replies
  • 1 kudos

Resolved! Problem with Autoloader, S3, and wildcard

Hello, I have an autoloader code and it is pretty standard, we have this variable file path that points to an S3 bucket. example #2 executed successfully and example 1 throws an exception.it seems like source 1 always throws an exception whereas sour...

  • 1979 Views
  • 1 replies
  • 1 kudos
Latest Reply
Raymond_Garcia
Contributor II
  • 1 kudos

The error was more related to a lot of stuff on the AWS side, so we deleted and cleared the SQS and SNS. we also used the CloudFilesAWSResourceManagerval manager = CloudFilesAWSResourceManager .newManager .option("path", filePath) .create...

  • 1 kudos
elgeo
by Valued Contributor II
  • 4642 Views
  • 2 replies
  • 3 kudos

Resolved! String data type - Max number of chars

Hello. Could anyone please confirm the maximum number of characters for sql string data type? Thank you in advance

  • 4642 Views
  • 2 replies
  • 3 kudos
Latest Reply
Kaniz_Fatma
Community Manager
  • 3 kudos

Hi @ELENI GEORGOUSI​, The size parameter specifies the column length in characters - can be from 0 to 255. Default is 1. Hope this answer helps you.

  • 3 kudos
1 More Replies
Join 100K+ Data Experts: Register Now & Grow with Us!

Excited to expand your horizons with us? Click here to Register and begin your journey to success!

Already a member? Login and join your local regional user group! If there isn’t one near you, fill out this form and we’ll create one for you to join!

Labels