Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

Forum Posts

NhatHoang
by Valued Contributor II
  • 7210 Views
  • 3 replies
  • 15 kudos

Resolved! Should one-hot encoding (OHE) be done before or after splitting data into train and test DataFrames?

Hi, I wonder whether I should do OHE before or after I split the data to build an ML model. Please give some advice.

Latest Reply
LandanG
Databricks Employee
  • 15 kudos

Hi @Nhat Hoang, while not Databricks-specific, here's a good answer: "If you perform the encoding before the split, it will lead to data leakage (train-test contamination). In this sense, you will introduce new data (integers of Label Encoders) and u...

  • 15 kudos
2 More Replies
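
To illustrate the point quoted in the reply, here is a minimal sketch in plain Python, not taken from the thread. It splits first, learns the category vocabulary from the training rows only, and then encodes both sets against that fixed vocabulary. The toy `colors` column and the helper names `fit_categories` and `one_hot` are hypothetical. Encoding unseen test categories as all zeros is one common convention (scikit-learn's `OneHotEncoder(handle_unknown="ignore")` behaves similarly), not the only option.

```python
def fit_categories(values):
    """Learn the category vocabulary from the given (training) values only."""
    return sorted(set(values))


def one_hot(values, categories):
    """Encode values against a fixed vocabulary.

    Categories that were not seen when the vocabulary was fitted
    are encoded as an all-zero row.
    """
    index = {cat: i for i, cat in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        if value in index:
            row[index[value]] = 1
        rows.append(row)
    return rows


# Hypothetical toy column; "purple" occurs only in the test part.
colors = ["red", "green", "blue", "red", "green", "purple"]

# 1) Split first.
train, test = colors[:4], colors[4:]

# 2) Fit the vocabulary on the training rows only.
categories = fit_categories(train)

# 3) Encode both sets with the training vocabulary.
train_encoded = one_hot(train, categories)
test_encoded = one_hot(test, categories)
```

Because the vocabulary never sees the test rows, nothing about the test set influences the features the model is trained on.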
garymm
by New Contributor
  • 789 Views
  • 0 replies
  • 0 kudos

Databricks-hosted MLflow ignores the `mlflow.user` tag when set in the "runs/create" REST API call and doesn't let me change it after a ...

Databricks-hosted MLflow ignores the `mlflow.user` tag when set in the "runs/create" REST API call and doesn't let me change it after a run is created. Open source MLflow respects the tag. Could you please change the server to respect this field when i...
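
For readers unfamiliar with the call being discussed, the following is a rough sketch of a "runs/create" request body that carries the `mlflow.user` tag. It only builds the JSON payload and does not send it. The helper name `build_create_run_payload`, the experiment ID, and the user value are hypothetical, and whether a given server honors the tag is exactly what the post is asking about.

```python
import json
import time


def build_create_run_payload(experiment_id, user, start_time_ms=None):
    """Build a JSON-serializable body for the MLflow REST endpoint
    POST /api/2.0/mlflow/runs/create, setting the `mlflow.user` tag."""
    if start_time_ms is None:
        # The API expects the start time in milliseconds since the epoch.
        start_time_ms = int(time.time() * 1000)
    return {
        "experiment_id": experiment_id,
        "start_time": start_time_ms,
        "tags": [{"key": "mlflow.user", "value": user}],
    }


# Hypothetical values, for illustration only.
payload = build_create_run_payload("123", "alice@example.com", start_time_ms=0)
body = json.dumps(payload)
```

The resulting `body` string would be sent with an HTTP client of your choice, together with whatever authentication the tracking server requires.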

rjwswenson
by New Contributor II
  • 5808 Views
  • 7 replies
  • 15 kudos

What programming frameworks and languages can be used with Databricks Feature Store?

Can only Python be used with the Databricks Feature Store? If not, which other languages and frameworks are supported? My question has two parts. Part 1) What languages can be utilized to write data frames as feature tables in the Feature...

Latest Reply
boyelana
Contributor III
  • 15 kudos

You can use any of these languages: Python, SQL, Scala, and R.

  • 15 kudos
6 More Replies
User16752245767
by Contributor
  • 2499 Views
  • 3 replies
  • 10 kudos

I am Avi, a Solutions Architect at Databricks. We have built an application to demonstrate how AI capabilities could be easily integrated to deliver n...

I am Avi, a Solutions Architect at Databricks. We have built an application to demonstrate how AI capabilities could be easily integrated to deliver novel user experiences. The application allows users to submit images and text, and uses these inputs...

Latest Reply
Ajay-Pandey
Esteemed Contributor III
  • 10 kudos

Hi @Avinash Sooriyarachchi, thanks for sharing it.

  • 10 kudos
2 More Replies
TomasP
by New Contributor III
  • 2437 Views
  • 3 replies
  • 1 kudos

Inability to consume model

Hello, I would like to ask where the problem may be. I want to create a real-time endpoint for real-time model inference. I have created a simple cluster, but I am not able to deploy the model; I still get a yellow status (pending). The whole pro...

Latest Reply
TomasP
New Contributor III
  • 1 kudos

Hi, already solved. The wrong runtime was selected.

  • 1 kudos
2 More Replies
User16752245767
by Contributor
  • 1501 Views
  • 0 replies
  • 5 kudos

youtu.be

I'm Avi, a Solutions Architect at Databricks working at the intersection of Data Engineering and Machine Learning. Streaming data processing has moved from niche to mainstream, and deploying machine learning models in such data streams opens up a mult...

Kristof
by New Contributor III
  • 8630 Views
  • 3 replies
  • 3 kudos

Resolved! Spark Error/Exception Handling

I am creating a new application and looking for ideas on how to handle exceptions in Spark, for example with ThreadPoolExecution. Are there any good practices for error handling and dealing with specific exceptions?

Latest Reply
Shalabh007
Honored Contributor
  • 3 kudos

@Krzysztof Nojman​ Can you please click on the "Select As Best" button if you find the information provided helps resolve your question.

  • 3 kudos
2 More Replies
matte
by New Contributor III
  • 15217 Views
  • 7 replies
  • 16 kudos

Resolved! How to use pymc.model_to_graphviz in a Databricks notebook

Hi everybody, I created a simple Bayesian model using the pymc library in Python. I would like to graphically represent my model using the pymc.model_to_graphviz(model=model) method. However, it seems it does not work within a Databricks notebook, even ...

Latest Reply
Own
Contributor
  • 16 kudos

%sh apt install -y graphviz

  • 16 kudos
6 More Replies
elgeo
by Valued Contributor II
  • 5559 Views
  • 1 replies
  • 4 kudos

Resolved! Insert into delta table fails

Hello experts. We are trying to execute an insert command with fewer columns than the target table:
Insert into table_name (col1, col2, col10)
Select col1, col2, col10
from table_name2
However, the above fails with:
Error in SQL statement: DeltaAnalysisExce...

Latest Reply
UmaMahesh1
Honored Contributor III
  • 4 kudos

Hi @ELENI GEORGOUSI, yes. When you do an insert, the schema you provide must match the target schema; otherwise it throws an error. But you can still insert the data using another approach. Create a dataframe with your data having fewer colu...

  • 4 kudos
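
The reply above is cut off, so the following is only one possible way to make the provided columns line up with the target schema, not necessarily what the reply goes on to suggest. The sketch generates an INSERT statement whose SELECT list pads every column missing from the source with a typed NULL. The helper name `build_padded_insert`, the extra column `col5`, and all column types are hypothetical.

```python
def build_padded_insert(target_table, target_schema, source_table, provided_columns):
    """Return an INSERT statement whose SELECT list matches the target schema.

    target_schema maps column name -> SQL type, in the target table's column order.
    Columns not listed in provided_columns are filled with a typed NULL.
    """
    provided = set(provided_columns)
    unknown = provided - set(target_schema)
    if unknown:
        raise ValueError(f"Columns not in target table: {sorted(unknown)}")

    select_list = []
    for name, sql_type in target_schema.items():
        if name in provided:
            select_list.append(name)
        else:
            # Pad the missing column so the column count and order match the target.
            select_list.append(f"CAST(NULL AS {sql_type}) AS {name}")

    return (
        f"INSERT INTO {target_table} "
        f"SELECT {', '.join(select_list)} "
        f"FROM {source_table}"
    )


# Hypothetical target schema; only col1, col2 and col10 exist in the source.
schema = {"col1": "INT", "col2": "STRING", "col5": "STRING", "col10": "DOUBLE"}
sql = build_padded_insert("table_name", schema, "table_name2", ["col1", "col2", "col10"])
```

In a notebook, the generated string could then be passed to `spark.sql(sql)`, assuming the padded columns are nullable in the target table.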
jonathan-dufaul
by Valued Contributor
  • 2703 Views
  • 3 replies
  • 1 kudos

How does mlflow determine if a pyfunc model uses SparkContext?

I've been getting this error pretty regularly while working with mlflow: "It appears that you are attempting to reference SparkContext from a broadcast variable, action, or transformation. SparkContext can only be used on the driver, not in code that ...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

I checked the page, and it looks like there is no integration with DataRobot, and DataRobot doesn't contribute to MLflow. https://mlflow.org/ lists all the integrations.

  • 1 kudos
2 More Replies
ajeet1080
by New Contributor III
  • 3300 Views
  • 1 replies
  • 2 kudos

Resolved! Unable to create feature table using the Databricks API FeatureStoreClient()

I am following the example steps from the Databricks documentation: https://docs.databricks.com/_static/notebooks/machine-learning/feature-store-taxi-example.html. I am using Feature Store client v0.3.6 and above. However, on trying to create feature table with f...

Latest Reply
ajeet1080
New Contributor III
  • 2 kudos

After much digging, I observed that I was using the standard runtime. Once I switched to the Databricks ML runtime, the issue was resolved. To use Feature Store capability, ensure that you select a Databricks Runtime ML version from the Databricks Runtime Version dr...

  • 2 kudos
MA
by New Contributor II
  • 1863 Views
  • 1 replies
  • 4 kudos

Stream data from Delta tables replicated with Fivetran into DLT

I'm attempting to stream into a DLT pipeline with data replicated from Fivetran directly into Delta tables in a different database from the one that the DLT pipeline uses. This is not an aggregate, and I don't want to recompute the entire data model eac...

Latest Reply
Anonymous
Not applicable
  • 4 kudos

Hi @M A, great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question first. Otherwise, Bricksters will get back to you soon. Thanks.

  • 4 kudos
Siebert_Looije
by Contributor
  • 1599 Views
  • 1 replies
  • 1 kudos

What is the best way to deal with pymc3 in MLflow models in Databricks?

Last week, we started using MLflow within Databricks. The Bayesian models that we are using right now are pymc3 models (https://docs.pymc.io/en/v3/index.html). We could use the experiment feature of Databricks/MLflow to save the models as an ...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Siebert Looije, great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question first. Otherwise, Bricksters will get back to you soon. Thanks.

  • 1 kudos
jonathan-dufaul
by Valued Contributor
  • 1251 Views
  • 0 replies
  • 1 kudos

Is it possible to change the boilerplate code on a logged/saved pyfunc mlflow model?

When I log a pyfunc mlflow model, it generates a page that has this helpful code for using the model in production.
Make Predictions
Predict on a Spark DataFrame:
import mlflow
from pyspark.sql.functions import struct, col
logged_model = 'runs:/1d......

