I followed the documentation and am facing an issue while running `dbx execute` on an all-purpose/interactive cluster that is already up and running. I ran this command: `dbx execute --cluster-id=XXXXXX --job=dbx-demo-job --no-rebuild --debug`. If anyone faced it ...
We are building a feature store using the Databricks API. A few of our machine learning engineers use Jupyter notebooks. Is it possible to use the Feature Store outside Databricks?
Hi @Kaniz Fatma and @Jose Gonzalez, turning back to the original question, and considering that one of the main benefits of the Feature Store is the removal of online/offline skew, how could I access the features from a client application l...
I have an MLflow server running with `--serve-artifacts` and with Postgres as the `--backend-store-uri`. The machine (a container with base image python:3.9-bullseye) running the server has git installed and available on the path. I am logging from jupyter-noteboo...
Hi @Naveen Marthala, just a friendly follow-up. Do you still need help, or did the above responses help you find the solution? Please let us know.
I am logging runs from a Jupyter notebook. The cells that have `mlflow.sklearn.autolog()` behave as expected, but the cells that call the .fit() method on sklearn estimators are also being logged as runs without explicitly mentioning `mlflo...
Should I pip install xgboost==1.4.2 (the last version that worked), or is there a better way to solve it, bearing in mind that this solution might cause problems later if this version of xgboost is not supported on future Python versions?
I am running a 3-fold cross validation of an ML pipeline that uses `GBTClassifier` as the final step. It takes 18 hours to run, and I am looking for feedback on how to improve the performance, as I expect this to go faster. For context here is the...
Hi @Assaad Mrad, just a friendly follow-up. Do you still need help, or did @Chris Chalcraft's response help you find the solution? Please let us know.
I built a model which is used for ranking and I have a notebook that takes that model to generate rankings and then uses a UDF-based metric to evaluate those rankings. Is there any way that I can have this ranking / UDF be used during the AutoML trai...
I'm using Databricks Community Edition for testing purposes on an OSS project. I'm spinning up the cluster automatically through the Databricks Clusters API. The automated tests rely on AWS S3 infrastructure, which is why I need to mount the S3 bucket on the ...
I haven't found any solution. I'm assuming that currently my only option is to use Databricks Enterprise to model scenarios that involve mounting object storage buckets.
No error, just seeing the EXPAND DISK in cluster event logs. This is just a regular spark application. I am not sure if the cloud storage matters - a spark application uses it as input and output.
I want to use the Databricks Online Store with Azure SQL Database, but I am unable to authenticate through the Databricks Feature Store API. I need to use Service Principal credentials. I tried using the Application ID as username and the Secret as password, but ...
Hello, I would like to set the default "spark.driver.maxResultSize" from the notebook on my cluster. I know I can do that in the cluster settings, but is there a way to set it by code? I also know how to do it when I start a Spark session, but in my ca...
Hi @Maximilian Hansinger, just wanted to check in on whether you were able to resolve your issue. If yes, would you be happy to mark the answer as best? If not, please tell us so we can help you. Thanks!
Hello everybody. I am trying to run pymc3 models on Databricks (runtime 9.1), and when I start the sampling process, the progress bar is not showing. It is a bit annoying since this way I do not have any information on when the process is going to end...
Is it possible to create an MLflow model as a Docker image with a REST API endpoint and use it for inference within Databricks, or host the image in Azure Container Instances?
Hey there @Vivek Ranjan, checking in. If Joseph's answer helped, would you let us know and mark the answer as best? It would help other members find the solution more quickly. Thanks!
kafkashaded.org.apache.kafka.common.KafkaException: Failed to construct kafka consumer at kafkashaded.org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:823) at kafkashaded.org.apache.kafka.clients.consumer.KafkaConsumer.<init>...
@Kaniz Fatma I am having the same issue.
%python
import pyspark.sql.functions as fn
from pyspark.sql.types import StringType
binary_to_string = fn.udf(lambda x: str(int.from_bytes(x, byteorder='big')), StringType())
df = spark.readStream.format("...
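The conversion inside the `binary_to_string` UDF above can be checked in plain Python, without a Spark session. The sketch below reuses the same `int.from_bytes(..., byteorder='big')` expression. The null-safe variant `binary_to_string_or_none` is a hypothetical helper added here for illustration, on the assumption that the binary column may contain nulls; it is not part of the original post.

```python
from typing import Optional


def binary_to_string(raw: bytes) -> str:
    """Read raw bytes as a big-endian unsigned integer and return it as a decimal string."""
    return str(int.from_bytes(raw, byteorder="big"))


def binary_to_string_or_none(raw: Optional[bytes]) -> Optional[str]:
    """Hypothetical null-safe variant: pass None through instead of raising TypeError."""
    if raw is None:
        return None
    return binary_to_string(raw)


# The most significant byte comes first, so b"\x01\x00" is 1 * 256 + 0.
print(binary_to_string(b"\x00\x01"))
print(binary_to_string(b"\x01\x00"))
print(binary_to_string_or_none(None))
```

If null values are possible in the stream, wrapping the null-safe function with `fn.udf(..., StringType())` in place of the lambda is one way to avoid a `TypeError` inside the UDF.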