Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

Forum Posts

Anonymous
by Not applicable
  • 4673 Views
  • 4 replies
  • 1 kudos

Spark connector to mongodb - mongo-spark-connector_2.12:10.1.1

Hello, I've added a library to the cluster and it appears in the Spark UI as: spark://10.139.64.4:43001/jars/addedFile307892533757162075org_mongodb_spark_mongo_spark_connector_2_12_10_1_1-98946.jar (Added By User). I'm trying to connect using the ...

Latest Reply
FurqanAmin
New Contributor II
  • 1 kudos

@DmytroSokhach I think it works if you change mongo to mongodb in the options and use spark.mongodb.read.connection.uri instead of spark.mongodb.input.uri, as @silvadev suggested.
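
For anyone landing here, this is a rough sketch of what that suggestion could look like in PySpark with the 10.x connector. It assumes the connector jar is already attached to the cluster; the helper names, the URI, and the database and collection names are made up for illustration and are not from the original thread.

```python
# Sketch only: the helper names and all values passed in are hypothetical.

def mongodb_read_conf(uri):
    """Spark conf entry for reads with the 10.x connector.

    The 10.x key is spark.mongodb.read.connection.uri; the older
    spark.mongodb.input.uri key is the one the reply says to stop using.
    """
    return {"spark.mongodb.read.connection.uri": uri}


def read_collection(spark, database, collection):
    """Load a collection using the "mongodb" format name (not "mongo")."""
    return (
        spark.read.format("mongodb")
        .option("database", database)
        .option("collection", collection)
        .load()
    )
```

On a cluster you would put the entry from mongodb_read_conf(...) into the cluster's Spark config (or pass it when building the session), then call read_collection(spark, "mydb", "mycoll") from a notebook.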

3 More Replies
jcapplefields88
by New Contributor II
  • 992 Views
  • 1 replies
  • 1 kudos

Expose low latency APIs from Deltalake for mobile apps and microservices

My company is using Deltalake to extract customer insights and run batch scoring with ML models. I need to expose this data to some microservices through gRPC and REST APIs. How to do this? I'm thinking to build Spark pipelines to extract the data, stor...

Latest Reply
stefnhuy
New Contributor III
  • 1 kudos

Hey everyone! It's awesome that your company is utilizing Deltalake for extracting customer insights and running batch scoring with ML models. I can totally relate to the excitement and challenges of dealing with data integration for microservices and...

ptawil
by New Contributor III
  • 1979 Views
  • 2 replies
  • 4 kudos

Runtime error using MLFlow and Spark on databricks

Here is some model I created:

class SomeModel(mlflow.pyfunc.PythonModel):
    def predict(self, context, input):
        # do fancy ML stuff
        # log results
        pandas_df = pd.DataFrame(...insert predictions here...)
        spark_df = spark...

Latest Reply
Nikhil3107
New Contributor III
  • 4 kudos

Any updates on this? I am running into the same issue. @Patrick Tawil, were you able to solve this problem? If so, do you mind sharing?

1 More Replies
ryojikn
by New Contributor III
  • 2957 Views
  • 2 replies
  • 0 kudos

How to use spark-submit python task with the usage of --archives parameter passing a .tar.gz conda env?

We've been trying to launch a spark-submit python task using the parameter "archives", similar to the one used in Yarn. However, we've not been able to successfully make it work in Databricks. We know that for our OnPrem installation we can use som...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Ryoji Kuwae Neto: To use the --archives parameter with a conda environment in Databricks, you can follow these steps:
1) Create a conda environment for your project and export it as a .tar.gz file:
conda create --name myenv
conda activate myenv
conda...

1 More Replies
gaponte
by New Contributor III
  • 1224 Views
  • 2 replies
  • 1 kudos

Resolved! What are the best resources for learning how to tune/optimize Spark?

I know this question/topic is not very specific, but perhaps asking the question would be useful for people other than me. I am a newbie to Spark, and while I've been able to get my current model training and data transformations running, they are ...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Greg Aponte, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thanks...

1 More Replies
KenAN
by New Contributor II
  • 1873 Views
  • 3 replies
  • 3 kudos

How to circumvent Py4JSecurityException for spark-nlp : Constructor public com.johnsnowlabs.nlp.***(java.lang.String) is not whitelisted.

Running into the following error on our company's cluster: py4j.security.Py4JSecurityException: Constructor public com.johnsnowlabs.nlp.DocumentAssembler(java.lang.String) is not whitelisted. For the following code (which is just tutorial code from the...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Kenan Spruill, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Than...

2 More Replies
Kristof
by New Contributor III
  • 4893 Views
  • 3 replies
  • 3 kudos

Resolved! Spark Error/Exception Handling

I am creating a new application and looking for ideas on how to handle exceptions in Spark, for example ThreadPoolExecution. Are there any good practices for error handling and for dealing with specific exceptions?

Latest Reply
Shalabh007
Honored Contributor
  • 3 kudos

@Krzysztof Nojman, can you please click the "Select As Best" button if the information provided helps resolve your question?

2 More Replies
jcapplefields88
by New Contributor II
  • 970 Views
  • 3 replies
  • 1 kudos

Expose low latency APIs from Deltalake for mobile apps and microservices

My company is using Deltalake to extract customer insights and run batch scoring with ML models. I need to expose this data to some microservices through gRPC and REST APIs. How to do this? I'm thinking to build Spark pipelines to extract the data, stor...

Latest Reply
Noopur_Nigam
Valued Contributor II
  • 1 kudos

Hi @John Capplefield, gentle follow-up: please let us know if you need further help on this.

2 More Replies
Slalom_Tobias
by New Contributor III
  • 1751 Views
  • 3 replies
  • 0 kudos

Cannot serialize this model error when attempting MLFlow for SparkNLP

I'm attempting to use MLFlow to register models in Databricks and am following the recipe at...
https://nlp.johnsnowlabs.com/docs/en/licensed_serving_spark_nlp_via_api_databricks_mlflow
When I execute...
mlflow.spark.log_model(pipeline, "lemmatizer", co...

Latest Reply
Vidula
Honored Contributor
  • 0 kudos

Hi @Tobias Cortese, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Tha...

2 More Replies
jnjns
by New Contributor II
  • 668 Views
  • 0 replies
  • 3 kudos

Java Error for installation rasterframes

Hi all, I have followed the steps in this notebook to install rasterframes on my Databricks cluster. Eventually I am able to import the following:
from pyrasterframes import rf_ipython
from pyrasterframes.utils import create_rf_spark_session
from pyspar...

MadelynM
by New Contributor III
  • 1129 Views
  • 0 replies
  • 2 kudos

2021-08-Best-Practices-for-Your-Data-Architecture-v3-OG-1200x628

Thanks to everyone who joined the Best Practices for Your Data Architecture session on Optimizing Data Performance. You can access the on-demand session recording here and the pre-run performance benchmarks using the Spark UI Simulator. Proper cluste...

eyalwir
by New Contributor
  • 533 Views
  • 0 replies
  • 0 kudos

Deep Learning on Spark within AWS EMR

I'd like to use Deep Learning on Spark within AWS EMR. Is there a best practice or 'recommended' DL framework to run on Spark? It looks like Databricks' spark-deep-learning has been replaced by Horovod; should this be the first option to consider? If th...

Srikanth_Gupta_
by Valued Contributor
  • 1039 Views
  • 1 replies
  • 1 kudos

What are the best NLP libraries to use with Spark?

Which NLP APIs give the best performance when used with Spark?

Latest Reply
sean_owen
Honored Contributor II
  • 1 kudos

By far the most popular and comprehensive library, to my knowledge, for Spark-native distributed NLP, is spark-nlp from John Snow Labs. https://nlp.johnsnowlabs.com/ It is open source (but with commercial support options) and has a whole lot of funct...
