
- 9416 Views
- 4 replies
- 1 kudos
Spark connector to mongodb - mongo-spark-connector_2.12:10.1.1
Hello, I've added a library to the cluster and it appears in the Spark UI as Added By User: spark://10.139.64.4:43001/jars/addedFile307892533757162075org_mongodb_spark_mongo_spark_connector_2_12_10_1_1-98946.jar. I'm trying to connect using the ...
- 1 kudos
@DmytroSokhach I think it works if you change mongo to mongodb in the options, and use spark.mongodb.read.connection.uri instead of spark.mongodb.input.uri, as @silvadev suggested.
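A minimal sketch of that advice in PySpark, for anyone landing here (connector 10.x; the host, database, and collection names are placeholders, not values from the thread):

```python
from pyspark.sql import SparkSession

# Connector 10.x reads spark.mongodb.read.* settings; the old
# spark.mongodb.input.* names belong to the 3.x connector.
spark = (SparkSession.builder
         .config("spark.mongodb.read.connection.uri", "mongodb://host:27017")  # placeholder
         .config("spark.mongodb.read.database", "mydb")
         .config("spark.mongodb.read.collection", "mycollection")
         .getOrCreate())

# Note the short format name is "mongodb", not "mongo".
df = spark.read.format("mongodb").load()
df.printSchema()
```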
- 2306 Views
- 1 replies
- 1 kudos
Expose low latency APIs from Deltalake for mobile apps and microservices
My company is using Deltalake to extract customer insights and run batch scoring with ML models. I need to expose this data to some microservices through gRPC and REST APIs. How can I do this? I'm thinking of building Spark pipelines to extract the data, stor...
- 1 kudos
Hey everyone! It's awesome that your company is using Deltalake for extracting customer insights and running batch scoring with ML models. I can totally relate to the excitement and challenges of dealing with data integration for microservices and...
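One common pattern for this (a sketch of my own, not from the thread; the table and connection details are hypothetical) is to have a scheduled Spark job publish the Delta gold table into a low-latency operational store, and point the gRPC/REST microservices at that store rather than at Spark:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Read the curated Delta table produced by the batch-scoring pipeline.
insights = spark.read.table("gold.customer_insights")  # hypothetical table

# Publish it to an operational store (Postgres via JDBC here) that the
# microservices can query with millisecond latency.
(insights.write
 .format("jdbc")
 .option("url", "jdbc:postgresql://db-host:5432/serving")  # placeholder
 .option("dbtable", "customer_insights")
 .option("user", "svc_user")        # use a secrets manager in practice
 .option("password", "***")
 .mode("overwrite")
 .save())
```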
- 3447 Views
- 2 replies
- 4 kudos
Runtime error using MLFlow and Spark on databricks
Here is some model I created:
class SomeModel(mlflow.pyfunc.PythonModel):
    def predict(self, context, input):
        # do fancy ML stuff
        # log results
        pandas_df = pd.DataFrame(...insert predictions here...)
        spark_df = spark...
- 4 kudos
Any updates on this? I am running into the same issue. @Patrick Tawil, were you able to solve this problem? If so, do you mind sharing?
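The usual explanation (an assumption on my part, not a confirmed answer from the thread) is that pyfunc models are pickled, and a SparkSession cannot be serialized, so predict() should return plain pandas and leave the Spark conversion to the caller:

```python
import mlflow.pyfunc
import pandas as pd

class SomeModel(mlflow.pyfunc.PythonModel):
    def predict(self, context, model_input):
        # do fancy ML stuff, but keep Spark objects out of the model
        return pd.DataFrame({"prediction": [0.0] * len(model_input)})

# On the driver, after loading the model:
# spark_df = spark.createDataFrame(loaded.predict(pandas_input))
```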
- 5930 Views
- 2 replies
- 0 kudos
How to use a spark-submit Python task with the --archives parameter to pass a .tar.gz conda env?
We've been trying to launch a spark-submit Python task using the "archives" parameter, similar to the one used in YARN. However, we've not been able to make it work in Databricks. We know that for our on-prem installation we can use som...
- 0 kudos
@Ryoji Kuwae Neto: To use the --archives parameter with a conda environment in Databricks, you can follow these steps: 1) Create a conda environment for your project and export it as a .tar.gz file: conda create --name myenv conda activate myenv conda...
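For reference, on stock Spark 3.1+ the same thing can be expressed in code via the spark.archives config (a sketch under those assumptions; the DBFS path is a placeholder, and the thread suggests Databricks may not honor this the way YARN does):

```python
import os
from pyspark.sql import SparkSession

# Point the Python workers at the interpreter inside the unpacked archive.
os.environ["PYSPARK_PYTHON"] = "./environment/bin/python"

# "#environment" unpacks myenv.tar.gz (built with conda-pack) into
# ./environment on each node, mirroring --archives on the CLI.
spark = (SparkSession.builder
         .config("spark.archives",
                 "dbfs:/FileStore/envs/myenv.tar.gz#environment")  # placeholder
         .getOrCreate())
```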
- 2823 Views
- 2 replies
- 1 kudos
Resolved! What are the best resources for learning how to tune/optimize Spark?
I know this question/topic is not very specific, but perhaps asking it will be useful for people other than me. I am a newbie to Spark, and while I've been able to get my current model training and data transformations running, they are ...
- 1 kudos
Hi @Greg Aponte, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Thanks...
- 3094 Views
- 3 replies
- 3 kudos
How to circumvent Py4JSecurityException for spark-nlp: Constructor public com.johnsnowlabs.nlp.***(java.lang.String) is not whitelisted.
Running into the following error on our company's cluster: py4j.security.Py4JSecurityException: Constructor public com.johnsnowlabs.nlp.DocumentAssembler(java.lang.String) is not whitelisted. For the following code (which is just tutorial code from the...
- 3 kudos
Hi @Kenan Spruill, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Than...
- 8252 Views
- 3 replies
- 3 kudos
Resolved! Spark Error/Exception Handling
I am creating a new application and looking for ideas on how to handle exceptions in Spark, for example ThreadPoolExecution. Are there any good practices in terms of error handling and dealing with specific exceptions?
- 3 kudos
@Krzysztof Nojman Can you please click on the "Select As Best" button if you find that the information provided helps resolve your question.
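One concrete pattern (my own illustration, not from the thread): run Spark actions from a ThreadPoolExecutor and catch Py4JJavaError, which wraps most JVM-side Spark failures, so one failed task doesn't take down the whole batch:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from py4j.protocol import Py4JJavaError

def count_table(spark, name):
    try:
        return name, spark.table(name).count()
    except Py4JJavaError as e:
        # e.java_exception exposes the underlying JVM class and message
        return name, "failed: " + e.java_exception.getMessage()

def count_all(spark, tables):
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(count_table, spark, t) for t in tables]
        return dict(f.result() for f in as_completed(futures))
```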
- 1893 Views
- 3 replies
- 1 kudos
Expose low latency APIs from Deltalake for mobile apps and microservices
My company is using Deltalake to extract customer insights and run batch scoring with ML models. I need to expose this data to some microservices through gRPC and REST APIs. How can I do this? I'm thinking of building Spark pipelines to extract the data, stor...
- 1 kudos
Hi @John Capplefield, gentle follow-up: please let us know if you need further help on this.
- 3657 Views
- 2 replies
- 0 kudos
Cannot serialize this model error when attempting MLFlow for SparkNLP
I'm attempting to use MLflow to register models in Databricks and am following the recipe at https://nlp.johnsnowlabs.com/docs/en/licensed_serving_spark_nlp_via_api_databricks_mlflow. When I execute mlflow.spark.log_model(pipeline, "lemmatizer", co...
- 0 kudos
Hi @Tobias Cortese, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Tha...
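One thing worth ruling out (an assumption on my part, not a confirmed fix from the thread): mlflow.spark.log_model expects a fitted Spark ML PipelineModel, so logging the result of fit() rather than the raw Pipeline sometimes resolves serialization errors. df below is a hypothetical training DataFrame:

```python
import mlflow
import mlflow.spark

model = pipeline.fit(df)  # PipelineModel, which Spark can serialize
with mlflow.start_run():
    mlflow.spark.log_model(model, "lemmatizer")
```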
- 1287 Views
- 0 replies
- 3 kudos
Java error when installing rasterframes
Hi all, I have followed the steps in this notebook to install rasterframes on my databricks cluster. Eventually I am able to import the following: from pyrasterframes import rf_ipython from pyrasterframes.utils import create_rf_spark_session from pyspar...
- 1731 Views
- 0 replies
- 2 kudos
Best Practices for Your Data Architecture: Optimizing Data Performance
Thanks to everyone who joined the Best Practices for Your Data Architecture session on Optimizing Data Performance. You can access the on-demand session recording here and the pre-run performance benchmarks using the Spark UI Simulator. Proper cluste...
- 967 Views
- 0 replies
- 0 kudos
Deep Learning on Spark within AWS EMR
I'd like to use Deep Learning on Spark within AWS EMR. Is there a best practice or 'recommended' DL framework to run on Spark? It looks like Databricks' spark-deep-learning has been replaced by Horovod; should this be the first option to consider? If th...
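For what it's worth, a minimal horovod.spark sketch (my illustration, assuming horovod[spark] is installed and a SparkSession is running; Horovod distributes the training function across Spark executors):

```python
import horovod.spark

def train():
    import horovod.torch as hvd
    hvd.init()
    # Build the model/optimizer here, wrap the optimizer with
    # hvd.DistributedOptimizer, and train as usual.
    return hvd.rank()

# Runs train() on 2 Spark executors and returns one result per rank.
ranks = horovod.spark.run(train, num_proc=2)
print(ranks)  # e.g. [0, 1]
```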
- 1851 Views
- 1 replies
- 1 kudos
What are the best NLP libraries to use with Spark?
Which NLP APIs for use with Spark give the best performance?
- 1 kudos
By far the most popular and comprehensive library, to my knowledge, for Spark-native distributed NLP, is spark-nlp from John Snow Labs. https://nlp.johnsnowlabs.com/ It is open source (but with commercial support options) and has a whole lot of funct...
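For anyone starting out, a tiny spark-nlp sketch (a sketch, assuming the spark-nlp Python package and its matching Maven coordinate are installed on the cluster):

```python
import sparknlp
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer
from pyspark.ml import Pipeline

spark = sparknlp.start()
df = spark.createDataFrame([("Spark NLP scales across the cluster.",)], ["text"])

document = DocumentAssembler().setInputCol("text").setOutputCol("document")
tokens = Tokenizer().setInputCols(["document"]).setOutputCol("token")

result = Pipeline(stages=[document, tokens]).fit(df).transform(df)
result.select("token.result").show(truncate=False)
```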