Data Engineering

Forum Posts

by Abel_Martinez • Contributor

12-23-2022 7:44:50 AM

16467 Views
9 replies
10 kudos

Resolved! Why am I getting a connection timeout when connecting to MongoDB using MongoDB Connector for Spark 10.x from Databricks?

I'm able to connect to MongoDB using org.mongodb.spark:mongo-spark-connector_2.12:3.0.2 and this code:
df = spark.read.format("com.mongodb.spark.sql.DefaultSource").option("uri", jdbcUrl)
It works well, but if I install the latest MongoDB Spark Connector ve...

Latest Reply

ravisharma1024
New Contributor II

10-23-2024 5:20:57 AM

10 kudos

I was facing the same issue; it is now resolved, thanks to @Abel_Martinez. I am using code like the following:
df = spark.read.format("mongodb") \
    .option('spark.mongodb.read.connection.uri', "mongodb+srv://*****:*****@******/?retryWrites=true&w=majori...
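
For readers comparing the two connector generations, here is a minimal sketch of a batch read with the 10.x connector. The format name "mongodb" and the 'spark.mongodb.read.connection.uri' key come from the reply above; the database and collection keys follow the connector's documented read options as I understand them, and the helper names (mongo_read_options, read_collection) are hypothetical. Treat it as a starting point, not the thread's confirmed fix.

```python
def mongo_read_options(uri, database, collection):
    """Build reader options for MongoDB Spark Connector 10.x.

    The connection.uri key matches the reply above; the database and
    collection keys are assumed from the connector's read configuration.
    """
    return {
        "spark.mongodb.read.connection.uri": uri,
        "spark.mongodb.read.database": database,
        "spark.mongodb.read.collection": collection,
    }


def read_collection(spark, uri, database, collection):
    """Load one collection as a DataFrame using an existing SparkSession."""
    reader = spark.read.format("mongodb")  # 10.x short name, replaces DefaultSource
    for key, value in mongo_read_options(uri, database, collection).items():
        reader = reader.option(key, value)
    return reader.load()
```

In a Databricks notebook this would be called as read_collection(spark, uri, "mydb", "mycoll"), where the database and collection names are placeholders.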

by sharonbjehome • New Contributor

11-16-2022 4:17:29 AM

1720 Views
1 reply
1 kudos

Structured Streaming from MongoDB Atlas not parsing JSON correctly

Hi all, I have a table in MongoDB Atlas that I am trying to read continuously into memory and will eventually write out to a file. However, when I look at the in-memory table, it doesn't have the correct schema. Code here: from pyspark.sql.types impo...

Latest Reply

Debayan
Databricks Employee

11-17-2022 11:36:04 PM

1 kudos

Hi @sharonbjehome, this needs to be checked thoroughly via a support ticket. Did you follow https://docs.databricks.com/external-data/mongodb.html? Also, could you please check with MongoDB support? Was this working before?

by amichel • New Contributor III

02-22-2022 11:40:24 AM

8410 Views
3 replies
4 kudos

Resolved! Recommended way to integrate MongoDB as a streaming source

Current state:
Data is stored in MongoDB Atlas which is used extensively by all services
Data lake is hosted in same AWS region and connected to MongoDB over private link

Requirements:
Streaming pipelines that continuously ingest, transform/analyze and ...

Latest Reply

robwma
New Contributor III

06-21-2022 10:44:54 AM

4 kudos

Another option, if you'd like to use Spark for ingestion, is the new Spark Connector V10.0, which supports Spark Structured Streaming: https://www.mongodb.com/developer/languages/python/streaming-data-apache-spark-mongodb/ If you use Kafka, th...
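
A rough sketch of what a Structured Streaming read with the 10.x connector might look like, assuming the connector's documented read options. The option keys, in particular the change-stream "publish full document only" flag, are taken from my reading of the connector documentation rather than from this thread, and the helper names are hypothetical. The schema is passed in explicitly because, as far as I know, the streaming source needs a defined schema.

```python
def mongo_stream_options(uri, database, collection):
    """Options for a streaming read; keys assumed from the connector docs."""
    return {
        "spark.mongodb.read.connection.uri": uri,
        "spark.mongodb.read.database": database,
        "spark.mongodb.read.collection": collection,
        # Assumed option: emit the changed document itself instead of
        # the full change-event envelope.
        "spark.mongodb.read.change.stream.publish.full.document.only": "true",
    }


def read_mongo_stream(spark, uri, database, collection, schema):
    """Return a streaming DataFrame backed by a MongoDB change stream.

    `schema` may be a StructType or a DDL string such as
    "_id STRING, name STRING".
    """
    reader = spark.readStream.format("mongodb")
    for key, value in mongo_stream_options(uri, database, collection).items():
        reader = reader.option(key, value)
    return reader.schema(schema).load()
```

The returned DataFrame would then be written with writeStream to whichever sink the pipeline uses, for example a Delta table with a checkpoint location.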

by Mr__E • Contributor II

02-15-2022 3:49:45 PM

3480 Views
3 replies
3 kudos

Resolved! Importing MongoDB with field names containing spaces

I am currently using a Python notebook with a defined schema to import fairly unstructured documents from MongoDB. Some of these documents have spaces in their field names. I define the schema for the MongoDB PySpark connector like the following: Struct...

Latest Reply

Mr__E
Contributor II

02-15-2022 9:29:42 PM

3 kudos

Solution: It turns out the issue is not the schema reading in, but the fact that I am writing to Delta tables, which do not currently support spaces. So, I need to transform them prior to dumping. I've been following a pattern of reading in raw data,...
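
One way the "transform prior to dumping" step could be sketched: rename top-level columns so they contain no characters that Delta rejects. The character set below is an assumption based on the error message Delta typically raises for invalid column names, and the helper names are hypothetical. Nested struct fields and name collisions after renaming are not handled here.

```python
import re

# Characters assumed to be rejected in Delta column names: space , ; { } ( ) \n \t =
_DELTA_INVALID = re.compile(r"[ ,;{}()\n\t=]")


def sanitize_column_name(name, replacement="_"):
    """Replace each character Delta is assumed to reject with `replacement`."""
    return _DELTA_INVALID.sub(replacement, name)


def sanitized_columns(columns):
    """Sanitize a list of top-level column names, preserving order."""
    return [sanitize_column_name(c) for c in columns]
```

With a PySpark DataFrame, the rename would be applied as df = df.toDF(*sanitized_columns(df.columns)) before writing to Delta.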

Databricks Community
