Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Abel_Martinez
by Contributor
  • 13394 Views
  • 9 replies
  • 10 kudos

Resolved! Why am I getting a connection timeout when connecting to MongoDB using MongoDB Connector for Spark 10.x from Databricks?

I'm able to connect to MongoDB using org.mongodb.spark:mongo-spark-connector_2.12:3.0.2 and this code: df = spark.read.format("com.mongodb.spark.sql.DefaultSource").option("uri", jdbcUrl). It works well, but if I install the latest MongoDB Spark Connector ve...

Latest Reply
ravisharma1024
New Contributor II
  • 10 kudos

I was facing the same issue; it is now resolved, thanks to @Abel_Martinez. I am using code like the below: df = spark.read.format("mongodb") \ .option('spark.mongodb.read.connection.uri', "mongodb+srv://*****:*****@******/?retryWrites=true&w=majori...
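For anyone landing here, a minimal runnable sketch of the pattern in this reply, assuming Spark Connector 10.x; the URI is masked above, and the database and collection names here are placeholders:

# Sketch of the 10.x read pattern from this reply. URI, database, and
# collection names are placeholders, not the poster's actual values.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # already provided on Databricks

df = (
    spark.read.format("mongodb")  # 10.x source name, replacing DefaultSource
    .option("spark.mongodb.read.connection.uri",
            "mongodb+srv://<user>:<password>@<cluster>/?retryWrites=true&w=majority")
    .option("database", "my_db")            # placeholder database name
    .option("collection", "my_collection")  # placeholder collection name
    .load()
)
df.printSchema()

Note that the 10.x connector reads its URI from the spark.mongodb.read.connection.uri option rather than the 3.x-style "uri" option, which appears to be what resolved the timeout in this thread.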

8 More Replies
sharonbjehome
by New Contributor
  • 1456 Views
  • 1 reply
  • 1 kudos

Structured Streaming from MongoDB Atlas not parsing JSON correctly

Hi all, I have a table in MongoDB Atlas that I am trying to read continuously into memory, and then will write that file out eventually. However, when I look at the in-memory table it doesn't have the correct schema. Code here: from pyspark.sql.types impo...
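The code is cut off above; a hedged sketch of what such a continuous read into an in-memory table can look like with an explicit schema, assuming connector 10.x (the field names, URI, database, and collection here are hypothetical):

# Sketch: stream from MongoDB Atlas into an in-memory table with an explicit
# schema, so the stream's schema is not mis-inferred. All names are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType, IntegerType

spark = SparkSession.builder.getOrCreate()  # provided on Databricks

schema = StructType([
    StructField("name", StringType(), True),    # hypothetical field
    StructField("value", IntegerType(), True),  # hypothetical field
])

stream_df = (
    spark.readStream.format("mongodb")
    .option("spark.mongodb.read.connection.uri", "mongodb+srv://<user>:<password>@<cluster>/")
    .option("database", "my_db")
    .option("collection", "my_collection")
    .schema(schema)
    .load()
)

# In-memory sink for inspection, as described in the post.
query = stream_df.writeStream.format("memory").queryName("mongo_mem").start()
# Inspect with: spark.sql("SELECT * FROM mongo_mem").show()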

Latest Reply
Debayan
Databricks Employee
  • 1 kudos

Hi @sharonbjehome, this has to be checked thoroughly via a support ticket. Did you follow https://docs.databricks.com/external-data/mongodb.html? Also, could you please check with MongoDB support? Was this working before?

amichel
by New Contributor III
  • 7440 Views
  • 3 replies
  • 4 kudos

Resolved! Recommended way to integrate MongoDB as a streaming source

Current state: Data is stored in MongoDB Atlas, which is used extensively by all services. The data lake is hosted in the same AWS region and connected to MongoDB over a private link. Requirements: Streaming pipelines that continuously ingest, transform/analyze and ...

Latest Reply
robwma
New Contributor III
  • 4 kudos

Another option, if you'd like to use Spark for the ingestion, is to use the new Spark Connector V10.0, which supports Spark Structured Streaming: https://www.mongodb.com/developer/languages/python/streaming-data-apache-spark-mongodb/. If you use Kafka, th...
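A short sketch of that suggestion, assuming connector 10.x and following the option style used in the linked article; the URI, database, collection, checkpoint path, and table name are placeholders:

# Sketch: continuous ingest from a MongoDB change stream into a Delta table.
# spark is the session Databricks provides; all names below are placeholders.
stream = (
    spark.readStream.format("mongodb")
    .option("spark.mongodb.connection.uri", "mongodb+srv://<user>:<password>@<cluster>/")
    .option("spark.mongodb.database", "my_db")
    .option("spark.mongodb.collection", "events")
    # publish full documents rather than raw change-stream events
    .option("spark.mongodb.change.stream.publish.full.document.only", "true")
    .load()
)

(
    stream.writeStream
    .option("checkpointLocation", "/tmp/checkpoints/mongo_events")  # placeholder path
    .outputMode("append")
    .toTable("bronze_events")  # placeholder Delta table name
)

From there, downstream transform/analyze steps can be regular Structured Streaming queries reading from the Delta table.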

2 More Replies
Mr__E
by Contributor II
  • 2847 Views
  • 3 replies
  • 3 kudos

Resolved! Importing MongoDB with field names containing spaces

I am currently using a Python notebook with a defined schema to import fairly unstructured documents from MongoDB. Some of these documents have spaces in their field names. I define the schema for the MongoDB PySpark connector like the following: Struct...

Latest Reply
Mr__E
Contributor II
  • 3 kudos

Solution: It turns out the issue is not the schema being read in, but the fact that I am writing to Delta tables, which do not currently support spaces in column names. So, I need to transform them prior to dumping. I've been following a pattern of reading in raw data,...
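A sketch of that transform step, assuming top-level column renames are enough; raw_df and the table name here are hypothetical:

# Replace spaces in column names before writing to Delta, since Delta rejects
# spaces in column names by default.
def sanitize_columns(df):
    renamed = df
    for name in df.columns:
        renamed = renamed.withColumnRenamed(name, name.replace(" ", "_"))
    return renamed

# raw_df: a DataFrame read from MongoDB as in the question, then:
sanitize_columns(raw_df).write.format("delta").mode("append").saveAsTable("bronze_docs")

This only handles top-level columns; fields nested inside structs would need a schema rewrite instead.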

2 More Replies