Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

577391
by New Contributor II
  • 2365 Views
  • 2 replies
  • 0 kudos

Resolved! How do I merge two tables and track changes to missing rows as well as new rows

In my scenario, the new data coming in are the current, valid records. Any records that are not in the new data should be labeled "Gone", any matching records should be labeled "Updated", and any new records should be added. So in sum...

Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

Detecting deletions does not work out of the box. The merge statement will evaluate the incoming data against the existing data. It will not check the existing data against the incoming data. To mark deletions, you will have to specifically update tho...

  • 0 kudos
1 More Replies
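
The labeling rule from the question can be sketched in plain Python, independent of any Delta Lake MERGE syntax. This is only an illustration of the logic, not the approach from the thread: it assumes records are held in dictionaries keyed by a hypothetical record id. The labels "Gone" and "Updated" come from the question; the label "New" and the function name `merge_with_status` are assumptions made for the example. The second pass mirrors the point in the reply that existing rows must be checked against the incoming data explicitly.

```python
def merge_with_status(existing, incoming):
    """Return a new table (dict keyed by id) with a status label per row.

    existing, incoming: dicts mapping a hypothetical record id to a row dict.
    """
    merged = {}

    # Pass 1: walk the incoming data, which is what a merge evaluates.
    for key, row in incoming.items():
        status = "Updated" if key in existing else "New"
        merged[key] = {**row, "status": status}

    # Pass 2: explicitly check existing rows against the incoming data.
    # Rows that no longer appear are kept and marked "Gone".
    for key, row in existing.items():
        if key not in incoming:
            merged[key] = {**row, "status": "Gone"}

    return merged


# Illustrative data only.
existing = {1: {"name": "a"}, 2: {"name": "b"}}
incoming = {2: {"name": "b2"}, 3: {"name": "c"}}
result = merge_with_status(existing, incoming)
```

Here record 1 is absent from the incoming data and is marked "Gone", record 2 takes the incoming values and is marked "Updated", and record 3 is added as "New".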
ivanychev
by Contributor II
  • 1276 Views
  • 0 replies
  • 1 kudos

How to enable remote JMX monitoring in Databricks?

Adding these options

EXTRA_JAVA_OPTIONS = (
    '-Dcom.sun.management.jmxremote.port=9999',
    '-Dcom.sun.management.jmxremote.authenticate=false',
    '-Dcom.sun.management.jmxremote.ssl=false',
)

is enough in vanilla Apache Spark, but apparently it ...

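
For readers unfamiliar with how such a tuple is usually consumed, here is a minimal sketch of joining the flags into a single JVM option string. The tuple is copied from the post; the variable names `java_opts` and `spark_conf` are hypothetical, and targeting the driver JVM through `spark.driver.extraJavaOptions` is an assumption about the poster's setup. Whether this is sufficient on Databricks is exactly the open question of the post, so this is not a confirmed solution.

```python
# Tuple copied from the post.
EXTRA_JAVA_OPTIONS = (
    '-Dcom.sun.management.jmxremote.port=9999',
    '-Dcom.sun.management.jmxremote.authenticate=false',
    '-Dcom.sun.management.jmxremote.ssl=false',
)

# JVM flags are passed to Spark as one space-separated string.
java_opts = " ".join(EXTRA_JAVA_OPTIONS)

# Assumption: the flags are meant for the driver JVM.
# spark.driver.extraJavaOptions is the standard Spark property for that.
spark_conf = {"spark.driver.extraJavaOptions": java_opts}
```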
Sree_Patllola
by New Contributor
  • 1477 Views
  • 0 replies
  • 0 kudos

I am in the process of connecting to X vendor to pull back the data needed from that vendor.

For that we have shared our Azure IP address (no VPN or corporate IP address available as of now, still the initial stages of the project) with X vendor, and it is now whitelisted. Now I am trying to set up the X vendor API in Databricks to lookup into...

jay548
by New Contributor
  • 1142 Views
  • 0 replies
  • 0 kudos

ERROR yarn.ApplicationMaster: - Wrong FS s3:// expected s3a://

We migrated from HDP to Cloudera platform 7. Everything works except when we try to use Databricks with Redshift to load the data into a Redshift table. We get the following error: ERROR yarn.ApplicationMaster: User class threw exception: java.lang....

AlexDavies
by Contributor
  • 8905 Views
  • 7 replies
  • 2 kudos

Resolved! How to upgrade internal hive metadata store version

Is it possible to upgrade the out-of-the-box Hive metastore version? Running spark.conf.get("spark.sql.hive.metastore.version") indicates that it is running on 0.13.0. However, https://docs.microsoft.com/en-us/azure/databricks/release-notes/runtime/7....

Latest Reply
pantelis_mare
Contributor III
  • 2 kudos

Hello guys! Atanu's post, although correct, does not solve the problem. Is there any official documentation on how to upgrade the internal Databricks metastore to a greater version? If this is available then we can try Atanu's solution (not sure if need...

  • 2 kudos
6 More Replies
159312
by New Contributor III
  • 2437 Views
  • 1 replies
  • 1 kudos

Resolved! How to get autoloader to load files in order

I'm new to Spark and Databricks and I'm trying to write a pipeline to take CDC data from a Postgres database stored in S3 and ingest it. The file names are numerically ascending unique ids based on datetime (i.e. 20220630-215325970.csv). Right now auto...

Latest Reply
Noopur_Nigam
Databricks Employee
  • 1 kudos

Hi @Ben Bogart, for lexicographically generated files, Auto Loader can leverage the lexical file ordering and optimized listing APIs. For more info on lexical ordering, please see the link below: https://docs.databricks.com/ingestion/auto-loade...

  • 1 kudos
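
Why lexical ordering matters for file names like the one in the question can be illustrated with a small standard-library sketch. It assumes the name encodes a zero-padded timestamp of the form yyyymmdd-HHMMSS followed by three millisecond digits, which is a guess based on the single example given; the helper `parse_name` and the extra file names are hypothetical. The sketch only shows a property of the names themselves and makes no claim about the order in which Auto Loader actually processes files.

```python
from datetime import datetime


def parse_name(file_name):
    """Parse a name such as 20220630-215325970.csv into a datetime.

    Assumes a zero-padded yyyymmdd-HHMMSSmmm stem (hypothetical format).
    """
    stem = file_name.rsplit(".", 1)[0]
    return datetime.strptime(stem, "%Y%m%d-%H%M%S%f")


# The first name is from the question; the others are made up.
names = [
    "20220630-215325970.csv",
    "20220629-235959001.csv",
    "20220630-090000123.csv",
]

# Plain string sort versus sort by the parsed timestamp.
lexical = sorted(names)
chronological = sorted(names, key=parse_name)
```

Because every field is fixed-width and zero-padded, sorting the names as strings gives the same order as sorting by the parsed timestamps.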
fs
by New Contributor III
  • 9346 Views
  • 12 replies
  • 9 kudos

Resolved! How to access data objects from different languages [R/SQL/Spark/Python]

Hi, sorry, new to Spark and Databricks. Please could someone summarise the options for moving data between these different languages? Especially interested in R<=>Python options; I can see how to do SQL/Spark. Spent a lot of time googling but no result. Presume can u...

Latest Reply
Noopur_Nigam
Databricks Employee
  • 9 kudos

@Fernley Symons​ Thank you for your prompt reply. Apologies, we have just noticed that an answer is already marked as best. Thank you once again.

  • 9 kudos
11 More Replies
Andy_EU
by New Contributor
  • 1236 Views
  • 2 replies
  • 0 kudos

How do you do if/then statements in Delta Live Tables pipelines?

How do you do if/then statements in Python-based Delta Live Tables pipelines? I'm essentially looking for the Python way of doing CASE statements.

Latest Reply
Noopur_Nigam
Databricks Employee
  • 0 kudos

Hi @Andy Pandy, I hope the answer provided by @Jose Gonzalez helped resolve your query. Please let us know if you still have questions.

  • 0 kudos
1 More Replies
User16783853906
by Contributor III
  • 3321 Views
  • 5 replies
  • 5 kudos

Resolved! Update code for a streaming job in Production

How to update a streaming job in production with minimal/no downtime when there are significant code changes that may not be compatible with the existing checkpoint state to resume the stream processing?

Latest Reply
Anonymous
Not applicable
  • 5 kudos

Thanks for the information, I will try to look into it further. Please keep sharing and suggesting such informative posts. MA Health Connector

  • 5 kudos
4 More Replies
vk217
by Contributor
  • 5009 Views
  • 3 replies
  • 1 kudos

Resolved! ERROR: No matching distribution found for databricks-connect==7.3.34

Previously, our builds used databricks-connect 7.3.34 in pipenv and were successful. As of today, the builds are failing with an error that version 7.3.34 no longer exists. Is there a reason this version is no longer supported....

Latest Reply
Atanu
Databricks Employee
  • 1 kudos

Hello @Vikas B, this is the release note: https://docs.databricks.com/release-notes/dbconnect/index.html
Also, only the following Databricks Runtime versions are supported: Databricks Runtime 10.4 LTS ML, Databricks Runtime 10.4 LTS, Databricks Runtime 9....

  • 1 kudos
2 More Replies
sohamdhodapkar
by New Contributor
  • 1390 Views
  • 3 replies
  • 3 kudos
Latest Reply
Noopur_Nigam
Databricks Employee
  • 3 kudos

Hi @Soham Dhodapkar, see https://docs.databricks.com/lakehouse/index.html. This document describes the components of the lakehouse shown in the image shared by @Hubert Dudek.

  • 3 kudos
2 More Replies
codevisionz
by New Contributor
  • 543 Views
  • 0 replies
  • 0 kudos

Our Python Code Examples cover basic concepts, control structures, functions, lists, classes, objects, inheritance, polymorphism, file operations, da...

Our Python Code Examples cover basic concepts, control structures, functions, lists, classes, objects, inheritance, polymorphism, file operations, data structures, sorting algorithms, mathematical functions, mathematical sequences, threads, exceptio...

Taha_Hussain
by Databricks Employee
  • 1169 Views
  • 0 replies
  • 8 kudos

Databricks Office Hours Register for Office Hours to participate in a LIVE Q&A session and receive technical support directly from Databricks expe...

Databricks Office Hours: Register for Office Hours to participate in a LIVE Q&A session and receive technical support directly from Databricks experts! Our next event is scheduled for July 27th from 8:00am - 9:00am PT | 3:00pm - 4:00pm GMT. Whether you ...

EveryDayData
by Contributor
  • 1699 Views
  • 1 replies
  • 1 kudos

MergeSchema on Delta Streaming

Hi guys, quick question: can we use mergeSchema in update mode in streaming, or is it overwriteSchema when using update mode?

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hey @Shikher Singh, hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Tha...

  • 1 kudos
