Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

celerity12
by New Contributor II
  • 7689 Views
  • 7 replies
  • 4 kudos

Pulling list of running jobs using JOBS API 2.1

I need to find out all jobs which are currently running, and not get other jobs. The below command fetches all the jobs: curl --location --request GET 'https://xxxxxx.gcp.databricks.com/api/2.1/jobs/list?active_only=true&expand_tasks=true&run_type=JOB_RUN...

Latest Reply
User16764241763
Databricks Employee
  • 4 kudos

Hi @Sumit Rohatgi, it seems like active_only=true only applies to the jobs/runs/list API and not to jobs/list. Can you please try the jobs/runs/list API?
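
For anyone landing here, a minimal sketch of that suggestion in Python (the workspace URL is the placeholder from the question; the token environment variable is an assumption):

import os
import requests

host = "https://xxxxxx.gcp.databricks.com"  # placeholder workspace URL from the question
headers = {"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"}  # assumed PAT env var

# jobs/runs/list honours active_only=true and returns only runs that are
# currently queued, pending, or running.
resp = requests.get(
    f"{host}/api/2.1/jobs/runs/list",
    headers=headers,
    params={"active_only": "true", "expand_tasks": "true"},
)
resp.raise_for_status()
for run in resp.json().get("runs", []):
    print(run["job_id"], run["run_id"], run["state"]["life_cycle_state"])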

6 More Replies
C_1
by New Contributor III
  • 5679 Views
  • 5 replies
  • 4 kudos

Resolved! Databricks notebook command logging

Hello Community, I am trying to find a Databricks notebook command logging feature for compliance purposes. My requirement is to log the exact Spark SQL fired by a user. I didn't see Spark SQL (notebook commands) tracked under the Azure diagnostic logs...

Latest Reply
Noopur_Nigam
Databricks Employee
  • 4 kudos

Hi @C P, we don't have this feature implemented; however, there is an existing idea in our idea portal here: https://databricks.aha.io/features/DB-7583. You can check and vote for it.
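
Until such a feature exists, one possible stop-gap (not a Databricks feature, just a sketch; the logger name and helper are hypothetical) is to route SQL through a thin wrapper that logs the statement first:

import logging

logging.basicConfig(level=logging.INFO)
audit_log = logging.getLogger("sql_audit")  # hypothetical logger name

def logged_sql(statement):
    # Log the exact SQL text, then hand it to the notebook's Spark session.
    audit_log.info("SQL executed: %s", statement)
    return spark.sql(statement)

# Usage (table name hypothetical): df = logged_sql("SELECT * FROM my_table")

This only captures commands that go through the wrapper, so it is no substitute for workspace-level audit logging.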

4 More Replies
CHANDY
by New Contributor
  • 1763 Views
  • 1 reply
  • 0 kudos

Real-time data processing

Say I am getting a customer record from a website. I want to read the message and then insert/update it into a Snowflake table. Depending on whether the insert/update is successful, I need to respond back with a success/failure message in, say, 1 sec. ...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hey @CHANDAN NANDY, just checking in with you. Does @Kaniz Fatma's answer help? If it does, would you be happy to mark it as best? If it doesn't, please tell us so we can help you further. Thanks!

JohnB
by New Contributor II
  • 3695 Views
  • 1 reply
  • 1 kudos

Are there implications to moving Managed Tables and mounting them as External?

The scenario is: "A substantial amount of data needs to be moved from a legacy Databricks workspace that has Managed Tables to a new E2 workspace. The new bucket will be a dedicated Data Lake rather than the Workspace Bucket, so they will be External Tables." U...
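
One possible approach, sketched below with hypothetical names and assuming the tables are Delta and the new workspace can read the legacy data: DEEP CLONE copies the data files into the new external location, so the clone is no longer tied to the workspace bucket.

spark.sql("""
    CREATE TABLE new_db.my_table                  -- hypothetical target table
    DEEP CLONE legacy_db.my_table                 -- hypothetical managed source
    LOCATION 's3://dedicated-datalake/my_table'   -- hypothetical external path
""")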

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hey there @John Brandborg, hope everything is going great! Just wanted to check in: if you were able to resolve your issue, would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to ...

Ravi96
by New Contributor II
  • 5147 Views
  • 4 replies
  • 5 kudos

How can we sort out the timeout issue in Databricks?

We are creating a denormalized table based on a JSON ingestion, but a complex table is getting generated. When we try to flatten the JSON rows it takes more than 5 hours, and the error message is a timeout error. Is there any way that we could resolv...
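
As an illustration of the flattening step itself, a minimal sketch with hypothetical table and column names; exploding the nested array once and promoting the struct fields keeps the plan much smaller than repeated self-joins:

from pyspark.sql.functions import col, explode

raw = spark.table("raw_json")  # hypothetical ingested table
flat = (
    raw
    .select(col("id"), explode(col("items")).alias("item"))  # un-nest the array
    .select("id", "item.*")                                  # promote struct fields
)
flat.write.mode("overwrite").saveAsTable("denorm_table")     # hypothetical target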

Latest Reply
Anonymous
Not applicable
  • 5 kudos

Hey @Raviteja Paluri, hope all is well! Just wanted to check in: if you were able to resolve your issue, would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. Thanks!

3 More Replies
ta_db
by New Contributor
  • 2160 Views
  • 1 reply
  • 0 kudos

Databricks SQL Endpoint Failing to create an external table on a parquet file with Decimal or Timestamp datatype

I'm using the Databricks SQL Endpoint and I'm attempting to create an external table on top of an existing parquet file. I can do this so long as my table definition does not include a reference to a decimal or timestamp/date datatype. For example, this works: C...
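
The truncated example above presumably looked something like the following (a reconstruction only, shown via spark.sql for consistency with the other sketches; table names, path, and columns are hypothetical):

# Reported to work: no decimal/timestamp columns.
spark.sql("""
    CREATE TABLE ext_ok (id STRING, name STRING)
    USING PARQUET LOCATION 's3://bucket/path/'
""")

# Reported to fail on the SQL endpoint: decimal/timestamp columns.
spark.sql("""
    CREATE TABLE ext_bad (id STRING, amount DECIMAL(18,2), ts TIMESTAMP)
    USING PARQUET LOCATION 's3://bucket/path/'
""")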

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hey there @T A, hope everything is going great! Does @Kaniz Fatma's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? If not, would you be happy to give us more info...

577391
by New Contributor II
  • 3639 Views
  • 2 replies
  • 0 kudos

Resolved! How do I merge two tables and track changes to missing rows as well as new rows

In my scenario, the new data coming in are the current, valid records. Any records that are not in the new data should be labeled "Gone", any matching records should be labeled "Updated", and finally, any new records should be added. So in sum...

Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

Detecting deletions does not work out of the box. The merge statement will evaluate the incoming data against the existing data; it will not check the existing data against the incoming data. To mark deletions, you will have to specifically update tho...
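
A minimal sketch of that two-step approach with hypothetical table and column names (on recent runtimes, MERGE ... WHEN NOT MATCHED BY SOURCE can fold the second step into the first):

# Step 1: upsert — label matches "Updated", insert new rows as "New".
spark.sql("""
    MERGE INTO target t
    USING incoming s
    ON t.id = s.id
    WHEN MATCHED THEN UPDATE SET t.value = s.value, t.status = 'Updated'
    WHEN NOT MATCHED THEN INSERT (id, value, status) VALUES (s.id, s.value, 'New')
""")

# Step 2: rows that no longer appear in the incoming data get labelled "Gone".
spark.sql("""
    UPDATE target SET status = 'Gone'
    WHERE id NOT IN (SELECT id FROM incoming)
""")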

1 More Replies
ivanychev
by Contributor II
  • 1748 Views
  • 0 replies
  • 1 kudos

How to enable remote JMX monitoring in Databricks?

Adding these options: EXTRA_JAVA_OPTIONS = ('-Dcom.sun.management.jmxremote.port=9999', '-Dcom.sun.management.jmxremote.authenticate=false', '-Dcom.sun.management.jmxremote.ssl=false') is enough in vanilla Apache Spark, but apparently it ...
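
On Databricks, these flags generally have to go into the cluster's Spark config (set at cluster creation) rather than an environment variable; a sketch of the keys involved, with the port carried over from the question (whether it is reachable depends on the workspace's network setup):

# Standard Spark properties; paste the key/value pairs into the cluster's
# "Advanced options > Spark > Spark config" box (one per line).
spark_conf = {
    "spark.driver.extraJavaOptions": (
        "-Dcom.sun.management.jmxremote.port=9999 "
        "-Dcom.sun.management.jmxremote.authenticate=false "
        "-Dcom.sun.management.jmxremote.ssl=false"
    ),
}
# For executor JVMs, set spark.executor.extraJavaOptions analogously.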

Sree_Patllola
by New Contributor
  • 1824 Views
  • 0 replies
  • 0 kudos

I am in the process of connecting to X vendor and pulling back the data needed from that vendor.

For that we have shared our Azure IP address (no VPN or corporate IP address available as of now; still the initial stages of the project) with X vendor, which is whitelisted now. Now I am trying to set up the X vendor API in Databricks to look into...

jay548
by New Contributor
  • 1631 Views
  • 0 replies
  • 0 kudos

ERROR yarn.ApplicationMaster: - Wrong FS s3:// expected s3a://

We migrated from HDP to Cloudera Platform 7. Everything works except when we try to use Databricks with Redshift to load the data into a Redshift table; we get the following error: ERROR yarn.ApplicationMaster: User class threw exception: java.lang....
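
One common workaround (standard Hadoop/Spark properties, not Databricks-specific; sketched here at session-build time) is to map the s3:// scheme onto the S3A implementation so both URI styles resolve the same way:

from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .config("spark.hadoop.fs.s3.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
    .config("spark.hadoop.fs.s3n.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
    .getOrCreate()
)
# Alternatively, rewrite the tempdir and table paths passed to the Redshift
# connector to use s3a:// URIs directly.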

AlexDavies
by Contributor
  • 11029 Views
  • 7 replies
  • 2 kudos

Resolved! How to upgrade internal Hive metastore version

Is it possible to upgrade the out-of-the-box Hive metastore version? Running spark.conf.get("spark.sql.hive.metastore.version") indicates that it is running 0.13.0. However, https://docs.microsoft.com/en-us/azure/databricks/release-notes/runtime/7....

Latest Reply
pantelis_mare
Contributor III
  • 2 kudos

Hello guys! Atanu's post, although correct, does not solve the problem. Is there any official documentation on how to upgrade the internal Databricks metastore to a greater version? If this is available then we can try Atanu's solution (not sure if need...
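
For reference, the documented mechanism is to pin a newer Hive client in the cluster's Spark config (typically together with an external metastore, rather than an in-place upgrade of the internal one); the version below is only an example:

# Goes into the cluster's "Spark config" box at cluster creation.
spark_conf = {
    "spark.sql.hive.metastore.version": "2.3.9",  # example target version
    "spark.sql.hive.metastore.jars": "maven",     # fetch matching client jars
}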

6 More Replies
159312
by New Contributor III
  • 3539 Views
  • 1 reply
  • 1 kudos

Resolved! How to get autoloader to load files in order

I'm new to Spark and Databricks, and I'm trying to write a pipeline to take CDC data from a Postgres database, stored in S3, and ingest it. The file names are numerically ascending unique IDs based on datetime (i.e. 20220630-215325970.csv). Right now auto...

Latest Reply
Noopur_Nigam
Databricks Employee
  • 1 kudos

Hi @Ben Bogart, for lexicographically generated files, Auto Loader can leverage the lexical file ordering and optimized listing APIs. For more info on lexical ordering please go through the below link: https://docs.databricks.com/ingestion/auto-loade...
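
A minimal Auto Loader sketch assuming lexically ordered names like 20220630-215325970.csv (paths and schema are hypothetical); note that incremental listing optimises file discovery but does not by itself guarantee strict per-file processing order:

df = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "csv")
    # Lexical file ordering lets Auto Loader list only newly arrived files
    # ("auto" is the default; "true" forces incremental listing).
    .option("cloudFiles.useIncrementalListing", "true")
    .schema("id string, updated_at timestamp")  # hypothetical schema
    .load("s3://my-bucket/cdc/")                # hypothetical source path
)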

fs
by New Contributor III
  • 13653 Views
  • 12 replies
  • 9 kudos

Resolved! How to access data objects from different languages [R/SQL/Spark/Python]

Hi, sorry, I'm new to Spark and Databricks. Could someone please summarise the options for moving data between these different languages? I'm especially interested in R<=>Python options; I can see how to do SQL/Spark. I've spent a lot of time googling but with no result. Presume can u...
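
The usual bridge between languages in a notebook is a temp view: register it from one language and query it from any other. A sketch from the Python side, with hypothetical names:

df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "val"])
df.createOrReplaceTempView("shared_view")  # now visible to %sql, %r, %scala cells

# e.g. in an %r cell: df_r <- SparkR::sql("SELECT * FROM shared_view")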

Latest Reply
Noopur_Nigam
Databricks Employee
  • 9 kudos

@Fernley Symons Thank you for your prompt reply. Apologies, we have just noticed that an answer is already marked as best. Thank you once again.

11 More Replies
Andy_EU
by New Contributor
  • 1779 Views
  • 2 replies
  • 0 kudos

How do you do if/then statements in Delta Live Tables pipelines?

How do you do if/then statements in Python-based Delta Live Tables pipelines? I'm essentially looking for the Python way of doing CASE statements.
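
In Python the CASE-statement equivalent is pyspark.sql.functions.when/otherwise, which works inside a Delta Live Tables table definition; a sketch with hypothetical table and column names:

import dlt
from pyspark.sql import functions as F

@dlt.table
def customers_labeled():
    return (
        dlt.read("customers_raw")                  # hypothetical source table
        .withColumn(
            "tier",
            F.when(F.col("spend") > 1000, "gold")  # CASE WHEN spend > 1000 THEN 'gold'
             .when(F.col("spend") > 100, "silver") # WHEN spend > 100 THEN 'silver'
             .otherwise("bronze"),                 # ELSE 'bronze'
        )
    )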

Latest Reply
Noopur_Nigam
Databricks Employee
  • 0 kudos

Hi @Andy Pandy, I hope that the answer provided by @Jose Gonzalez has helped resolve your query. Please let us know if you still have doubts or queries.

1 More Replies
