Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

HariharaSam
by Contributor
  • 23113 Views
  • 8 replies
  • 4 kudos

Resolved! To get Number of rows inserted after performing an Insert operation into a table

Consider we have two tables, A and B.
qry = """INSERT INTO Table A
Select * from Table B where Id is null """
spark.sql(qry)
I need to get the number of records inserted after running this in Databricks.

Latest Reply
GRCL
New Contributor III
  • 4 kudos

Almost the same advice as Hubert: I use the history of the Delta table:
df_history.select(F.col('operationMetrics')).collect()[0].operationMetrics['numOutputRows']
You can also find other 'operationMetrics' values, such as 'numTargetRowsDeleted'.
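
The history-based approach in this reply can be sketched roughly as follows. This is a minimal sketch, not the poster's exact code: the helper names `num_output_rows` and `last_write_row_count` are hypothetical, and it assumes the most recent entry in the table history is the insert you just ran (which may not hold if other jobs write to the same table concurrently). Delta stores operation metrics as strings, so the value is converted to an integer.

```python
from typing import Mapping, Optional


def num_output_rows(operation_metrics: Mapping[str, str]) -> Optional[int]:
    """Return the 'numOutputRows' metric as an int, or None if it is absent."""
    value = operation_metrics.get("numOutputRows")
    return int(value) if value is not None else None


def last_write_row_count(spark, table_name: str) -> Optional[int]:
    """Read the newest history entry of a Delta table and extract its row count.

    `spark` is an active SparkSession; `table_name` is a placeholder for your table.
    """
    latest = spark.sql(f"DESCRIBE HISTORY {table_name} LIMIT 1").collect()[0]
    return num_output_rows(latest["operationMetrics"])
```

Calling `last_write_row_count(spark, "A")` right after `spark.sql(qry)` would then return the number of rows written by that insert, under the assumption stated above.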

  • 4 kudos
7 More Replies
Merchiv
by New Contributor III
  • 10327 Views
  • 8 replies
  • 2 kudos

Resolved! AnalysisException when running SQL queries

When running some SQL queries using spark.sql(...), we sometimes get a variant of the following error: AnalysisException: Undefined function: current_timestamp. This function is neither a built-in/temporary function, nor a persistent function that is ...

Latest Reply
ashish1
New Contributor III
  • 2 kudos

This is most likely a library conflict. You can uninstall libraries from your cluster one at a time to narrow it down to the problematic one.

  • 2 kudos
7 More Replies
siddharthk
by New Contributor II
  • 1479 Views
  • 2 replies
  • 2 kudos

Resolved! Reduce downtime of Postgres table - JDBC overwrite job

I want to overwrite a PostgreSQL table, transactionStats, which is used by the customer-facing dashboards. This table needs to be updated every 30 mins. I am writing an AWS Glue Spark job via JDBC connection to perform this operation. Spark dataframe writ...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Siddharth Kanojiya, we haven't heard from you since the last response from @werners (Customer). Kindly share the information with us, and in return, we will provide you with the necessary solution. Thanks and Regards

  • 2 kudos
1 More Replies
Pras1
by New Contributor II
  • 7517 Views
  • 2 replies
  • 2 kudos

Resolved! AZURE_QUOTA_EXCEEDED_EXCEPTION - even with more vCPUs than Databricks recommends

I am running this Delta Live Tables PoC from databricks-industry-solutions/industry-solutions-blueprints: https://github.com/databricks-industry-solutions/pos-dlt I have Standard_DS4_v2 with 28GB and 8 cores x 2 workers - so a total of 16 cores. This is...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Prasenjit Biswas, we haven't heard from you since the last response from @Jose Gonzalez. Kindly share the information with us, and in return, we will provide you with the necessary solution. Thanks and Regards

  • 2 kudos
1 More Replies
Woody
by New Contributor II
  • 1065 Views
  • 1 reply
  • 2 kudos

This site has a lot of issues....

I need to log in to try Databricks and I can't, so you have an OAuth issue... I can't try Databricks at all because the country icon doesn't work and it sends a URI issue from your front end to the back end... "Request_URI=&Geo_country_code=&Geo_country_i...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Jessica Woods, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers...

  • 2 kudos
Westinghouse
by New Contributor II
  • 6273 Views
  • 7 replies
  • 3 kudos

Detectron2 install

I have been struggling to install Detectron2. I think it is an issue with CUDA. Any advice?
Install:
!pip install -q "detectron2@git+https://github.com/facebookresearch/detectron2.git@e2ce8dc#egg=detectron2"
Error:
qm8vsdal/detectron2_85faeed5ce7945dbad...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Joshua Roberge, sorry for the inconvenience! Kindly review the solution offered by @Suteja Kanuri.

  • 3 kudos
6 More Replies
ranged_coop
by Valued Contributor II
  • 18910 Views
  • 21 replies
  • 21 kudos

Resolved! How to access a jar file stored in Databricks Workspace ?

Hi All, We have a couple of jars stored in a workspace folder. We are using init scripts to copy the jars in the workspace to the /databricks/jars path. The init scripts do not seem to be able to find the files. The scripts are failing saying the files ...

Latest Reply
Anonymous
Not applicable
  • 21 kudos

Hi @Bharath Kumar Ramachandran​ You're welcome! I'm glad you found the link useful. I empathize with your hope that Databricks would consider adding this option. It's possible that Databricks will take user feedback into account when planning future ...

  • 21 kudos
20 More Replies
sriram_kumar
by New Contributor II
  • 2241 Views
  • 4 replies
  • 5 kudos

To do Optimization on the real time delta table

Hi Team, We have a few prod tables, created in an S3 bucket, that have now grown very large. These tables receive real-time data continuously from round-the-clock Databricks workflows. We would like to run the optimization commands (optimize, zord...

Latest Reply
Anonymous
Not applicable
  • 5 kudos

Hi @Sriram Kumar, we haven't heard from you since the last response from @Suteja Kanuri. Kindly share the information with us, and in return, we will provide you with the necessary solution. Thanks and Regards

  • 5 kudos
3 More Replies
jole3112
by New Contributor III
  • 7801 Views
  • 7 replies
  • 9 kudos

virtual environment on azure databricks compute cluster

I'm using Azure Databricks and I'd like to create a project virtual environment, persisted on a shared compute cluster. As the cluster is shared for many projects, it is necessary to have virtual environments if I want to execute code runs from withi...

Latest Reply
Anonymous
Not applicable
  • 9 kudos

Hi @Joshua L, we haven't heard from you since the last response from @Debayan Mukherjee, and I was checking back to see if her suggestions helped you. Otherwise, if you have any solution, please share it with the community, as it can be helpful to ot...

  • 9 kudos
6 More Replies
Matt1209
by New Contributor II
  • 1207 Views
  • 1 reply
  • 3 kudos

How to execute requests later for a number of times that exceeds the Maximum concurrent runs?

I am trying to start the same job multiple times using the Python SDK's "run_now" command. If the number of requests exceeds the Maximum concurrent runs, the status of the run will be Skipped and the run will not be executed. Is there any way to queue...

Latest Reply
Debayan
Databricks Employee
  • 3 kudos

Hi, We do have a private preview feature which will be enabled shortly for queueing. Please tag me (@Debayan Mukherjee​ ) with your next update so that I will get notified.
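
Independently of the preview feature mentioned in this reply, one client-side workaround is to hold requests back and resubmit any that were skipped. The following is only a sketch under stated assumptions: `start_run` is a hypothetical callable that you would write around the SDK's "run_now" call and that returns the resulting run status as a string; the `"SKIPPED"` value and the helper name `drain_with_retry` are illustrative and not part of any Databricks API.

```python
import time
from collections import deque
from typing import Callable, Deque, Iterable, List


def drain_with_retry(
    requests: Iterable[str],
    start_run: Callable[[str], str],
    wait_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Submit requests in order, retrying any request whose run was skipped.

    `start_run` is a hypothetical wrapper that triggers one job run and
    returns its status; a status of "SKIPPED" means the run did not start.
    """
    pending: Deque[str] = deque(requests)
    started: List[str] = []
    while pending:
        request = pending.popleft()
        if start_run(request) == "SKIPPED":
            # Put the request back at the front and wait before retrying.
            pending.appendleft(request)
            sleep(wait_seconds)
        else:
            started.append(request)
    return started
```

This loop retries indefinitely, so in practice you would likely add a maximum attempt count or an overall timeout.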

  • 3 kudos
sevvalmehder
by New Contributor II
  • 2167 Views
  • 3 replies
  • 3 kudos

Databricks run-time 12.2 LTS drop function problem

I am getting an error from PySpark's `drop` function on a cluster running 12.2 LTS. Looking into the error, I see Spark fixed that bug; see SPARK-42444. Also, when I checked the maintenance updates page, I saw this fix included in the Databricks R...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Sevval Mehder​ Elevate our community by acknowledging exceptional contributions. Your participation in marking the best answers is a testament to our collective pursuit of knowledge.

  • 3 kudos
2 More Replies
RamdasP
by New Contributor
  • 1526 Views
  • 2 replies
  • 3 kudos

Resolved! Implement & Test DR Plan

Hi, Can you direct me to any documentation on how to implement and test Disaster Recovery for Databricks (PaaS) on Azure? Thx & Rgds, Ramdas

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Ramdas Panicher, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answe...

  • 3 kudos
1 More Replies
ayush1900
by New Contributor II
  • 1362 Views
  • 1 reply
  • 1 kudos
Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Ayush Raj, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers you...

  • 1 kudos
reachbharathan
by New Contributor III
  • 2826 Views
  • 3 replies
  • 4 kudos

Resolved! How to check out a specific commit version via the Databricks UI

I have integrated GitLab with my Azure Databricks repo. I am able to push and pull commits from the Databricks UI, and I want to check out a specific commit version via the Databricks UI. Note: I am aware that via the gitlab i have checkout to specific vers...

Latest Reply
reachbharathan
New Contributor III
  • 4 kudos

After getting more context on Databricks Repos in detail: currently, Databricks doesn't support checking out a repo at a specific commit. Databricks provides only the limited functionality mentioned below:
  • Add a repo and connect remotely later
  • Clone a repo connecte...

  • 4 kudos
2 More Replies
fhmessas
by New Contributor II
  • 1784 Views
  • 2 replies
  • 2 kudos

Trigger.AvailableNow getting stuck when there is no event

Hi, I have several streaming jobs; one of them uses Trigger.AvailableNow. The issue is that it gets stuck when there are no events or after it finishes ingesting all events. The expected behavior would be for the job to shut down. I've already checked...

Stuck streaming
Latest Reply
fhmessas
New Contributor II
  • 2 kudos

Hi, the source is an S3 bucket using file notification with SQS. There are no errors or warnings in the logs; the AvailableNow trigger just gets stuck.

  • 2 kudos
1 More Replies
