cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Forum Posts

Jyo777
by Contributor
  • 5049 Views
  • 7 replies
  • 4 kudos

need help with Azure Databricks questions on CTE and SQL syntax within notebooks

Hi amazing community folks,Feel free to share your experience or knowledge regarding below questions:-1.) Can we pass a CTE sql statement into spark jdbc? i tried to do it i couldn't but i can pass normal sql (Select * from ) and it works. i heard th...

  • 5049 Views
  • 7 replies
  • 4 kudos
Latest Reply
Rjdudley
New Contributor III
  • 4 kudos

Not a comparison, but there is a DB-SQL cheatsheet at https://www.databricks.com/sites/default/files/2023-09/databricks-sql-cheatsheet.pdf/

  • 4 kudos
6 More Replies
183530
by New Contributor III
  • 1303 Views
  • 2 replies
  • 2 kudos

How to search an array of words in a text field

Example:TABLE 1FIELD_TEXTI like salty food and Italian foodI have Italian foodbread, rice and beansmexican foodscoke, spritearray['italia', 'mex','coke']match TABLE1 X ARRAYResults:I like salty food and Italian foodI have Italian foodmexican foodsis ...

  • 1303 Views
  • 2 replies
  • 2 kudos
Latest Reply
Meredithharper
New Contributor
  • 2 kudos

Yes, you can do it in SQL with LIKE or IN and in PySpark using array contains, ideal for filtering Words like halal catering Barcelona, catering, and many more

  • 2 kudos
1 More Replies
Sen
by New Contributor
  • 8176 Views
  • 9 replies
  • 1 kudos

Resolved! Performance enhancement while writing dataframes into Parquet tables

Hi,I am trying to write the contents of a dataframe into a parquet table using the command below.df.write.mode("overwrite").format("parquet").saveAsTable("sample_parquet_table")The dataframe contains an extract from one of our source systems, which h...

  • 8176 Views
  • 9 replies
  • 1 kudos
Latest Reply
jhoon
New Contributor
  • 1 kudos

Great discussion on performance optimization! Managing technical projects like these alongside academic work can be demanding. If you need expert academic support to free up time for your professional pursuits, Dissertation Help Services is here to a...

  • 1 kudos
8 More Replies
dimsh
by Contributor
  • 13810 Views
  • 13 replies
  • 10 kudos

How to overcome missing query parameters in Databricks SQL?

Hi, there! I'm trying to build up my first dashboard based on Dataabricks SQL. As far as I can see if you define a query parameter you can't skip it further. I'm looking for any option where I can make my parameter optional. For instance, I have a ta...

  • 13810 Views
  • 13 replies
  • 10 kudos
Latest Reply
techg
New Contributor II
  • 10 kudos

Is there any solution for the above mentioned post?

  • 10 kudos
12 More Replies
noimeta
by Contributor III
  • 4777 Views
  • 9 replies
  • 4 kudos

Resolved! Databricks SQL: catalog of each query

Currently, we are migrating from hive metastore to UC. We have several dashboards and a huge number of queries whose catalogs have been set to hive_metastore and using <db>.<table> access pattern.I'm just wondering if there's a way to switch catalogs...

  • 4777 Views
  • 9 replies
  • 4 kudos
Latest Reply
h_h_ak
Contributor
  • 4 kudos

May you can also have a look here, if you need hot fix  https://github.com/databrickslabs/ucx 

  • 4 kudos
8 More Replies
swzzzsw
by New Contributor III
  • 4477 Views
  • 6 replies
  • 0 kudos

Resolved! SQLServerException: deadlock

I'm using databricks to connect to a SQL managed instance via JDBC. SQL operations I need to perform include DELETE, UPDATE, and simple read and write. Since spark syntax only handles simple read and write, I had to open SQL connection using Scala an...

image.png
  • 4477 Views
  • 6 replies
  • 0 kudos
Latest Reply
Panda
Valued Contributor
  • 0 kudos

@swzzzsw Since you are performing database operations, to reduce the chances of deadlocks, make sure to wrap your SQL operations inside transactions using commit and rollback.Another approachs to consider is adding retry logic or using Isolation Leve...

  • 0 kudos
5 More Replies
elgeo
by Valued Contributor II
  • 26987 Views
  • 5 replies
  • 2 kudos

SQL Stored Procedure in Databricks

Hello. Is there an equivalent of SQL stored procedure in Databricks? Please note that I need a procedure that allows DML statements and not only Select statement as a function provides.Thank you in advance

  • 26987 Views
  • 5 replies
  • 2 kudos
Latest Reply
Biswajit
New Contributor II
  • 2 kudos

I have recently went through a video link from databricks which says its possible. But, when I tried it, it did not worked.https://www.youtube.com/watch?v=f4TxNBfSNqMWas anyone able to create stored procedure in databricks ?

  • 2 kudos
4 More Replies
Ericsson
by New Contributor II
  • 2910 Views
  • 3 replies
  • 1 kudos

SQL week format issue its not showing result as 01(ww)

Hi Folks,I've requirement to show the week number as ww format. Please see the below codeselect weekofyear(date_add(to_date(current_date, 'yyyyMMdd'), +35)). also plz refre the screen shot for result.

result
  • 2910 Views
  • 3 replies
  • 1 kudos
Latest Reply
Feltonrolfson
New Contributor II
  • 1 kudos

It seems you're encountering an issue with SQL week formatting, where the results aren't displaying as expected (01(ww)). This could impact data analysis. monkey mart

  • 1 kudos
2 More Replies
AlexDavies
by Contributor
  • 7501 Views
  • 9 replies
  • 2 kudos

Report on SQL queries that are being executed

We have a SQL workspace with a cluster running that services a number of self service reports against a range of datasets. We want to be able to analyse and report on the queries our self service users are executing so we can get better visibility of...

  • 7501 Views
  • 9 replies
  • 2 kudos
Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hey there @Alex Davies​ Hope you are doing great. Just checking in if you were able to resolve your issue or do you need more help? We'd love to hear from you.Thanks!

  • 2 kudos
8 More Replies
Graham
by New Contributor III
  • 4629 Views
  • 4 replies
  • 3 kudos

Resolved! Inline comment next to un-tickmarked SET statement = Syntax error

Running this code in databricks SQL works great:SET USE_CACHED_RESULT = FALSE;   -- Result: -- key value -- USE_CACHED_RESULT FALSEIf I add an inline comment, however, I get a syntax error:SET USE_CACHED_RESUL...

  • 4629 Views
  • 4 replies
  • 3 kudos
Latest Reply
rafal_walisko
New Contributor II
  • 3 kudos

Hi, I'm getting the same error when trying to execute statement through API "statement": "SET `USE_CACHED_RESULT` = FALSE; SELECT COUNT(*) FROM TABLE" Every combination fail  "status": { "state": "FAILED", "error": { "e...

  • 3 kudos
3 More Replies
Tico23
by Contributor
  • 14017 Views
  • 12 replies
  • 10 kudos

Connecting SQL Server (on-premise) to Databricks via jdbc:sqlserver

Is it possible to connect to SQL Server on-premise (Not Azure) from Databricks?I tried to ping my virtualbox VM (with Windows Server 2022) from within Databricks and the request timed out.%sh   ping 122.138.0.14This is what my connection might look l...

  • 14017 Views
  • 12 replies
  • 10 kudos
Latest Reply
BharathKumarS
New Contributor II
  • 10 kudos

I tried to connect to localhost sql server through databricks community edition, but it failed. I have created an IP rule on port 1433 allowed inbound connection from all public network, but still didn't connect. I tried locally using python its work...

  • 10 kudos
11 More Replies
prasadvaze
by Valued Contributor II
  • 20739 Views
  • 15 replies
  • 12 kudos

Resolved! How to query delta lake using SQL desktop tools like SSMS or DBVisualizer

Is there a way to use sql desktop tools? because delta OSS or databricks does not provide desktop client (similar to azure data studio) to browse and query delta lake objects.I currently use databricks SQL , a webUI in the databricks workspace but se...

  • 20739 Views
  • 15 replies
  • 12 kudos
Latest Reply
prasadvaze
Valued Contributor II
  • 12 kudos

DSR is Delta Standalone Reader. see more here - https://docs.delta.io/latest/delta-standalone.htmlIts a crate (and also now a py library) that allows you to connect to delta tables without using spark (e.g. directly from python and not using pyspa...

  • 12 kudos
14 More Replies
pranathisg97
by New Contributor III
  • 6086 Views
  • 3 replies
  • 0 kudos

Control query caching using SQL statement execution API

I want to execute this statement using databricks SQL Statement Execution API. curl -X POST -H 'Authorization: Bearer <access-token>' -H 'Content-Type: application/json' -d '{"warehouse_id": "<warehouse_id>", "statement": "set us...

image.png
  • 6086 Views
  • 3 replies
  • 0 kudos
Latest Reply
SumitAgrawal454
New Contributor II
  • 0 kudos

Looking for solution as I am facing the exact same problem

  • 0 kudos
2 More Replies
VVM
by New Contributor III
  • 15547 Views
  • 13 replies
  • 3 kudos

Resolved! Databricks SQL - Unable to Escape Dollar Sign ($) in Column Name

It seems that due to how Databricks processes SQL cells, it's impossible to escape the $ when it comes to a column name.I would expect the following to work:%sql SELECT 'hi' `$id`The backticks ought to escape everything. And indeed that's exactly wha...

  • 15547 Views
  • 13 replies
  • 3 kudos
Latest Reply
Pfizer
New Contributor II
  • 3 kudos

What is the status of this bug? This is affecting user experience.  

  • 3 kudos
12 More Replies
jonathan-dufaul
by Valued Contributor
  • 3346 Views
  • 6 replies
  • 6 kudos

Why is writing to MSSQL Server 12.0 so slow directly from spark but nearly instant when I write to a csv and read it back

I have a dataframe that inexplicably takes forever to write to an MS SQL Server even though other dataframes, even much larger ones, write nearly instantly. I'm using this code:my_dataframe.write.format("jdbc") .option("url",sqlsUrl) .optio...

  • 3346 Views
  • 6 replies
  • 6 kudos
Latest Reply
plondon
New Contributor II
  • 6 kudos

Had a similar issue. I can do 1-4 million rows in 1 minute via SSIS ETL on SQL server. Table is 15 fields long. Looking at your code it seems you have many fields but nothing like 300-400 fields which can affect performance. You can check SQL Server ...

  • 6 kudos
5 More Replies
Labels