Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
I have set up authentication using this page https://docs.databricks.com/sql/api/authentication.html and run curl -n -X GET https://<databricks-instance>.cloud.databricks.com/api/2.0/sql/history/queries to get the history of all SQL endpoint queries, but I...
Here's how to query with databricks-sdk-py (working code). I had a frustrating time doing it with vanilla Python + requests/urllib and couldn't figure it out.
import datetime
import os
from databricks.sdk import WorkspaceClient
from databricks.sdk.se...
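For readers attempting the plain-Python route mentioned above, here is a minimal sketch of how the same GET request could be assembled with the standard library only. It is not the poster's truncated SDK code. The helper name `build_history_request`, the host, and the token are hypothetical placeholders, and `max_results` is used on the assumption that the query history endpoint accepts it as a query parameter. The request is only built, not sent.

```python
import urllib.parse
import urllib.request


def build_history_request(host: str, token: str, max_results: int = 100) -> urllib.request.Request:
    """Build (but do not send) a GET request for the SQL query history endpoint."""
    # Assumption: max_results is passed as a URL query parameter.
    query = urllib.parse.urlencode({"max_results": max_results})
    url = f"https://{host}/api/2.0/sql/history/queries?{query}"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


# Placeholder values; substitute a real workspace host and personal access token.
req = build_history_request("example.cloud.databricks.com", "dapi-EXAMPLE", max_results=50)
print(req.full_url)
```

Sending it would then be a matter of passing `req` to `urllib.request.urlopen` and decoding the JSON body, which is left out here because it needs a live workspace.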
I was trying to read some Delta data from a Databricks (Hive metastore) SQL endpoint using PySpark, but while doing so I found that all the values of the table after fetching are the same as the column name. Even when I try to just show the data it giv...
Hi there @Sravan Burla Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you...
Hi, I'm wondering if this is the expected behavior when using last or last_value in a window function? I've written a query like this:
select
col1,
col2,
last_value(col2) over (partition by col1 order by col2) as column2_last
from values
...
For those stumbling across this: it seems LAST_VALUE behaves the same way as it does in SQL Server, where the default window does not, in most people's minds, have a proper row/range frame. You can adjust it with the below syntax. I understand l...
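The frame behavior described here can be reproduced outside Databricks. The sketch below uses Python's built-in sqlite3 module as a stand-in engine (an assumption: it needs SQLite 3.25 or newer for window functions, and it is not Databricks SQL). The table `t` and its rows are made up for illustration. The explicit frame clause is standard SQL and is not necessarily the exact syntax from the truncated reply.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 TEXT, col2 INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("a", 1), ("a", 2), ("a", 3), ("b", 10), ("b", 20)],
)

# Default frame with ORDER BY: the window ends at the current row,
# so last_value simply returns the current row's col2.
default_frame = conn.execute(
    """
    SELECT col1, col2,
           last_value(col2) OVER (PARTITION BY col1 ORDER BY col2) AS column2_last
    FROM t
    ORDER BY col1, col2
    """
).fetchall()

# Explicit frame covering the whole partition: last_value now returns
# the final col2 of each partition.
full_frame = conn.execute(
    """
    SELECT col1, col2,
           last_value(col2) OVER (
               PARTITION BY col1 ORDER BY col2
               ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
           ) AS column2_last
    FROM t
    ORDER BY col1, col2
    """
).fetchall()

print(default_frame)
print(full_frame)
```

With the default frame each row reports its own value. With the explicit frame every row in partition `a` reports 3 and every row in `b` reports 20.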
Reading table changes using a timestamp or version greater than the last table commit throws an error; this behavior can be changed using the flag timestampOutOfRange.enabled. My issue is that I use an SQL endpoint and I don't see any way of providing this spark f...
We have observed that an SQL endpoint has increased response times after a long time being idle. This endpoint is always running and does not terminate. Are there any checks/overheads due to being idle that could impact performance?
Hi @EDDatabricks EDDatabricks Hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, pleas...
I have two tables, EMPLOYEE and EMPLOYEE_ROLE. I'm trying to update a column with a value from another column. I'm using a SQL Server-style join, but I get an error - [parse_syntax_error] Syntax error at or near 'FROM' line 3. UPDATE C SET C.title = B.title FROM ...
Hi @Vijesh V, try using MERGE INTO to perform CDC between tables:
MERGE INTO target a
USING source b
ON {merge_condition}
WHEN MATCHED THEN {matched_action}
WHEN NOT MATCHED THEN {not_matched_action}
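As a rough illustration of how the template above might be filled in for the title update from the question, the sketch below builds a MERGE statement as a string. The helper `build_title_merge`, the join key `employee_id`, and the choice of EMPLOYEE as the target are all hypothetical, since the original UPDATE is truncated and does not show them. Only the WHEN MATCHED branch is used because the goal is an update, not an insert.

```python
def build_title_merge(target: str, source: str, key: str) -> str:
    """Return a MERGE statement that copies `title` from source to target.

    Identifiers are interpolated directly, so pass trusted names only.
    """
    return (
        f"MERGE INTO {target} AS c\n"
        f"USING {source} AS b\n"
        f"ON c.{key} = b.{key}\n"
        "WHEN MATCHED THEN UPDATE SET c.title = b.title"
    )


# Hypothetical table roles and join key, for illustration only.
statement = build_title_merge("EMPLOYEE", "EMPLOYEE_ROLE", "employee_id")
print(statement)
```

The resulting text can be pasted into the SQL editor or passed to whatever client runs statements against the endpoint.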
Hi, I am not able to create a SQL endpoint; I get the error below. I have selected cluster size 2X-Small on the Azure platform: Clusters are failing to launch. Cluster launch will be retried.
Details for the latest failure: Error: Error code: PublicIPCountLi...
Hey there @Devashish Raverkar Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear f...
Hi Community, let's take a scenario where data from S3 is read to create Delta tables that are then stored on DBFS. To query these Delta tables we used the mysql endpoint, from which all the Delta tables are visible, but we need to control which all ...
Hey @Athlestan Jain Just checking in. Do you think you were able to find a solution to your problem from the above answers? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? Thank you!
I want to create an external function using CREATE FUNCTION (External) and expose it to users of my SQL endpoint. Although this works from a SQL notebook, if I try to use the function from a SQL endpoint, I get "User defined expression is not supporte...
It is a separate runtime https://docs.databricks.com/sql/release-notes/index.html#channels so it seems that it is not yet supported. There is CREATE FUNCTION documentation, but it seems that it supports only SQL syntax https://docs.databricks.com/sql...
I am trying to run a command to retrieve change data from a SQL endpoint. It throws the error below: "The input query contains unsupported data source(s). Only csv, json, avro, delta, parquet, orc, text data sources are supported on Databricks SQL." But th...
I'm trying to connect to Databricks using pyodbc and I'm running into an issue with struct columns. As far as I understand, struct columns and array columns are not supported by pyodbc, but they are converted to JSON. However, when there are nested c...
@Derk Crezee - I learned something today. Apparently ODBC does not convert to JSON. There is no defined spec on how to return complex types, in fact that was added only in SQL 2016. That's exactly what you are running into! End of history lesson Her...
Using Azure Databricks, I have set up a SQL endpoint with connection details that match the global init script. I am able to browse tables from a regular cluster in the Data Engineering module, but I get the error below when trying a query using the SQL endpoint...
@Prabakar Ammeappin @Kaniz Fatma Also I found out that after delta table is created in external metastore (and the table data resides in ADLS) then in the sql end point settings I do not need to provide ADLS connection details. I only provided...
I have created an external table using Spark via the command below (using Data Science & Engineering):
df.write.mode("overwrite").format("parquet").saveAsTable(name=f'{db_name}.{table_name}', path="dbfs:/reports/testing")
I have tried to delete a row based on...
Hi @karthick J, can you try to delete the row and execute your command in a non-high-concurrency cluster? The reason I'm asking is that we first need to isolate the error message and understand why it is happening to be able to find the best ...