Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Due to dependencies, if one of our cells errors then we want the notebook to stop executing.We've noticed some odd behaviour when executing notebooks depending on if "Run all cells in this notebook" is selected from the header versus "Run All Below"....
Has this been implemented? I have created a job using notebook. My notebook has 6 cells and if the code in first cell fails it should not run the rest of the cells
I have a dataframe that inexplicably takes forever to write to an MS SQL Server even though other dataframes, even much larger ones, write nearly instantly. I'm using this code:my_dataframe.write.format("jdbc")
.option("url",sqlsUrl)
.optio...
Had a similar issue. I can do 1-4 million rows in 1 minute via SSIS ETL on SQL server. Table is 15 fields long. Looking at your code it seems you have many fields but nothing like 300-400 fields which can affect performance. You can check SQL Server ...
I have sql warehouse endpoints that work fine when querying from applications such as Tableau, but just running the included sample query against a running endpoint from the Query Editor from the workspace is returning "Unable to upload to DBFS Query...
Hi @Marvin Ginns​ Thank you for posting your question in our community! We are happy to assist you.To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers ...
This SQL statement works fine by itself SELECT COUNT(1) FROM tablea f INNER JOIN tableb t ON lower(f.col1) = t.col1but if I want to use it inside a function:​CREATE OR REPLACE FUNCTION fn_abc(var1 ...
Hello @Lily99 , I hope this message finds you well.
Could you please try the code below and let me know the results?
CREATE OR REPLACE FUNCTION fn_abc(var1 STRING, var2 STRING)
RETURNS DOUBLECOMMENT 'test function'RETURN SELECT CASE WHEN EXISTS...
I'm utilizing SQL to perform aggregation operations within a gold layer of a DLT pipeline. However, I'm encountering an error when running the pipeline while attempting to return a data frame using spark.sql.Could anyone please assist me with the SQL...
Hello @Yash_542965 , I hope this message finds you well.
Could you please share a sample of code you are using so that we can check it further?
Best regards,Lucas Rocha
HelloAfter creating an Databricks Workspace in Azure with No Public IP and VNET Injection, I'm unable to use DBSQL Serverless because the option to enable it in SQL warehouse Settings is missing. ​Is it by design? Is it a limitation when using Privat...
I want to add a column to an existing delta table with a timestamp for when the data was inserted. I know I can do this by including current_timestamp with my SQL statement that inserts into the table. Is it possible to add a column to an existing de...
Im looking at using Databricks internally for some Data Science projects. I am however very confused to how the pricing works and would like to obviously avoid high spending right now. Internal documentation and within Databricks All-Purpose Compute...
Hello,I was able to get a very precise cost of Azure Databricks Clusters and Computers jobs, using the Microsoft API and Databricks APIThen I wrote a simple tool to extract and manipulate the API results and generate detailed cost reports that can be...
I am trying to send email alerts to a non databricks user. I am using Alerts feature available in SQL. Can someone help me with the steps.Do I first need to first add Notification Destination through Admin settings and then use this newly added desti...
Hi @Mitali Lad​ Thank you for posting your question in our community! We are happy to assist you.To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers yo...
I am trying to connect to SQL through JDBC from databricks notebook. (Below is my notebook command)val df = spark.read.jdbc(jdbcUrl, "[MyTableName]", connectionProperties)
println(df.schema)When I execute this command, with DBR 10.4 LTS it works fin...
Try to add the following parameters to your SQL connection string. It fixed my problem for 13.X and 12.X;trustServerCertificate=true;hostNameInCertificate=*.database.windows.net;
I want to pass yesterday date (In the example 20230115*.csv) in the csv file. Don't know how to create parameter and use it here.CREATE OR REPLACE TEMPORARY VIEW abc_delivery_logUSING CSVOPTIONS ( header="true", delimiter=",", inferSchema="true", pat...
@Kaniz_Fatma @sp1 @Chaitanya_Raju @daniel_sahal Hi Everyone,I need the same scenario on SQL code, because my DBR cluster not allowed me to run python codeError: Unsupported cell during execution. SQL warehouses only support executing SQL cells.I appr...
Hi,I try to use java sql. i can see that the query on databricks is executed properly.However, on my client i get exception (see below).versions:jdk: jdk-20.0.1 (tryed also with version 16, same results)https://www.oracle.com/il-en/java/technologies/...
Hey,I have a repo of notebooks and SQL files, the typical way is to update/create notebooks in the repo then push it and CICD pipeline deploys the notebooks to the Shared workspace.the issue is that I can access the SQL files in the Repo but can not ...
It would be possible to activate a custom extensions like Sedona (https://sedona.apache.org/download/databricks/ ) in SQL Endopoints?Example error:java.lang.ClassNotFoundException: org.apache.spark.sql.sedona_sql.UDT.GeometryUDT at org.apache.spark....
Hi,I have a date column in a delta table called ADate. I need this in the format YYYYMMDD.In TSQL this is easy. However, I can't seem to be able to do this without splitting the YEAR, MONTH and Day and concatenating them together.Any ideas?
I'll share I'm having a variant of the same issue. I have a varchar field in the form YYYYMMDD which I'm trying to join to another varchar field from another table in the form of MM/DD/YYYY. Does anyone know of a way to do this in SPARK SQL without s...