Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Support of running multiple cells at a time in databricks notebookHi all,Now databricks notebook supports parallel run of commands in a single notebook that will help run ad hoc queries simultaneously without creating a separate notebook.Once you run...
Hi Team,I am observing that the functionality is not working as expected in the Trial workspace of Databricks. Is there a setting that needs to be enabled to allow independent SQL cells in a Databricks notebook to run in parallel, while dependent cel...
I have a field stored as a string in the format "12/30/2022 10:30:00 AM"If I use the function TO_DATE, I only get the date part... I want the full date and time.If I use the function TO_TIMESTAMP, I get the date and time, but it's assumed to be UTC, ...
I am new to Databricks. Please excuse my ignorance. My requirement is to convert the SQL query below into Databricks SQL. The query comes from EventLog table and the output of the query goes into EventSummaryThese queries can be found hereCREATE TABL...
you may explore the tool and services from Travinto Technologies . They have very good tools. We had explored their tool for our code coversion from Informatica, Datastage and abi initio to DATABRICKS , pyspark. Also we used for SQL queries, stored ...
Hi, there! I'm trying to build up my first dashboard based on Dataabricks SQL. As far as I can see if you define a query parameter you can't skip it further. I'm looking for any option where I can make my parameter optional. For instance, I have a ta...
I am working on Databricks version 10.4 premium cluster and while importing sql from databricks module I am getting below error. cannot import name 'sql' from 'databricks' (/databricks/python/lib/python3.8/site-packages/databricks/__init__.py).Trying...
if you ever received this kind of error after installing the correct Python package then try running the below command. dbutils.library.restartPython()
Currently, we are migrating from hive metastore to UC. We have several dashboards and a huge number of queries whose catalogs have been set to hive_metastore and using <db>.<table> access pattern.I'm just wondering if there's a way to switch catalogs...
working with delta files spark structure streaming , what is the maximum default chunk size in each batch?How do identify this type of spark configuration in databricks?#[Databricks SQL] #[Spark streaming] #[Spark structured streaming] #Spark
Running this code in databricks SQL works great:SET USE_CACHED_RESULT = FALSE;
-- Result:
-- key value
-- USE_CACHED_RESULT FALSEIf I add an inline comment, however, I get a syntax error:SET USE_CACHED_RESUL...
Hi, I'm getting the same error when trying to execute statement through API "statement": "SET `USE_CACHED_RESULT` = FALSE; SELECT COUNT(*) FROM TABLE" Every combination fail "status": {
"state": "FAILED",
"error": {
"e...
It seems that due to how Databricks processes SQL cells, it's impossible to escape the $ when it comes to a column name.I would expect the following to work:%sql
SELECT 'hi' `$id`The backticks ought to escape everything. And indeed that's exactly wha...
I know how to do it with spark, and read/write tables (like https://docs.microsoft.com/en-gb/azure/databricks/data/data-sources/sql-databases#python-example )But this time, I need to only update a field of a specific row in a table. I do not think I ...
thanks for the link.I am maybe wrong, but they describe how to connect with spark. They do not provide a connection engine that we could use directly (like with pyodbc) or an engine that we could use in pandas, for example.
How to use a SQL IDE with Databricks SQLDatabricks provides SQL IDE support using DataGrip and DBeaver with Databricks SQL.Let us know in the comments if you've used DataDrip or DBeaver with Databricks! Let us know if there are any other SQL IDEs you...
I want to pass yesterday date (In the example 20230115*.csv) in the csv file. Don't know how to create parameter and use it here.CREATE OR REPLACE TEMPORARY VIEW abc_delivery_logUSING CSVOPTIONS ( header="true", delimiter=",", inferSchema="true", pat...
@Retired_mod @sp1 @Chaitanya_Raju @daniel_sahal Hi Everyone,I need the same scenario on SQL code, because my DBR cluster not allowed me to run python codeError: Unsupported cell during execution. SQL warehouses only support executing SQL cells.I appr...
We are trying to improve our overall runtime by running queries in parallel using either multiprocessing or threads. What I am seeing though is that when the function that runs this code is run on a separate process it doesnt return a dataFrame with...
Thanks for the links mentioned above. But both of them uses raw python to achieve parallelism. Does this mean Spark (read PySpark) does exactly provisions for parallel execution of functions or even notebooks ? We used a wrapper notebook with ThreadP...
Rename and drop columns with Delta Lake column mapping. Hi all,Now databricks started supporting column rename and drop.Column mapping requires the following Delta protocols:Reader version 2 or above.Writer version 5 or above.Blog URL##Available in D...