Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

singhanuj2803
by New Contributor III
  • 2149 Views
  • 1 replies
  • 1 kudos

Apache Spark SQL query to get organization hierarchy

I'm currently diving deep into Spark SQL and its capabilities, and I'm facing an interesting challenge. I'm eager to learn how to write CTE recursive queries in Spark SQL, but after thorough research, it seems that Spark doesn't natively support recu...

Latest Reply
Alberto_Umana
Databricks Employee
  • 1 kudos

Hi @singhanuj2803, It is correct that Spark SQL does not natively support recursive Common Table Expressions (CTEs). However, there are some workarounds and alternative methods you can use to achieve similar results.   Using DataFrame API with Loops:...
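A runnable sketch of the loop-based workaround described in the reply. The plain-Python helper shows the fix-point idea on a simple {employee: manager} dict, and `org_paths_spark` expresses the same loop as repeated DataFrame self-joins; the column names `emp`/`mgr` and the depth cap are illustrative assumptions, not from the original thread.

```python
def org_paths(manager_of):
    """Iteratively expand each employee's management chain until nothing
    changes (the plain-Python analogue of a recursive CTE)."""
    paths = {emp: [emp] for emp in manager_of}
    changed = True
    while changed:
        changed = False
        for emp, path in paths.items():
            boss = manager_of.get(path[-1])
            if boss is not None and boss not in path:  # guard against cycles
                path.append(boss)
                changed = True
    return paths


def org_paths_spark(spark, edges_df, max_depth=20):
    """Same fix-point loop expressed as repeated self-joins.
    edges_df has columns (emp, mgr); requires a running SparkSession."""
    from pyspark.sql import functions as F
    result = (edges_df.where(F.col("mgr").isNotNull())
              .select("emp", F.col("mgr").alias("ancestor"),
                      F.lit(1).alias("level")))
    frontier = result
    for level in range(2, max_depth + 1):
        frontier = (frontier.alias("f")
                    .join(edges_df.alias("e"),
                          F.col("f.ancestor") == F.col("e.emp"))
                    .select(F.col("f.emp").alias("emp"),
                            F.col("e.mgr").alias("ancestor"),
                            F.lit(level).alias("level")))
        if frontier.limit(1).count() == 0:  # no new ancestors: fix point
            break
        result = result.unionByName(frontier)
    return result
```

The `max_depth` cap bounds the loop in case the data contains a cycle, which a true recursive CTE would also need to guard against.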

singhanuj2803
by New Contributor III
  • 435 Views
  • 1 replies
  • 1 kudos

How to run stored procedure in Azure Database for PostgreSQL using Azure Databricks Notebook

We have a stored procedure available in Azure Database for PostgreSQL and we want to call it from an Azure Databricks notebook. We are attempting to run PostgreSQL stored procedures through Azure Databr...

Latest Reply
Alberto_Umana
Databricks Employee
  • 1 kudos

To execute a PostgreSQL stored procedure from an Azure Databricks notebook, you need to follow these steps. Required libraries: you need to install the psycopg2 library, which is a PostgreSQL adapter for Python. This can be done using the %pip install...
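A minimal sketch of those steps. The procedure name `refresh_sales` and the connection details are hypothetical placeholders; in practice the credentials would come from a Databricks secret scope rather than plain text.

```python
def build_call(proc_name, n_args):
    """Build a parameterised CALL statement for a PostgreSQL procedure."""
    placeholders = ", ".join(["%s"] * n_args)
    return f"CALL {proc_name}({placeholders})"


def run_procedure(host, db, user, password, proc_name, args=()):
    # Requires: %pip install psycopg2-binary
    import psycopg2
    conn = psycopg2.connect(host=host, dbname=db, user=user, password=password)
    try:
        with conn.cursor() as cur:
            cur.execute(build_call(proc_name, len(args)), args)
        conn.commit()  # persist the procedure's effects
    finally:
        conn.close()

# Example (hypothetical procedure and host):
# run_procedure("myserver.postgres.database.azure.com", "salesdb",
#               user, pwd, "refresh_sales", ("2024-01-01",))
```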

singhanuj2803
by New Contributor III
  • 1426 Views
  • 1 replies
  • 0 kudos

Resolved! How to execute SQL stored procedure in Azure Database for SQL Server using Azure Databricks Notebook

We have a stored procedure available in Azure Database for SQL Server and we want to call it from an Azure Databricks notebook. We are attempting to run SQL stored procedures through Azure Databricks no...

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @singhanuj2803, To execute a SQL stored procedure in Azure Databricks, you can follow these steps. Required libraries: you need to install the pyodbc library to connect to Azure SQL Database using ODBC. You can install it using the following comman...
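A sketch of the pyodbc variant, assuming a hypothetical procedure `dbo.usp_load` and that the Microsoft ODBC driver is installed on the cluster (via an init script or %sh apt install).

```python
def build_exec(proc_name, n_args):
    """Build a parameterised ODBC call escape for a SQL Server procedure."""
    placeholders = ", ".join(["?"] * n_args)
    return f"{{CALL {proc_name}({placeholders})}}"


def run_procedure(server, db, user, password, proc_name, args=()):
    # Requires: %pip install pyodbc (plus the msodbcsql18 driver on the cluster)
    import pyodbc
    conn = pyodbc.connect(
        "DRIVER={ODBC Driver 18 for SQL Server};"
        f"SERVER={server};DATABASE={db};UID={user};PWD={password}"
    )
    try:
        cur = conn.cursor()
        cur.execute(build_exec(proc_name, len(args)), args)
        conn.commit()
    finally:
        conn.close()
```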

h2p5cq8
by New Contributor III
  • 1996 Views
  • 3 replies
  • 3 kudos

Resolved! Deleting records from Delta table that are not in relational table

I have a Delta table that I keep in sync with a relational (SQL Server) table. The inserts and updates are easy but checking for records to delete is prohibitively slow. I am querying the relational table for all primary key values and any primary ke...

Latest Reply
hari-prasad
Valued Contributor II
  • 3 kudos

Let's understand the complexity behind this code when executed on a Delta table along with Spark: pks = spark.read.format("jdbc").option("query", "SELECT pk FROM sql_table_name").load() delta_table = spark.read.table(delta_table_name) r = target_table.f...
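One way to avoid collecting all keys to the driver is to stage the source keys as a view and issue a single set-based DELETE against the Delta table. The table and column names below are hypothetical, and the pattern assumes the primary key column is non-null (NOT IN misbehaves with NULLs).

```python
def deletes_sql(delta_table, pk_col, staging_view):
    """DELETE rows whose key no longer exists in the staged source keys."""
    return (f"DELETE FROM {delta_table} WHERE {pk_col} NOT IN "
            f"(SELECT {pk_col} FROM {staging_view})")


def sync_deletes(spark, jdbc_url, sql_table, delta_table, pk_col="pk"):
    # Pull only the key column from SQL Server, stage it, delete in one pass.
    pks = (spark.read.format("jdbc")
           .option("url", jdbc_url)
           .option("query", f"SELECT {pk_col} FROM {sql_table}")
           .load())
    pks.createOrReplaceTempView("source_pks")
    spark.sql(deletes_sql(delta_table, pk_col, "source_pks"))
```

An equivalent formulation is a left_anti join of the Delta table against the key DataFrame to find rows to delete; the single DELETE keeps the work inside the Delta transaction log.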

2 More Replies
filipniziol
by Esteemed Contributor
  • 1965 Views
  • 9 replies
  • 4 kudos

Resolved! Magic Commands (%sql) Not Working with Databricks Extension for VS Code

Hi Community, I've encountered an issue with the Databricks Extension for VS Code that seems to contradict the documentation. According to the Databricks documentation, the extension supports magic commands like %sql when used with Databricks Connect:...

Latest Reply
jack533
New Contributor III
  • 4 kudos

In reality, this has nothing to do with grpc_wait_for_shutdown_with_timeout. Although we haven't yet implemented a solution, we have an open issue for it, but it shouldn't stop SQL magic from working. Or is the "Interactive" tab where you encounter th...

8 More Replies
Ajay-Pandey
by Esteemed Contributor III
  • 1786 Views
  • 7 replies
  • 2 kudos

Databricks Job cluster for continuous run

Hi All, I have a situation where I want to run a job with a continuous trigger using a job cluster, but the cluster terminates and is re-created on every run within the continuous trigger. I just wanted to know if we have any option where I can use the same job cluste...

Latest Reply
Rishabh-Pandey
Esteemed Contributor
  • 2 kudos

@Ajay-Pandey, can't we achieve similar functionality with the help of cluster pools? Why don't you try cluster pools?

6 More Replies
Younevano
by New Contributor III
  • 3942 Views
  • 13 replies
  • 10 kudos

Suddenly can't find the option to upload files into Databricks Community Edition

Hi everyone, I am suddenly unable to find the option to upload my files into Databricks Community Edition today. Please find the same in the screenshot attached. Is anyone else also facing this issue?

Latest Reply
geraldhopkins
New Contributor II
  • 10 kudos

GChandra, did it get removed again? I just signed up for Community free edition, and I don't see an option to Upload files. Please advise.

12 More Replies
Gianfranco
by New Contributor II
  • 1308 Views
  • 3 replies
  • 1 kudos

Deleting Records from DLT Bronze and Silver Tables

I have a pipeline that generates two DLT streaming tables: a Bronze table and a Silver table. I need to delete specific records from both tables. I've read an article (https://www.databricks.com/blog/handling-right-be-forgotten-gdpr-and-ccpa-using-de...

Latest Reply
karthickrs
New Contributor II
  • 1 kudos

Remove records using the DELETE operation in both the Bronze and Silver tables. After doing each delete step, you can OPTIMIZE the table, which rewrites the Parquet files for that table behind the scenes to improve the data layout (read more about optimize h...
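One way the delete-then-optimize sequence could look, with hypothetical table names and an illustrative `user_id` column. Note two caveats from the GDPR blog the question links: streaming readers downstream of a table that receives deletes may need skipChangeCommits, and the deleted data is only physically removed from storage by a later VACUUM past the retention period.

```python
def gdpr_delete_statements(tables, id_col, ids):
    """Build a DELETE followed by an OPTIMIZE for each table."""
    id_list = ", ".join(repr(i) for i in ids)
    stmts = []
    for t in tables:
        stmts.append(f"DELETE FROM {t} WHERE {id_col} IN ({id_list})")
        stmts.append(f"OPTIMIZE {t}")  # rewrite files to improve layout
    return stmts

# On a cluster (hypothetical names):
# for s in gdpr_delete_statements(["bronze_events", "silver_events"],
#                                 "user_id", [42]):
#     spark.sql(s)
```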

2 More Replies
vannipart
by New Contributor III
  • 1719 Views
  • 2 replies
  • 0 kudos

Volumes unzip files

I have this shell snippet that I use to unzip files: %sh sudo apt-get update && sudo apt-get install -y p7zip-full. But when it comes to a new workspace, I get the error: sudo: a terminal is required to read the password; either use the -S option to read from standa...

Latest Reply
karthickrs
New Contributor II
  • 0 kudos

First, you can read the ZIP file in a binary format [ spark.read.format("binaryFile") ], then use the zipfile Python package to unzip and extract all the files from the zipped file and store them in a Volume.
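A sketch of that approach using only the standard-library zipfile module, so no apt packages or sudo are needed. The Volume paths in the comment are hypothetical.

```python
import io
import zipfile
from pathlib import Path


def extract_zip(zip_bytes, dest_dir):
    """Extract an in-memory ZIP archive into dest_dir (e.g. a /Volumes path)
    and return the extracted top-level file names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        zf.extractall(dest)
    return sorted(p.name for p in dest.iterdir())

# On Databricks, read the archive via Spark's binaryFile source first:
# raw = spark.read.format("binaryFile").load("/Volumes/cat/sch/vol/in.zip")
# extract_zip(raw.first()["content"], "/Volumes/cat/sch/vol/out/")
```

Since Volumes are FUSE-mounted, `zipfile.ZipFile("/Volumes/.../in.zip")` can also open the archive directly without going through Spark.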

1 More Replies
mmceld1
by New Contributor II
  • 332 Views
  • 1 replies
  • 1 kudos

Resolved! Does Autoloader Detect New Records in a Snowflake Table or Only Work With Files?

The only thing I can find with Autoloader is picking up new files; nothing about new records in an existing Snowflake table.

Latest Reply
Alberto_Umana
Databricks Employee
  • 1 kudos

Hi @mmceld1, Autoloader is for cloud storage files. You can achieve similar functionality by using Delta Lake and its capabilities for handling slowly changing dimensions (SCD Type 2) and change data capture (CDC).
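If the Snowflake data is mirrored into a Delta table, new records can be consumed incrementally with Delta's Change Data Feed rather than Autoloader. A sketch, assuming the table property has been enabled before changes are written:

```python
def enable_cdf_sql(table):
    """ALTER statement to turn on Delta Change Data Feed for a table."""
    return (f"ALTER TABLE {table} SET TBLPROPERTIES "
            "(delta.enableChangeDataFeed = true)")


def cdf_stream(spark, table):
    # Incrementally reads inserts/updates/deletes as rows carrying a
    # _change_type column; requires the property above to be set first.
    return (spark.readStream.format("delta")
            .option("readChangeFeed", "true")
            .table(table))
```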

lauraxyz
by Contributor
  • 395 Views
  • 1 replies
  • 1 kudos

Resolved! Programmatically edit notebook

I have a job to move a notebook from a Volume to the workspace, then execute it with dbutils.notebook.run(). Instead of directly running the notebook, I want to append some logic (i.e. save results to a certain table) at the end of the notebook; is there a su...

Latest Reply
Alberto_Umana
Databricks Employee
  • 1 kudos

Hi @lauraxyz, Currently, there is no built-in feature in Databricks that directly supports appending logic to a notebook before execution, so treating the notebook as a regular file and modifying its content is a practical solution.
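Treating the notebook as a file could look like this. Databricks Python notebook source uses `# COMMAND ----------` as the cell separator, so appending a new cell is a string operation; the save-results snippet below is a hypothetical example of the logic being appended.

```python
CELL_SEP = "\n# COMMAND ----------\n"


def append_cell(source_text, new_code):
    """Append a new cell to Databricks notebook source (.py export format)."""
    return source_text.rstrip("\n") + CELL_SEP + new_code + "\n"


def patch_notebook(path, new_code):
    """Rewrite the notebook file in place with the extra cell appended."""
    with open(path, "r+", encoding="utf-8") as f:
        patched = append_cell(f.read(), new_code)
        f.seek(0)
        f.write(patched)
        f.truncate()

# Hypothetical usage before dbutils.notebook.run():
# patch_notebook("/Workspace/Users/me/job_nb.py",
#                "df.write.mode('append').saveAsTable('results')")
```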

ossoul
by New Contributor
  • 1638 Views
  • 1 replies
  • 1 kudos

Not able to get spark application in Spark History server using cluster eventlogs

I'm encountering an issue with incomplete Spark event logs. When I run a local Spark History Server using the cluster logs, my application appears as "incomplete". Sometimes I also see a few queries listed as still running, even though the appl...

Latest Reply
VZLA
Databricks Employee
  • 1 kudos

Thanks for your question! I believe Databricks has its own SHS implementation, so it's not expected to work with the vanilla SHS. Regarding the queries marked as still running, we can also find this when there are event logs which were not properly c...

ashraf1395
by Honored Contributor
  • 1309 Views
  • 1 replies
  • 0 kudos

Schema issue while fetching data from oracle

I don't have the complete context of the issue, but here is what I know from a friend who is facing it: "I am fetching data from Oracle in Databricks using Python, but every time I do it the schema changes, so if the column is of type decimal f...

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

Thanks for your question! To address schema issues when fetching Oracle data in Databricks, use JDBC schema inference to define data types programmatically or batch-cast columns dynamically after loading. For performance, enable predicate pushdown and...
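One concrete way to pin the types is Spark's JDBC `customSchema` option, which overrides inference so repeated reads cannot drift. The column names below are hypothetical examples:

```python
def custom_schema(cols):
    """Render Spark's JDBC customSchema string from {column: spark_type}."""
    return ", ".join(f"{c} {t}" for c, t in cols.items())


def read_oracle(spark, jdbc_url, table, cols):
    # e.g. cols = {"AMOUNT": "DECIMAL(38,10)", "CUSTOMER_ID": "STRING"}
    return (spark.read.format("jdbc")
            .option("url", jdbc_url)
            .option("dbtable", table)
            .option("customSchema", custom_schema(cols))
            .load())
```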

chris_b
by New Contributor
  • 1300 Views
  • 1 replies
  • 0 kudos

Increase Stack Size for Python Subprocess

I need to increase the stack size (from the default of 16384) to run a subprocess that requires a larger stack size. I tried following this: https://community.databricks.com/t5/data-engineering/increase-stack-size-databricks/td-p/71492 And this: https:...

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

Thanks for your question! Are you referring to a Java stack size (-Xss) or a Python subprocess (ulimit -s)?
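For the Python subprocess case, one possible approach is to raise the stack soft limit only in the child via `preexec_fn`, which avoids changing the limit for the driver process. This is a sketch for Linux; the 64 MiB figure is an arbitrary example, and it assumes the hard limit allows it.

```python
import resource
import subprocess


def run_with_stack(cmd, stack_bytes=64 * 1024 * 1024):
    """Run cmd in a child process with a raised stack limit (Linux only)."""
    def raise_stack():
        # Runs in the child just before exec; the parent's limit is untouched.
        resource.setrlimit(resource.RLIMIT_STACK, (stack_bytes, stack_bytes))
    return subprocess.run(cmd, preexec_fn=raise_stack,
                          capture_output=True, text=True)
```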

upatint07
by New Contributor II
  • 1329 Views
  • 1 replies
  • 0 kudos

Facing Issue in "import dlt" using Databricks Runtime 14.3 LTS version

Facing issues while importing the dlt library in Databricks Runtime 14.3 LTS. Previously, while using Runtime 13.1, `import dlt` was working fine, but after updating the runtime version to 14.3 LTS it is giving me an error.

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

Thanks for your question! Unfortunately, this is actually a known limitation with Spark Connect clusters.

