Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

jorperort
by Contributor
  • 4872 Views
  • 3 replies
  • 6 kudos

Resolved! Help with Integration Testing for SQL Notebooks in Databricks

Hi everyone, I’m looking for the best way to implement integration tests for SQL notebooks in an environment that uses Unity Catalog and workflows to execute these notebooks. For unit tests on SQL functions, I’ve reviewed the https://docs.databricks.co...

Latest Reply
filipniziol
Esteemed Contributor
  • 6 kudos

Hi @jorperort, I see the question is already answered, but your question motivated me to write an article on Medium and to create a sample repo with an integration test written for a SQL notebook. I hope it will be useful for you: https://filip...

2 More Replies
EssamHisham
by New Contributor II
  • 1350 Views
  • 2 replies
  • 3 kudos

Lakehouse Fundamentals Course

I encountered an issue while attempting to access the Lakehouse Fundamentals badge quiz. Every time I try to access the quiz, I receive the following error message: "Access denied. You do not have permission to access this page. Please contact...

Latest Reply
Walter_C
Databricks Employee
  • 3 kudos

If this does not help, you can reach out to training-ops@databricks.com.

1 More Replies
jorperort
by Contributor
  • 1883 Views
  • 2 replies
  • 2 kudos

Resolved! WAP pattern with Unity Catalog

Good afternoon, I am looking for documentation on implementing the WAP pattern using Unity Catalog, workflows, SQL notebooks, and any other services necessary for this pattern. Could you share information on how to approach the problem with documentat...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 2 kudos

Hi @jorperort, apart from the nice step-by-step instructions that @VZLA provided, you can also take a look at a short presentation of the WAP pattern on the official Databricks YouTube channel: https://youtu.be/4K3zAmUgViE?t=492

1 More Replies
alwaysmoredata
by New Contributor II
  • 2488 Views
  • 8 replies
  • 1 kudos

Is it possible to load data only using Databricks SDK?

Is it possible to load data using only the Databricks SDK? I have a custom library that has to load data into a table, and I know about other features like Auto Loader, COPY INTO, and notebooks with Spark DataFrames... but I wonder if it is possible to load data dir...

Latest Reply
Walter_C
Databricks Employee
  • 1 kudos

Got it. The reason the cluster matters is that on a shared access cluster, access to the local file system is more restricted than on a single-user cluster due to security constraints. As you are using serverless, it acts as a shared cluster; in this case you...

7 More Replies
jeremy98
by Honored Contributor
  • 918 Views
  • 2 replies
  • 0 kudos

Submit new records from gold layer to postgres db

Hi community, I want to ask what you consider the best practice for loading data from the gold layer into a Postgres DB that is used to provide "real-time" data to a UI. Thanks for any help!

Latest Reply
jeremy98
Honored Contributor
  • 0 kudos

Hello @hari-prasad, thank you for your answer. Considering that we are using DLT pipelines, are they a good choice in this case? I ask because with DLT pipelines I don't see the metadata for these materialized tables, in this case the CDF statements.

1 More Replies
mkEngineer
by New Contributor III
  • 6535 Views
  • 1 reply
  • 0 kudos

Refresh options on PBI from Databricks workflow using Azure Databricks

Hi! I have a workflow that includes my medallion architecture and DLT. Currently, I have a separate notebook for refreshing my Power BI semantic model, which works based on the method described in Refresh a PowerBI dataset from Azure Databricks. Howe...

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @mkEngineer, have you reviewed this documentation: https://learn.microsoft.com/en-us/azure/databricks/partners/bi/power-bi ? Also, I don't think serverless compute for notebooks will work for your connection with Power BI. You might need to set up a Se...

Anonymous
by Not applicable
  • 59137 Views
  • 7 replies
  • 13 kudos

How to connect and extract data from sharepoint using Databricks (AWS) ?

We are using Databricks (on AWS). We need to connect to SharePoint and extract and load data into a Databricks Delta table. Any possible solution for this?

Latest Reply
yliu
New Contributor III
  • 13 kudos

Wondering the same. Can we use the SharePoint REST API to download the file, save it to DBFS or an external location, and read it?
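That approach can work in principle. A minimal sketch of it below; the site URL and file path are hypothetical, and the bearer token would have to come from your own Azure AD app registration, so the actual download is shown commented out:

```python
def file_download_url(site_url, server_relative_path):
    """Build the SharePoint REST endpoint that returns a file's raw bytes."""
    return (f"{site_url}/_api/web/"
            f"GetFileByServerRelativeUrl('{server_relative_path}')/$value")

url = file_download_url("https://contoso.sharepoint.com/sites/data",   # hypothetical site
                        "/sites/data/Shared Documents/report.csv")     # hypothetical file

# Fetching needs a live site and a valid token, so it is not executed here:
# import urllib.request
# req = urllib.request.Request(url, headers={"Authorization": "Bearer <token>"})
# with urllib.request.urlopen(req) as resp, open("/dbfs/tmp/report.csv", "wb") as f:
#     f.write(resp.read())
```

Once the file is under /dbfs (or an external location), spark.read can pick it up as usual.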

6 More Replies
Phani1
by Databricks MVP
  • 947 Views
  • 1 reply
  • 1 kudos

Accessing data cross-cloud

Hi All, we have a use case where we need to connect AWS Databricks to a GCP storage bucket to access the data. In Databricks we're trying to use external locations and storage credentials, but it seems like AWS Databricks only supports AWS storage b...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @Phani1, you can use Delta Sharing. That way you can create a share that will allow you to access data stored in GCS, governed by the UC permissions model. See "What is Delta Sharing? | Databricks on AWS". You can also use the legacy approach, but it doesn'...

svm_varma
by New Contributor II
  • 3753 Views
  • 1 reply
  • 2 kudos

Resolved! Azure Databricks quota restrictions on compute in Azure for students subscription

Hi All, regarding creating clusters in Databricks, I'm getting a quota error. I have tried to increase quotas in the region where the resource is hosted but am still unable to raise the limit. Is there any workaround, or could you help me select the right cluster ...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 2 kudos

Hi @svm_varma, you can try to create a Standard_DS3_v2 cluster. It has 4 cores, and your current subscription limit for the given region is 6 cores. The one you're trying to create needs 8 cores, hence the quota-exceeded exception. You can also...

vijaypodili
by New Contributor III
  • 2400 Views
  • 9 replies
  • 0 kudos

Databricks job taking a long time to load 2.3 GB of data from Blob to a SQL Server table

df_CorpBond = spark.read.format("parquet").option("header", "true") \
    .load(f"/mnt/{container_name}/raw_data/dsl.corporate.parquet")
df_CorpBond.repartition(20).write \
    .format("jdbc") \
    .option("url", url_connector) \
    .option("dbtable", "MarkIt...
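For reference, Spark's documented JDBC write options (`batchsize`, `numPartitions`) are the usual first knobs for a slow JDBC load. A hedged sketch of how they might be applied to a write like the one above; the values are illustrative starting points, not tested recommendations, and the target table name is a placeholder:

```python
# Spark JDBC writer options that commonly affect throughput on large writes.
JDBC_WRITE_OPTIONS = {
    "batchsize": "10000",     # rows per JDBC batch insert (Spark's default is 1000)
    "numPartitions": "8",     # parallel connections; match what the target DB can absorb
}

def apply_jdbc_options(writer, url, table, options=JDBC_WRITE_OPTIONS):
    """Chain the JDBC options onto a DataFrameWriter-like object and return it."""
    writer = writer.format("jdbc").option("url", url).option("dbtable", table)
    for key, value in options.items():
        writer = writer.option(key, value)
    return writer

# Usage on the real DataFrame (not executed here; "<target table>" is a placeholder):
# apply_jdbc_options(df_CorpBond.repartition(8).write,
#                    url_connector, "<target table>").mode("append").save()
```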

Data Engineering
databricks
performance
Latest Reply
vijaypodili
New Contributor III
  • 0 kudos

Hi @RiyazAliM, this is my DAG diagram. The file size is 3.5 GB, and in the future we need to load 14 GB as well.

8 More Replies
singhanuj2803
by Contributor
  • 5513 Views
  • 1 reply
  • 1 kudos

Apache Spark SQL query to get organization hierarchy

I'm currently diving deep into Spark SQL and its capabilities, and I'm facing an interesting challenge. I'm eager to learn how to write recursive CTE queries in Spark SQL, but after thorough research, it seems that Spark doesn't natively support recu...

Latest Reply
Alberto_Umana
Databricks Employee
  • 1 kudos

Hi @singhanuj2803, It is correct that Spark SQL does not natively support recursive Common Table Expressions (CTEs). However, there are some workarounds and alternative methods you can use to achieve similar results.   Using DataFrame API with Loops:...
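The loop-based workaround can be sketched with plain Python lists first; the column names (`employee`, `manager`) and the tiny org chart are hypothetical, and in PySpark the same level-by-level logic would union DataFrames inside the loop instead of extending a list:

```python
def expand_hierarchy(edges, root):
    """Emulate a recursive CTE: edges is a list of (employee, manager) pairs;
    returns (employee, level) pairs walking down from root."""
    levels = [(root, 0)]    # anchor member of the would-be recursive CTE
    frontier = [root]
    depth = 0
    while frontier:         # recursive member: join children onto the current frontier
        depth += 1
        children = [e for (e, m) in edges if m in frontier]
        levels.extend((c, depth) for c in children)
        frontier = children
    return levels

edges = [("b", "a"), ("c", "a"), ("d", "b")]
print(expand_hierarchy(edges, "a"))  # → [('a', 0), ('b', 1), ('c', 1), ('d', 2)]
```

With DataFrames, each iteration would be `frontier.join(edges_df, ...)` followed by a union into the accumulated result, stopping when the frontier is empty.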

singhanuj2803
by Contributor
  • 1252 Views
  • 1 reply
  • 1 kudos

How to run stored procedure in Azure Database for PostgreSQL using Azure Databricks Notebook

We have a stored procedure available in Azure Database for PostgreSQL, and we want to call, run, or execute the PostgreSQL stored procedures in Azure Databricks through a notebook. We are attempting to run PostgreSQL stored procedures through Azure Databr...

Latest Reply
Alberto_Umana
Databricks Employee
  • 1 kudos

To execute a PostgreSQL stored procedure from an Azure Databricks notebook, you need to follow these steps: Required libraries: you need to install the psycopg2 library, which is a PostgreSQL adapter for Python. This can be done using the %pip install...
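A minimal sketch of those steps; the host, database, credentials, and procedure name are hypothetical, and since the call needs psycopg2 (`%pip install psycopg2-binary`) plus a live server, the connection part is shown commented out:

```python
def build_call_sql(proc_name, n_args):
    """Build a parameterized CALL statement for a PostgreSQL stored procedure."""
    placeholders = ", ".join(["%s"] * n_args)
    return f"CALL {proc_name}({placeholders})"

sql = build_call_sql("my_stored_proc", 2)   # → "CALL my_stored_proc(%s, %s)"

# Not executed here; requires psycopg2 and network access to the server:
# import psycopg2
# conn = psycopg2.connect(host="myserver.postgres.database.azure.com",  # hypothetical
#                         dbname="mydb", user="myuser", password="***",
#                         sslmode="require")
# with conn, conn.cursor() as cur:
#     cur.execute(sql, ("arg1", "arg2"))   # parameters bound server-side
# conn.close()
```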

singhanuj2803
by Contributor
  • 4557 Views
  • 1 reply
  • 0 kudos

Resolved! How to execute SQL stored procedure in Azure Database for SQL Server using Azure Databricks Notebook

We have a stored procedure available in Azure Database for SQL Server, and we want to call, run, or execute the SQL Server stored procedures in Azure Databricks through a notebook. We are attempting to run SQL stored procedures through Azure Databricks no...

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @singhanuj2803, to execute a SQL stored procedure in Azure Databricks, you can follow these steps: Required libraries: you need to install the pyodbc library to connect to Azure SQL Database using ODBC. You can install it using the following comman...
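A minimal sketch of those steps for SQL Server; the server, database, credentials, and procedure name are hypothetical, and since the call needs pyodbc (`%pip install pyodbc`) plus an ODBC driver and a live server, the connection part is shown commented out:

```python
def build_exec_sql(proc_name, n_args):
    """Build a parameterized EXEC statement for a SQL Server stored procedure."""
    placeholders = ", ".join(["?"] * n_args)
    return f"EXEC {proc_name} {placeholders}".rstrip()

sql = build_exec_sql("dbo.my_stored_proc", 2)   # → "EXEC dbo.my_stored_proc ?, ?"

# Not executed here; requires pyodbc, the ODBC driver, and network access:
# import pyodbc
# conn = pyodbc.connect(
#     "DRIVER={ODBC Driver 18 for SQL Server};"
#     "SERVER=myserver.database.windows.net;DATABASE=mydb;"   # hypothetical
#     "UID=myuser;PWD=***"
# )
# cursor = conn.cursor()
# cursor.execute(sql, ("arg1", "arg2"))   # parameters bound via ? placeholders
# conn.commit()
# conn.close()
```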

h2p5cq8
by New Contributor III
  • 5857 Views
  • 3 replies
  • 3 kudos

Resolved! Deleting records from Delta table that are not in relational table

I have a Delta table that I keep in sync with a relational (SQL Server) table. The inserts and updates are easy, but checking for records to delete is prohibitively slow. I am querying the relational table for all primary key values, and any primary ke...

Latest Reply
hari-prasad
Valued Contributor II
  • 3 kudos

Let's understand the complexity behind this code when executed on a Delta table with Spark:
pks = spark.read.format("jdbc").option("query", "SELECT pk FROM sql_table_name").load()
delta_table = spark.read.table(delta_table_name)
r = target_table.f...
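One commonly suggested alternative, sketched here under the assumption of a recent Databricks Runtime (WHEN NOT MATCHED BY SOURCE requires DBR 12.1+): stage the source primary keys in a temp view and let a single MERGE delete the strays, instead of collecting keys to the driver. The table and view names are placeholders:

```python
def build_sync_delete_merge(delta_table, pk_view, pk_col="pk"):
    """MERGE that deletes Delta rows whose key no longer exists in the source view.

    WHEN NOT MATCHED BY SOURCE needs Databricks Runtime 12.1 or later.
    """
    return (
        f"MERGE INTO {delta_table} AS t "
        f"USING {pk_view} AS s ON t.{pk_col} = s.{pk_col} "
        f"WHEN NOT MATCHED BY SOURCE THEN DELETE"
    )

# In the notebook (not executed here), with `pks` being the JDBC DataFrame of keys:
# pks.createOrReplaceTempView("source_pks")
# spark.sql(build_sync_delete_merge("my_delta_table", "source_pks"))
```

This keeps the anti-join on the cluster and lets Delta skip untouched files, which is usually much faster than a driver-side NOT IN list.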

2 More Replies