Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

jorperort
by Contributor
  • 4872 Views
  • 3 replies
  • 6 kudos

Resolved! Help with Integration Testing for SQL Notebooks in Databricks

Hi everyone, I’m looking for the best way to implement integration tests for SQL notebooks in an environment that uses Unity Catalog and workflows to execute these notebooks. For unit tests on SQL functions, I’ve reviewed the https://docs.databricks.co...

Latest Reply
filipniziol
Esteemed Contributor
  • 6 kudos

Hi @jorperort, I see the question is already answered, but your question motivated me to write an article on Medium and to create a sample repo with an integration test written for a SQL notebook. I hope it will be useful for you: https://filip...

2 More Replies
EssamHisham
by New Contributor II
  • 1350 Views
  • 2 replies
  • 3 kudos

Lakehouse Fundamentals Course

I encountered an issue while attempting to access the Lakehouse Fundamentals badge quiz. Every time I try to access the quiz, I receive the following error message: "Access denied. You do not have permission to access this page. Please contact...

Latest Reply
Walter_C
Databricks Employee
  • 3 kudos

If this does not help, you can reach out to training-ops@databricks.com.

1 More Replies
jorperort
by Contributor
  • 1883 Views
  • 2 replies
  • 2 kudos

Resolved! WAP pattern with Unity Catalog

Good afternoon, I am looking for documentation on implementing the WAP pattern using Unity Catalog, workflows, SQL notebooks, and any other services necessary for this pattern. Could you share information on how to approach the problem with documentat...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 2 kudos

Hi @jorperort, apart from the nice step-by-step instructions that @VZLA provided, you can also take a look at a short presentation of the WAP pattern on the official Databricks YouTube channel: https://youtu.be/4K3zAmUgViE?t=492

1 More Replies
alwaysmoredata
by New Contributor II
  • 2488 Views
  • 8 replies
  • 1 kudos

Is it possible to load data only using Databricks SDK?

Is it possible to load data using only the Databricks SDK? I have a custom library that has to load data into a table, and I know about other features like Auto Loader, COPY INTO, and notebooks with Spark DataFrames... but I wonder if it is possible to load data dir...

Latest Reply
Walter_C
Databricks Employee
  • 1 kudos

Got it. The reason the cluster matters is that on a shared access cluster, access to the local file system is more restricted than on a single-user cluster due to security constraints. As you are using serverless, it acts as a shared cluster; in this case you...

7 More Replies
jeremy98
by Honored Contributor
  • 918 Views
  • 2 replies
  • 0 kudos

Submit new records from gold layer to postgres db

Hi community, I want to ask what you consider the best practice for loading data from the gold layer into a Postgres DB that is used to provide "real-time" data to a UI. Thanks for any help!

Latest Reply
jeremy98
Honored Contributor
  • 0 kudos

Hello @hari-prasad, thank you for your answer. Considering that we are using DLT pipelines, are they a good choice in this case? I ask because with DLT pipelines I don't see the metadata for these materialized tables, in this case the CDF statements.

1 More Replies
mkEngineer
by New Contributor III
  • 6535 Views
  • 1 reply
  • 0 kudos

Refresh options on PBI from Databricks workflow using Azure Databricks

Hi! I have a workflow that includes my medallion architecture and DLT. Currently, I have a separate notebook for refreshing my Power BI semantic model, which works based on the method described in Refresh a PowerBI dataset from Azure Databricks. Howe...

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @mkEngineer, have you reviewed this documentation: https://learn.microsoft.com/en-us/azure/databricks/partners/bi/power-bi ? Also, I don't think serverless compute for notebooks will work for your connection with Power BI. You might need to set up a Se...

Anonymous
by Not applicable
  • 59137 Views
  • 7 replies
  • 13 kudos

How to connect and extract data from sharepoint using Databricks (AWS) ?

We are using Databricks (on AWS). We need to connect to SharePoint and extract and load data into a Databricks Delta table. Any possible solution for this?

Latest Reply
yliu
New Contributor III
  • 13 kudos

Wondering the same. Can we use the SharePoint REST API to download the file, save it to DBFS or an external location, and read it?
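That approach can work in principle. A minimal sketch of it below; the site URL and file path are hypothetical, and the bearer token would have to come from your own Azure AD app registration, so the actual download is shown commented out:

```python
def file_download_url(site_url, server_relative_path):
    """Build the SharePoint REST endpoint that returns a file's raw bytes."""
    return (f"{site_url}/_api/web/"
            f"GetFileByServerRelativeUrl('{server_relative_path}')/$value")

url = file_download_url("https://contoso.sharepoint.com/sites/data",   # hypothetical site
                        "/sites/data/Shared Documents/report.csv")     # hypothetical file

# Fetching needs a live site and a valid token, so it is not executed here:
# import urllib.request
# req = urllib.request.Request(url, headers={"Authorization": "Bearer <token>"})
# with urllib.request.urlopen(req) as resp, open("/dbfs/tmp/report.csv", "wb") as f:
#     f.write(resp.read())
```

Once the file is under /dbfs (or an external location), spark.read can pick it up as usual.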

6 More Replies
Phani1
by Databricks MVP
  • 947 Views
  • 1 reply
  • 1 kudos

Accessing data cross-cloud

Hi All, we have a use case where we need to connect AWS Databricks to a GCP storage bucket to access the data. In Databricks we're trying to use external locations and storage credentials, but it seems like AWS Databricks only supports AWS storage b...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @Phani1, you can use Delta Sharing. That way you can create a share that will allow you to access data stored in GCS, governed by the UC permissions model. See "What is Delta Sharing? | Databricks on AWS". You can also use the legacy approach, but it doesn'...

svm_varma
by New Contributor II
  • 3753 Views
  • 1 reply
  • 2 kudos

Resolved! Azure Databricks quota restrictions on compute in Azure for students subscription

Hi All, regarding creating clusters in Databricks, I'm getting a quota error. I have tried to increase quotas in the region where the resource is hosted but am still unable to raise the limit. Is there any workaround, or could you help me select the right cluster ...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 2 kudos

Hi @svm_varma, you can try to create a Standard_DS3_v2 cluster. It has 4 cores, and your current subscription limit for the given region is 6 cores. The one you're trying to create needs 8 cores, hence the quota-exceeded exception. You can also...

vijaypodili
by New Contributor III
  • 2400 Views
  • 9 replies
  • 0 kudos

Databricks job taking a long time to load 2.3 GB of data from Blob to a SQL Server table

df_CorpBond = spark.read.format("parquet").option("header", "true") \
    .load(f"/mnt/{container_name}/raw_data/dsl.corporate.parquet")
df_CorpBond.repartition(20).write \
    .format("jdbc") \
    .option("url", url_connector) \
    .option("dbtable", "MarkIt...
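For reference, Spark's documented JDBC write options (`batchsize`, `numPartitions`) are the usual first knobs for a slow JDBC load. A hedged sketch of how they might be applied to a write like the one above; the values are illustrative starting points, not tested recommendations, and the target table name is a placeholder:

```python
# Spark JDBC writer options that commonly affect throughput on large writes.
JDBC_WRITE_OPTIONS = {
    "batchsize": "10000",     # rows per JDBC batch insert (Spark's default is 1000)
    "numPartitions": "8",     # parallel connections; match what the target DB can absorb
}

def apply_jdbc_options(writer, url, table, options=JDBC_WRITE_OPTIONS):
    """Chain the JDBC options onto a DataFrameWriter-like object and return it."""
    writer = writer.format("jdbc").option("url", url).option("dbtable", table)
    for key, value in options.items():
        writer = writer.option(key, value)
    return writer

# Usage on the real DataFrame (not executed here; "<target table>" is a placeholder):
# apply_jdbc_options(df_CorpBond.repartition(8).write,
#                    url_connector, "<target table>").mode("append").save()
```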

Data Engineering
databricks
performance
Latest Reply
vijaypodili
New Contributor III
  • 0 kudos

Hi @RiyazAliM, this is my DAG diagram. The file size is 3.5 GB, and in the future we need to load 14 GB as well.

8 More Replies
singhanuj2803
by Contributor
  • 5513 Views
  • 1 reply
  • 1 kudos

Apache Spark SQL query to get organization hierarchy

I'm currently diving deep into Spark SQL and its capabilities, and I'm facing an interesting challenge. I'm eager to learn how to write recursive CTE queries in Spark SQL, but after thorough research, it seems that Spark doesn't natively support recu...

Latest Reply
Alberto_Umana
Databricks Employee
  • 1 kudos

Hi @singhanuj2803, It is correct that Spark SQL does not natively support recursive Common Table Expressions (CTEs). However, there are some workarounds and alternative methods you can use to achieve similar results.   Using DataFrame API with Loops:...
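The loop-based workaround can be sketched with plain Python lists first; the column names (`employee`, `manager`) and the tiny org chart are hypothetical, and in PySpark the same level-by-level logic would union DataFrames inside the loop instead of extending a list:

```python
def expand_hierarchy(edges, root):
    """Emulate a recursive CTE: edges is a list of (employee, manager) pairs;
    returns (employee, level) pairs walking down from root."""
    levels = [(root, 0)]    # anchor member of the would-be recursive CTE
    frontier = [root]
    depth = 0
    while frontier:         # recursive member: join children onto the current frontier
        depth += 1
        children = [e for (e, m) in edges if m in frontier]
        levels.extend((c, depth) for c in children)
        frontier = children
    return levels

edges = [("b", "a"), ("c", "a"), ("d", "b")]
print(expand_hierarchy(edges, "a"))  # → [('a', 0), ('b', 1), ('c', 1), ('d', 2)]
```

With DataFrames, each iteration would be `frontier.join(edges_df, ...)` followed by a union into the accumulated result, stopping when the frontier is empty.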

singhanuj2803
by Contributor
  • 1252 Views
  • 1 reply
  • 1 kudos

How to run stored procedure in Azure Database for PostgreSQL using Azure Databricks Notebook

We have a stored procedure available in Azure Database for PostgreSQL, and we want to call, run, or execute the PostgreSQL stored procedures in Azure Databricks through a notebook. We are attempting to run PostgreSQL stored procedures through Azure Databr...

Latest Reply
Alberto_Umana
Databricks Employee
  • 1 kudos

To execute a PostgreSQL stored procedure from an Azure Databricks notebook, you need to follow these steps: Required libraries: you need to install the psycopg2 library, which is a PostgreSQL adapter for Python. This can be done using the %pip install...
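A minimal sketch of those steps; the host, database, credentials, and procedure name are hypothetical, and since the call needs psycopg2 (`%pip install psycopg2-binary`) plus a live server, the connection part is shown commented out:

```python
def build_call_sql(proc_name, n_args):
    """Build a parameterized CALL statement for a PostgreSQL stored procedure."""
    placeholders = ", ".join(["%s"] * n_args)
    return f"CALL {proc_name}({placeholders})"

sql = build_call_sql("my_stored_proc", 2)   # → "CALL my_stored_proc(%s, %s)"

# Not executed here; requires psycopg2 and network access to the server:
# import psycopg2
# conn = psycopg2.connect(host="myserver.postgres.database.azure.com",  # hypothetical
#                         dbname="mydb", user="myuser", password="***",
#                         sslmode="require")
# with conn, conn.cursor() as cur:
#     cur.execute(sql, ("arg1", "arg2"))   # parameters bound server-side
# conn.close()
```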

singhanuj2803
by Contributor
  • 4557 Views
  • 1 reply
  • 0 kudos

Resolved! How to execute SQL stored procedure in Azure Database for SQL Server using Azure Databricks Notebook

We have a stored procedure available in Azure Database for SQL Server, and we want to call, run, or execute the SQL Server stored procedures in Azure Databricks through a notebook. We are attempting to run SQL stored procedures through Azure Databricks no...

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @singhanuj2803, to execute a SQL stored procedure in Azure Databricks, you can follow these steps: Required libraries: you need to install the pyodbc library to connect to Azure SQL Database using ODBC. You can install it using the following comman...
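A minimal sketch of those steps for SQL Server; the server, database, credentials, and procedure name are hypothetical, and since the call needs pyodbc (`%pip install pyodbc`) plus an ODBC driver and a live server, the connection part is shown commented out:

```python
def build_exec_sql(proc_name, n_args):
    """Build a parameterized EXEC statement for a SQL Server stored procedure."""
    placeholders = ", ".join(["?"] * n_args)
    return f"EXEC {proc_name} {placeholders}".rstrip()

sql = build_exec_sql("dbo.my_stored_proc", 2)   # → "EXEC dbo.my_stored_proc ?, ?"

# Not executed here; requires pyodbc, the ODBC driver, and network access:
# import pyodbc
# conn = pyodbc.connect(
#     "DRIVER={ODBC Driver 18 for SQL Server};"
#     "SERVER=myserver.database.windows.net;DATABASE=mydb;"   # hypothetical
#     "UID=myuser;PWD=***"
# )
# cursor = conn.cursor()
# cursor.execute(sql, ("arg1", "arg2"))   # parameters bound via ? placeholders
# conn.commit()
# conn.close()
```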

h2p5cq8
by New Contributor III
  • 5857 Views
  • 3 replies
  • 3 kudos

Resolved! Deleting records from Delta table that are not in relational table

I have a Delta table that I keep in sync with a relational (SQL Server) table. The inserts and updates are easy, but checking for records to delete is prohibitively slow. I am querying the relational table for all primary key values, and any primary ke...

Latest Reply
hari-prasad
Valued Contributor II
  • 3 kudos

Let's understand the complexity behind this code when executed on a Delta table with Spark:
pks = spark.read.format("jdbc").option("query", "SELECT pk FROM sql_table_name").load()
delta_table = spark.read.table(delta_table_name)
r = target_table.f...
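One commonly suggested alternative, sketched here under the assumption of a recent Databricks Runtime (WHEN NOT MATCHED BY SOURCE requires DBR 12.1+): stage the source primary keys in a temp view and let a single MERGE delete the strays, instead of collecting keys to the driver. The table and view names are placeholders:

```python
def build_sync_delete_merge(delta_table, pk_view, pk_col="pk"):
    """MERGE that deletes Delta rows whose key no longer exists in the source view.

    WHEN NOT MATCHED BY SOURCE needs Databricks Runtime 12.1 or later.
    """
    return (
        f"MERGE INTO {delta_table} AS t "
        f"USING {pk_view} AS s ON t.{pk_col} = s.{pk_col} "
        f"WHEN NOT MATCHED BY SOURCE THEN DELETE"
    )

# In the notebook (not executed here), with `pks` being the JDBC DataFrame of keys:
# pks.createOrReplaceTempView("source_pks")
# spark.sql(build_sync_delete_merge("my_delta_table", "source_pks"))
```

This keeps the anti-join on the cluster and lets Delta skip untouched files, which is usually much faster than a driver-side NOT IN list.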

2 More Replies