Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

jeremy98
by Contributor
  • 319 Views
  • 2 replies
  • 0 kudos

Submit new records from gold layer to postgres db

Hi community, what is the best practice, in your opinion, for loading data from the gold layer into a Postgres DB that serves "real-time" data to a UI? Thanks for any help!

Latest Reply
jeremy98
Contributor
  • 0 kudos

Hello @hari-prasad, thank you for your answer. Considering that we are using DLT pipelines, is it a good choice in this case? With DLT pipelines I don't actually see the metadata for these materialized tables, in this case the CDF statements.

1 More Replies
mkEngineer
by New Contributor III
  • 375 Views
  • 1 replies
  • 0 kudos

Refresh options on PBI from Databricks workflow using Azure Databricks

Hi! I have a workflow that includes my medallion architecture and DLT. Currently, I have a separate notebook for refreshing my Power BI semantic model, which works based on the method described in Refresh a PowerBI dataset from Azure Databricks. Howe...

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @mkEngineer, have you reviewed this documentation? https://learn.microsoft.com/en-us/azure/databricks/partners/bi/power-bi Also, I don't think serverless compute for notebooks will work for your connection with Power BI. You might need to set up a Se...

Anonymous
by Not applicable
  • 41965 Views
  • 7 replies
  • 12 kudos

How to connect to and extract data from SharePoint using Databricks (AWS)?

We are using Databricks (on AWS). We need to connect to SharePoint and extract and load data into a Databricks Delta table. Is there a possible solution for this?

Latest Reply
yliu
New Contributor III
  • 12 kudos

Wondering the same. Can we use the SharePoint REST API to download the file, save it to DBFS or an external location, and read it from there?

6 More Replies
Phani1
by Valued Contributor II
  • 270 Views
  • 1 replies
  • 1 kudos

Access the data from cross-cloud.

Hi all, we have a use case where we need to connect AWS Databricks to a GCP storage bucket to access the data. In Databricks we're trying to use external locations and storage credentials, but it seems like AWS Databricks only supports AWS storage b...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @Phani1, you can use Delta Sharing. That way you can create a share that allows you to access data stored in GCS, and it is governed by the UC permissions model. See: What is Delta Sharing? | Databricks on AWS. You can also use the legacy approach, but it doesn'...

smukhi
by New Contributor II
  • 3905 Views
  • 4 replies
  • 0 kudos

Encountering Error UNITY_CREDENTIAL_SCOPE_MISSING_SCOPE

As of this morning we started receiving the following error message on a Databricks job with a single Pyspark Notebook task. The job has not had any code changes in 2 months. The cluster configuration has also not changed. The last successful run of ...

Latest Reply
kasiviss42
New Contributor III
  • 0 kudos

I am also facing the same issue: [UNITY_CREDENTIAL_SCOPE_MISSING_SCOPE] Missing Credential Scope. Unity Credential Scope id not found in thread locals. The issue occurs when we try to list files using dbutils.fs.ls, and it also occurs at times when we ...

3 More Replies
svm_varma
by New Contributor II
  • 338 Views
  • 1 replies
  • 2 kudos

Resolved! Azure Databricks quota restrictions on compute in Azure for students subscription

Hi all, regarding creating clusters in Databricks: I'm getting a quota error. I have tried to increase quotas in the region where the resource is hosted but am still unable to increase the limit. Is there any workaround, or could you help select the right cluster ...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 2 kudos

Hi @svm_varma, you can try to create a Standard_DS3_v2 cluster. It has 4 cores, and your current subscription limit for the given region is 6 cores. The one you're trying to create needs 8 cores, hence the quota exceeded exception. You can also...

boskicl
by New Contributor III
  • 28313 Views
  • 6 replies
  • 10 kudos

Resolved! Table write command stuck "Filtering files for query."

Hello all. Background: I am having an issue today with Databricks using pyspark-sql and writing a Delta table. The DataFrame is built with an inner join between two tables, and that joined result is what I am trying to write to a Delta table. The table ...

Latest Reply
timo199
New Contributor II
  • 10 kudos

Even if I vacuum and optimize, it keeps getting stuck. Cluster type is r6gd.xlarge (min: 4, max: 6); driver type is r6gd.2xlarge.

5 More Replies
vijaypodili
by New Contributor III
  • 462 Views
  • 9 replies
  • 0 kudos

Databricks job taking a long time to load 2.3 GB of data from blob to SSMS table

df_CorpBond = spark.read.format("parquet").option("header", "true").load(f"/mnt/{container_name}/raw_data/dsl.corporate.parquet")
df_CorpBond.repartition(20).write \
    .format("jdbc") \
    .option("url", url_connector) \
    .option("dbtable", "MarkIt...

Data Engineering
datarbricks
performance
Latest Reply
vijaypodili
New Contributor III
  • 0 kudos

Hi @RiyazAli, this is my DAG diagram. The file size is 3.5 GB, and in the future we need to load 14 GB as well.

8 More Replies
singhanuj2803
by New Contributor III
  • 342 Views
  • 1 replies
  • 1 kudos

Apache Spark SQL query to get organization hierarchy

I'm currently diving deep into Spark SQL and its capabilities, and I'm facing an interesting challenge. I'm eager to learn how to write CTE recursive queries in Spark SQL, but after thorough research, it seems that Spark doesn't natively support recu...

Latest Reply
Alberto_Umana
Databricks Employee
  • 1 kudos

Hi @singhanuj2803, It is correct that Spark SQL does not natively support recursive Common Table Expressions (CTEs). However, there are some workarounds and alternative methods you can use to achieve similar results.   Using DataFrame API with Loops:...
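
The loop-based workaround mentioned in the reply can be sketched in plain Python as follows. This is a minimal illustration, not the reply's actual code: the `employees` rows and the `build_hierarchy` helper are hypothetical, and in Spark each pass of the loop would typically be a DataFrame join between the current level and the employee table, repeated until a pass produces no new rows.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical sample data: (employee_id, manager_id); None marks the root.
employees: List[Tuple[int, Optional[int]]] = [
    (1, None),
    (2, 1),
    (3, 1),
    (4, 2),
    (5, 4),
]


def build_hierarchy(rows: List[Tuple[int, Optional[int]]]) -> Dict[int, int]:
    """Return {employee_id: level}, expanding one level per loop pass."""
    # Seed step (the "anchor" of a recursive CTE): employees with no manager.
    levels = {emp: 0 for emp, mgr in rows if mgr is None}
    frontier = set(levels)
    depth = 0
    # Iterative step: find direct reports of the previous level until none remain.
    while frontier:
        depth += 1
        frontier = {
            emp for emp, mgr in rows
            if mgr in frontier and emp not in levels  # skip already-placed rows
        }
        for emp in frontier:
            levels[emp] = depth
    return levels


hierarchy = build_hierarchy(employees)
```

The `emp not in levels` check keeps the loop from running forever if the data contains a cycle, which is the usual risk when emulating recursion with a loop.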

singhanuj2803
by New Contributor III
  • 164 Views
  • 1 replies
  • 1 kudos

How to run stored procedure in Azure Database for PostgreSQL using Azure Databricks Notebook

We have stored procedures in Azure Database for PostgreSQL, and we want to execute them from an Azure Databricks notebook. We are attempting to run PostgreSQL stored procedures through Azure Databr...

Latest Reply
Alberto_Umana
Databricks Employee
  • 1 kudos

To execute a PostgreSQL stored procedure from an Azure Databricks notebook, you need to follow these steps: Required Libraries: You need to install the psycopg2 library, which is a PostgreSQL adapter for Python. This can be done using the %pip install...
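
As a rough sketch of the approach the reply describes, the helper below issues a `CALL` statement through any DB-API connection, such as one returned by `psycopg2.connect`. The procedure name `refresh_sales`, the connection parameters in the comment, and the helpers `build_call` and `call_procedure` are all hypothetical. `CALL` applies to procedures created with `CREATE PROCEDURE` (PostgreSQL 11 and later); functions are invoked with `SELECT` instead.

```python
import re
from typing import Any, Sequence

# Conservative pattern for an optionally schema-qualified identifier.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?$")


def build_call(proc_name: str, n_params: int) -> str:
    """Build a parameterized CALL statement using %s placeholders (psycopg2 style)."""
    if not _IDENT.match(proc_name):
        raise ValueError(f"unsafe procedure name: {proc_name!r}")
    placeholders = ", ".join(["%s"] * n_params)
    return f"CALL {proc_name}({placeholders})"


def call_procedure(conn: Any, proc_name: str, params: Sequence[Any] = ()) -> None:
    """Run the procedure on an open DB-API connection and commit."""
    cur = conn.cursor()
    try:
        cur.execute(build_call(proc_name, len(params)), tuple(params))
        conn.commit()
    finally:
        cur.close()


# Usage in a notebook (assumed names; requires the psycopg2 package):
#   import psycopg2
#   conn = psycopg2.connect(host="<host>", dbname="<db>", user="<user>",
#                           password="<password>", sslmode="require")
#   call_procedure(conn, "refresh_sales", ["2024-12-31"])
#   conn.close()
```

Parameter values are passed separately from the SQL text so the driver handles quoting; only the procedure name is interpolated, which is why it is validated first.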

singhanuj2803
by New Contributor III
  • 360 Views
  • 1 replies
  • 0 kudos

Resolved! How to execute SQL stored procedure in Azure Database for SQL Server using Azure Databricks Notebook

We have stored procedures in Azure Database for SQL Server, and we want to execute them from an Azure Databricks notebook. We are attempting to run SQL stored procedures through Azure Databricks no...

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @singhanuj2803, To execute a SQL stored procedure in Azure Databricks, you can follow these steps: Required Libraries: You need to install the pyodbc library to connect to Azure SQL Database using ODBC. You can install it using the following comman...
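
A comparable sketch for SQL Server, assuming a pyodbc connection: pyodbc accepts the ODBC call escape syntax `{CALL proc (?, ?)}` with `?` parameter markers. The helper names `odbc_call` and `run_procedure`, the procedure `dbo.usp_refresh`, and the connection string in the comment are hypothetical, and the procedure name is assumed to come from trusted code rather than user input.

```python
from typing import Any, List, Sequence


def odbc_call(proc_name: str, n_params: int) -> str:
    """Return an ODBC call-escape string such as '{CALL dbo.usp_x (?, ?)}'."""
    marks = ", ".join("?" * n_params)
    suffix = f" ({marks})" if n_params else ""
    return "{CALL " + proc_name + suffix + "}"


def run_procedure(conn: Any, proc_name: str, params: Sequence[Any] = ()) -> List[Any]:
    """Execute the procedure and return its rows if it produced a result set."""
    cur = conn.cursor()
    try:
        cur.execute(odbc_call(proc_name, len(params)), *params)
        # cursor.description is None when the statement returned no result set.
        rows = cur.fetchall() if cur.description else []
        conn.commit()
        return rows
    finally:
        cur.close()


# Usage in a notebook (assumed names; requires pyodbc and a SQL Server ODBC driver):
#   import pyodbc
#   conn = pyodbc.connect(
#       "DRIVER={ODBC Driver 18 for SQL Server};SERVER=<server>;"
#       "DATABASE=<db>;UID=<user>;PWD=<password>"
#   )
#   rows = run_procedure(conn, "dbo.usp_refresh", ["2024-12-31"])
#   conn.close()
```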

h2p5cq8
by New Contributor III
  • 648 Views
  • 3 replies
  • 3 kudos

Resolved! Deleting records from Delta table that are not in relational table

I have a Delta table that I keep in sync with a relational (SQL Server) table. The inserts and updates are easy but checking for records to delete is prohibitively slow. I am querying the relational table for all primary key values and any primary ke...

Latest Reply
hari-prasad
Valued Contributor II
  • 3 kudos

Let's understand the complexity behind this code when executed on a Delta table with Spark.
pks = spark.read.format("jdbc").option("query", "SELECT pk FROM sql_table_name").load()
delta_table = spark.read.table(delta_table_name)
r = target_table.f...

2 More Replies
filipniziol
by Contributor III
  • 829 Views
  • 9 replies
  • 4 kudos

Resolved! Magic Commands (%sql) Not Working with Databricks Extension for VS Code

Hi Community, I’ve encountered an issue with the Databricks Extension for VS Code that seems to contradict the documentation. According to the Databricks documentation, the extension supports magic commands like %sql when used with Databricks Connect:...

Latest Reply
jack533
New Contributor III
  • 4 kudos

In reality, this has nothing to do with grpc_wait_for_shutdown_with_timeout. We haven't yet implemented a solution for that, though we have an open issue for it, and it shouldn't stop SQL magic from working. Or is the "Interactive" tab where you encounter th...

8 More Replies
Ajay-Pandey
by Esteemed Contributor III
  • 1033 Views
  • 7 replies
  • 2 kudos

Databricks Job cluster for continuous run

Hi all, I have a situation where I want to run a job with a continuous trigger on a job cluster, but the cluster terminates and is re-created on every run within the continuous trigger. I just wanted to know if we have any option where I can use the same job cluster...

Latest Reply
Rishabh-Pandey
Esteemed Contributor
  • 2 kudos

@Ajay-Pandey, can't we achieve similar functionality with the help of cluster pools? Why don't you try cluster pools?

6 More Replies
