Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

jim12321 (New Contributor II)
1834 Views · 0 replies · 0 kudos

Foreign Catalog SQL Server Dynamic Port

When creating a Foreign Catalog SQL Server connection, a port number is required. However, many SQL Servers have dynamic ports and the port number keeps changing. Is there a solution for this? In most common cases, it should allow an instance name instea...

Data Engineering
Foreign Catalog
JDBC
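For context, a minimal sketch (run from a notebook) of the Lakehouse Federation connection DDL this post is about. Host, port, and credentials below are placeholders, and the static port option is exactly where dynamic ports become a problem:

    # Hedged sketch; replace placeholders with real values (ideally Databricks secrets).
    spark.sql("""
        CREATE CONNECTION IF NOT EXISTS sqlserver_conn TYPE sqlserver
        OPTIONS (
          host 'myserver.example.com',  -- placeholder
          port '1433',                  -- static port: a dynamic port keeps invalidating this
          user 'svc_user',              -- placeholder
          password '<password>'         -- placeholder
        )
    """)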
397973 (New Contributor III)
8622 Views · 2 replies · 0 kudos

Spark submit - not reading one of my --py-files arguments

Hi. In Databricks workflows, I submit a Spark job (Type = "Spark Submit") and a bunch of parameters, starting with --py-files. This works when all the files are in the same S3 path, but I get errors when I put a "common" module in a different S3 pat...

Latest Reply
MichTalebzadeh (Valued Contributor) · 0 kudos

The below is catered for YARN mode. If your application code primarily consists of Python files and does not require a separate virtual environment with specific dependencies, you can use the --py-files argument in spark-submit: spark-submit --verbose ...

1 More Replies
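A hedged workaround sketch for this thread: instead of listing the "common" module in --py-files, add it from its own S3 path at runtime with SparkContext.addPyFile. The bucket paths below are hypothetical:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    # Ships the archive to executors and puts it on the driver's sys.path too.
    spark.sparkContext.addPyFile("s3://my-bucket/shared/common.zip")  # hypothetical path

    import common  # importable only after addPyFile has run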
sandeep91 (New Contributor III)
8379 Views · 5 replies · 2 kudos

Resolved! Databricks Job: Package Name and EntryPoint parameters for the Python Wheel file

I have created a Python wheel file with a simple file structure and uploaded it as a cluster library, and I was able to run the packages in a notebook. But when I try to create a job using the Python wheel, provide the package name, and run the task, it fails...

Latest Reply
AndréSalvati (New Contributor III) · 2 kudos

There you can see a complete template project with (the new!) Databricks Asset Bundles tool and a Python wheel task. Please follow the instructions for deployment: https://github.com/andre-salvati/databricks-template

4 More Replies
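For reference on the wheel-task question above, a hedged sketch of how the job fields map to packaging metadata (all names hypothetical): the distribution name is the task's "Package name", and a console_scripts entry point supplies the "Entry point":

    from setuptools import setup, find_packages

    setup(
        name="my_package",                    # -> job task "Package name"
        version="0.1.0",
        packages=find_packages(),
        entry_points={
            "console_scripts": [
                "main = my_package.app:run",  # "main" -> job task "Entry point"
            ]
        },
    )

With that metadata, the wheel task resolves "main" and calls my_package.app.run() when the job starts.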
DavMes (New Contributor)
4078 Views · 2 replies · 0 kudos

Databricks Asset Bundles - error in demo project

Hi, I am using v0.205.0 of the CLI. I wanted to test the demo project (databricks bundle init) of Databricks Asset Bundles, but I am getting an error after databricks bundle deploy (validate is ok): artifacts.whl.AutoDetect: Detec...

Data Engineering
DAB
Databricks Asset Bundles
Latest Reply
AndréSalvati (New Contributor III) · 0 kudos

There you can see a complete template project with Databricks Asset Bundles and a Python wheel task. Please follow the instructions for deployment: https://github.com/andre-salvati/databricks-template

1 More Replies
jwilliam (Contributor)
2698 Views · 2 replies · 1 kudos

Resolved! [BUG] Databricks install WHL as JAR in Python Wheel Task?

I'm using a Python wheel task in a Databricks job with wheel dependencies. However, the cluster installed the dependencies as JARs instead of wheels. Is this expected behavior or a bug?

Latest Reply
AndréSalvati (New Contributor III) · 1 kudos

There you can see a complete template project with a Python wheel task and Databricks Asset Bundles. Please follow the instructions for deployment: https://github.com/andre-salvati/databricks-template

1 More Replies
GGG_P (New Contributor III)
6393 Views · 3 replies · 0 kudos

Databricks Tasks Python wheel: How to access JobID & runID?

I'm using Python (as a Python wheel application) on Databricks. I deploy and run my jobs using dbx. I defined some Databricks Workflows using Python wheel tasks. Everything is working fine, but I'm having issues extracting "databricks_job_id" & "databricks_ru...

Latest Reply
AndréSalvati (New Contributor III) · 0 kudos

There you can see a complete template project with Databricks Asset Bundles and a Python wheel task. Please follow the instructions for deployment: https://github.com/andre-salvati/databricks-template. In particular, take a look at the workflow definitio...

2 More Replies
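One common pattern for the question above (a hedged sketch; the parameter names are hypothetical): pass the dynamic value references {{job.id}} and {{job.run_id}} as wheel-task parameters in the job definition, then parse them in the entry point:

    import argparse

    def run():
        parser = argparse.ArgumentParser()
        parser.add_argument("--job-id")   # set to "{{job.id}}" in the task parameters
        parser.add_argument("--run-id")   # set to "{{job.run_id}}" in the task parameters
        args = parser.parse_args()
        print(f"job_id={args.job_id} run_id={args.run_id}")

    if __name__ == "__main__":
        run()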
Oliver_Angelil (Valued Contributor II)
8473 Views · 2 replies · 3 kudos

Resolved! Cell by cell execution of notebooks with VS code

I have the Databricks VS Code extension set up to develop and run jobs remotely (with Databricks Connect). I enjoy working in notebooks within the native Databricks workspace, especially for exploratory work, because I can execute blocks of code step b...

Latest Reply
awadhesh14 (New Contributor II) · 3 kudos

Hi folks, is there a version upgrade for the resolution to this?

1 More Replies
DylanStout (Contributor)
9368 Views · 9 replies · 2 kudos

Resolved! Problem with tables not showing

When I use the current "result table" option, it does not show the table results. This occurs when running SQL commands and the display() function for DataFrames. It is not linked to a Databricks runtime, since it occurs on all runtimes. I am not allow...

Latest Reply
DylanStout (Contributor) · 2 kudos

Resizing the table causes the table to show its records in the cell.

8 More Replies
Data_Engineer3 (Contributor III)
1434 Views · 1 reply · 0 kudos

Identify the associated notebook for the application running from the Spark UI

In the Spark UI, I can see the application running with the application ID. From the Spark UI, can I see which notebook is running as that application? Is this possible? I am interested in learning more about the jobs and stages, how it works ...

Data Engineering
Databricks
Latest Reply
Hubert-Dudek (Esteemed Contributor III) · 0 kudos

See https://spark.apache.org/docs/3.1.1/api/python/reference/api/pyspark.SparkContext.setJobDescription.html. spark.sparkContext.setJobDescription("my name") will make your life easier. Just put it in the notebook. You should also put it after each action (show, count, ...

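A short usage sketch of the suggestion above (the method lives on the SparkContext; the label is hypothetical):

    # Tag subsequent Spark jobs so they are identifiable in the Spark UI.
    spark.sparkContext.setJobDescription("notebook: my_etl_notebook")

    df = spark.range(1_000)
    df.count()  # this action's job now carries the description in the Spark UI
    # Re-set the description around each action if you want distinct labels per step.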
Govind3331 (New Contributor)
1997 Views · 1 reply · 0 kudos

How to capture/identify incremental rows when tables have no primary key columns

Q1. My source is SQL Server tables. I want to identify only the latest records (incremental rows) and load those into the bronze layer. Instead of a full load to ADLS, we want to capture only incremental rows and load them into ADLS for further processing. NOTE: Prob...

Latest Reply
Slaw (New Contributor II) · 0 kudos

Hi, what kind of SQL source is it? MS SQL, MySQL, PostgreSQL?

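Whatever the source flavor, one generic technique for the question above (a hedged sketch, not from the thread; table names are hypothetical): hash the full row to synthesize a change-detection key, then anti-join against rows already ingested:

    from pyspark.sql.functions import sha2, concat_ws

    src = spark.read.table("source_snapshot")   # hypothetical source extract
    tgt = spark.read.table("bronze.events")     # assumes bronze already stores row_hash

    hashed = src.withColumn("row_hash", sha2(concat_ws("||", *src.columns), 256))
    incremental = hashed.join(tgt.select("row_hash"), "row_hash", "left_anti")
    incremental.write.mode("append").saveAsTable("bronze.events")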
Etyr (Contributor)
2268 Views · 2 replies · 0 kudos

Cannot change databricks-connect port

I have a Databricks cluster with the 10.4 runtime. When I run databricks-connect configure I put in all the information needed, and with the default port 15001, databricks-connect test works. But changing the port to 443 does not work; I tried to do a ...

Data Engineering
databricks-connect
port
pyspark
spark
Latest Reply
Etyr (Contributor) · 0 kudos

@daniel_sahal Thank you for the reply; indeed, port 443 is used by a lot of applications and could be problematic. But I also tried port `15002` and it didn't work. No port other than the default one works.

1 More Replies
Olaoye_Somide (New Contributor III)
2003 Views · 1 reply · 1 kudos

Avoiding Duplicate Ingestion with Autoloader and Migrated S3 Data

Hi team, we recently migrated event files from our previous S3 bucket to a new one. While using Autoloader for batch ingestion, we've encountered an issue where the migrated data is being processed as new events. This leads to duplicate records in...

Data Engineering
autoloader
RocksDB
S3
Latest Reply
daniel_sahal (Esteemed Contributor) · 1 kudos

@Olaoye_Somide Changing the source means that Autoloader discovers the files as new (technically they are in a new location, so they are indeed new). To overcome the issue you can use the modifiedAfter property.

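A hedged sketch of the suggested fix (paths and the cutover timestamp are hypothetical): the modifiedAfter option makes Auto Loader skip files last modified before the given time, so pre-migration copies are ignored:

    df = (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .option("cloudFiles.schemaLocation", "s3://new-bucket/_schemas/events")  # hypothetical
        .option("modifiedAfter", "2024-03-01 00:00:00.000000 UTC+0")             # migration cutover
        .load("s3://new-bucket/events/")                                         # hypothetical
    )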
joss (New Contributor II)
1063 Views · 1 reply · 1 kudos

NPE in CreateJacksonParser on Databricks 14.3 LTS with Spark Structured Streaming

Hello, I have a Spark Structured Streaming job: the source is a Kafka topic in JSON. It works fine with Databricks 14.2, but when I change to 14.3 LTS I get an NPE in CreateJacksonParser: Caused by: NullPointerException: at org.apache.spark.sql.catalys...

Latest Reply
joss (New Contributor II) · 1 kudos

Hi, thank you for your quick reply. I found the problem: val newSchema = spark.read.json(df.select("data").as[String]).schema. If "data" has one null value, it works in 14.2, but in 14.3 LTS this function returns an NPE. I don't know if it is a bug.

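A hedged Python rendering of the workaround implied above (the thread's snippet is Scala): drop null payloads before letting spark.read.json infer the schema. The column name follows the thread; everything else is an assumption:

    from pyspark.sql.functions import col

    payload = df.select("data").where(col("data").isNotNull())
    new_schema = spark.read.json(payload.rdd.map(lambda r: r[0])).schema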
lawrence009 (Contributor)
3105 Views · 5 replies · 1 kudos

Contact Support re Billing Error

How do I contact billing support? I am billed through AWS Marketplace and noticed last month the SQL Pro discount is not being reflected in my statement.

Latest Reply
santiagortiiz (New Contributor III) · 1 kudos

Hi, could anybody provide a contact email? I have sent emails to many contacts described in the support page here and in AWS, but no response from any channel. My problem is that Databricks charged me for the resources used during a free trial, what i...

4 More Replies
