Data Engineering

Forum Posts

by jonathan-dufaul • Valued Contributor

01-06-2023 1:48:40 PM

4148 Views
6 replies
6 kudos

Why is writing to MSSQL Server 12.0 so slow directly from spark but nearly instant when I write to a csv and read it back

I have a dataframe that inexplicably takes forever to write to an MS SQL Server even though other dataframes, even much larger ones, write nearly instantly. I'm using this code: my_dataframe.write.format("jdbc") .option("url",sqlsUrl) .optio...

Latest Reply

plondon
New Contributor II

07-24-2024 4:00:47 AM

6 kudos

Had a similar issue. I can load 1-4 million rows in 1 minute via SSIS ETL on SQL Server. The table is 15 fields long. Looking at your code, it seems you have many fields, but nothing like 300-400 fields, which can affect performance. You can check SQL Server ...

5 More Replies

by Arnold_Souza • New Contributor III

03-22-2023 2:56:49 PM

8527 Views
4 replies
1 kudos

Connect Databricks to a database protected by a firewall

We are facing a situation, and I would like to understand from the Databricks side what the best practice is. Question: Is it possible to have a cluster with a fixed global IP on Databricks? Details: We have a vendor that has a SQL Server da...

Latest Reply

Anonymous
Not applicable

04-01-2023 10:18:01 AM

1 kudos

@Arnold Souza If you file a support ticket with Azure support, they can help customize the VNet by unlocking it, as the Azure Databricks resources are deployed in a managed resource group. Your plan B should also be the way to go if option 1 does not work as e...

3 More Replies

by Anonymous • Not applicable

11-28-2022 6:54:42 PM

1701 Views
0 replies
0 kudos

The CDC Logs from AWS DMS not apply correctly

I have a DMS task that processes the full-load and ongoing-replication tasks from source (MSSQL) to target (AWS S3), then use Delta Lake to handle the CDC logs. I have a notebook that inserts data into MSSQL continuously (with id as primary key), then d...

by Carlton • Contributor

10-13-2022 9:00:08 AM

5716 Views
8 replies
1 kudos

Resolved! How to Use the CharIndex with Databricks SQL

When applying the following T-SQL I don't get any errors on MS SQL Server: SELECT DISTINCT * FROM dbo.account LEFT OUTER JOIN dbo.crm2cburl_lookup ON account.Id = CRM2CBURL_Lookup.[Key] LEFT OUTER JOIN dbo.organizations ON CRM2CBURL_Lookup.CB_UR...

Latest Reply

-werners-
Esteemed Contributor III

10-14-2022 1:14:41 AM

1 kudos

CROSS APPLY is not a function in Databricks SQL.

7 More Replies

Databricks Community
