Data Engineering

Forum Posts

by rocky5 • New Contributor III

12-26-2022 1:44:13 AM

5109 Views
2 replies
2 kudos

DLT UDF and C#

Hello, can I create a Spark function in .NET and use it in a DLT table? I would like to encrypt some data. The documentation uses Scala code as an example, but would it be possible to write a decryption/encryption function using C# and use it withi...

Latest Reply

Meghala
Valued Contributor II

12-26-2022 3:04:25 AM

2 kudos

It's not possible. SQL Server 2008 contains the SQL CLR runtime, which runs .NET languages.

1 More Replies

by Aravind_P04 • New Contributor II

12-15-2022 9:10:11 AM

5326 Views
3 replies
4 kudos

Clarification on merging multiple notebooks and other

1. Is there a feature to merge cells from one or more notebooks into another notebook?
2. Is there a feature to copy multiple cells from Excel into multiple cells in a notebook? Generally all Excel data is copied into one cel...

Latest Reply

youssefmrini
Databricks Employee

12-21-2022 1:30:44 AM

4 kudos

1) We can't merge cells right now.
2) We don't have this feature either.
3) We don't have multiple editing right now.
4) You will know only if you face an error. A notification will pop up.
5) You can't keep running the execution because the cells can be linke...

2 More Replies

by Aviral-Bhardwaj • Esteemed Contributor III

12-23-2022 5:05:04 AM

4371 Views
6 replies
30 kudos

DLT PipeLine Understanding

Hey guys, I hope you are doing very well today. I was going through some Databricks documentation and found the DLT documentation, but when I try to implement it, it does not work very well. Can anyone share the whole code with me step by step and...

Latest Reply

Meghala
Valued Contributor II

12-26-2022 2:40:37 AM

30 kudos

I'm also going through some Databricks documentation.

5 More Replies

by db-avengers2rul • Contributor II

12-20-2022 9:20:44 AM

7752 Views
6 replies
2 kudos

Resolved! Notebooks cells limit

Dear Team, is there a limit on the number of cells in a single notebook in Community Edition?

Latest Reply

Meghala
Valued Contributor II

12-26-2022 3:15:26 AM

2 kudos

There is no limit on the number of cells.

5 More Replies

by hello_world • Databricks Partner

12-24-2022 6:35:33 PM

5758 Views
3 replies
2 kudos

What exact difference does Auto Loader make?

I am new to Databricks, and here is one thing that confuses me. Since Spark Streaming is already capable of incremental loading through checkpointing, what difference does enabling Auto Loader make?

Latest Reply

Meghala
Valued Contributor II

12-26-2022 2:55:26 AM

2 kudos

Auto Loader provides a Structured Streaming source called cloudFiles. Given an input directory path on the cloud file storage, the cloudFiles source automatically processes new files as they arrive, with the option of also processing existing files i...

2 More Replies

by KuldeepChitraka • New Contributor III

12-25-2022 3:03:40 AM

12147 Views
4 replies
6 kudos

Error handling/exception handling in Notebook

What is a common practice for writing a notebook that includes error handling/exception handling? Is there an example that shows how a notebook should be written to include error handling, etc.?

Latest Reply

Meghala
Valued Contributor II

12-26-2022 2:53:35 AM

6 kudos

The runtime looks for handlers (try-catch) that are registered to handle such exceptions.
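As a rough illustration of the try/catch idea in a Python notebook, the sketch below wraps each step in a helper that logs the failure and re-raises it, so the notebook run is still marked as failed. The names `run_step`, `StepFailed`, and `parse_amount` are hypothetical and not part of any Databricks API; this is one possible pattern, not an official recommendation.

```python
import logging

logger = logging.getLogger("notebook")


class StepFailed(Exception):
    """Raised when a notebook step fails; the message carries the step name."""


def run_step(name, func, *args, **kwargs):
    """Run one notebook step, log the outcome, and re-raise any failure."""
    try:
        result = func(*args, **kwargs)
    except Exception as exc:
        logger.error("step %s failed: %s", name, exc)
        # Re-raise so the caller (or a scheduled job) sees the failure
        # instead of silently continuing with bad state.
        raise StepFailed(f"{name}: {exc}") from exc
    else:
        logger.info("step %s succeeded", name)
        return result


def parse_amount(text):
    """Hypothetical step: convert a string to a float."""
    return float(text)
```

Calling `run_step("parse", parse_amount, "12.5")` returns the parsed value, while a bad input such as `"abc"` raises `StepFailed` with the original `ValueError` attached as its cause.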

3 More Replies

by Aviral-Bhardwaj • Esteemed Contributor III

12-24-2022 6:53:48 AM

15827 Views
3 replies
25 kudos

Understanding Joins in PySpark/Databricks

In PySpark, a `join` operation combines rows from two or more datasets based on a common key. It allows you to merge data from different sources into a single dataset and potentially perform transformations on...

Latest Reply

Meghala
Valued Contributor II

12-26-2022 2:13:31 AM

25 kudos

very informative

2 More Replies

by SaraGHn • New Contributor III

11-22-2022 7:46:25 AM

2093 Views
1 reply
4 kudos

Error for sparkdl.xgboost import XgboostRegressor

I get the error: cannot import name 'resnet50' from 'keras.applications' (/local_disk0/.ephemeral_nfs/envs/pythonEnv-a3e7b0cc-064d-4585-abfd-6473ed1c1a5b/lib/python3.8/site-packages/keras/applications/__init__.py) It looks like the Keras.applications...

Latest Reply

Aviral-Bhardwaj
Esteemed Contributor III

12-24-2022 6:43:49 AM

4 kudos

Try installing these libraries via an init script. Sometimes this happens because of the Spark version in Databricks; libraries can conflict with the Runtime version.

by georgian2133 • New Contributor

12-24-2022 2:29:54 AM

2849 Views
0 replies
0 kudos

Getting error [DATATYPE_MISMATCH.BINARY_OP_DIFF_TYPES]

[DATATYPE_MISMATCH.BINARY_OP_DIFF_TYPES] Cannot resolve "(DocDate AND orderedhl)" due to data type mismatch: the left and right operands of the binary operator have incompatible types ("STRING" and "DECIMAL(38,6)").; line 67, pos 066. group by 67. or...

by joakon • New Contributor III

12-16-2022 1:10:40 PM

4534 Views
5 replies
1 kudos

Resolved! slow running query

Hi all, I would like to get some ideas on how to improve performance on a data frame with around 10M rows.
ADLS Gen2
df1 = source1, format: parquet (10M rows)
df2 = source2, format: parquet (10M rows)
df = join of df1 and df2, type = inner join
df.count() is ...

Latest Reply

Aviral-Bhardwaj
Esteemed Contributor III

12-23-2022 8:37:33 PM

1 kudos

Hey @raghu maremanda, did you get an answer? If yes, please update here so that other people can also get the solution.

4 More Replies

by test_user • New Contributor II

12-23-2022 6:07:04 AM

43492 Views
3 replies
1 kudos

How to explode an array column and repack the distinct values into one array in DB SQL?

Hi, I am new to DB SQL. I have a table where the array column (cities) contains multiple arrays and some have multiple duplicate values. I need to unpack the array values into rows so I can list the distinct values. The following query works for this...

Latest Reply

Aviral-Bhardwaj
Esteemed Contributor III

12-23-2022 8:35:31 PM

1 kudos

Try using SQL window functions here.
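To make the goal of the question concrete, here is a plain-Python sketch of the logic being asked for: unpack every array, drop duplicates, and repack the remaining values into a single array. The sample `cities` data and the function name `distinct_flatten` are made up for illustration. In Spark SQL, built-ins such as `explode`, `collect_set`, `flatten`, and `array_distinct` may cover the same steps, though how they fit the truncated query above is an assumption.

```python
def distinct_flatten(rows):
    """Flatten a list of arrays and keep each value once, in first-seen order."""
    seen = set()
    result = []
    for array in rows:
        # Treat a missing (None) array like an empty one.
        for value in array or []:
            if value not in seen:
                seen.add(value)
                result.append(value)
    return result


# Hypothetical sample data: one array of cities per table row.
cities = [["Oslo", "Paris"], ["Paris", "Lima", "Lima"], None, ["Oslo"]]
unique_cities = distinct_flatten(cities)
```

Unlike this sketch, set-style aggregates in SQL engines generally do not guarantee the order of the resulting array, so sort explicitly if order matters.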

2 More Replies

by Aviral-Bhardwaj • Esteemed Contributor III

12-22-2022 11:35:18 PM

10745 Views
6 replies
33 kudos

Resolved! Timezone understanding

Today I was working with time zone data. My Singapore users want to see their local time in the data, and USA users want to see theirs, but instead we are all getting UTC time. How can I solve this issue? Please guide. Data can be anything...

Latest Reply

Aviral-Bhardwaj
Esteemed Contributor III

12-23-2022 4:54:48 AM

33 kudos

I got it, guys: it was happening due to a library conflict. Your answers were really helpful; I tried all of them.
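For readers with the original question (showing one stored UTC time to users in different regions), a minimal Python sketch using the standard-library `zoneinfo` module might look like the following. The helper name `to_local` and the sample timestamp are illustrative assumptions. Spark SQL also provides `from_utc_timestamp` for a similar conversion on columns, which is not shown here.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def to_local(utc_dt, tz_name):
    """Convert a timezone-aware UTC datetime to the named IANA time zone."""
    if utc_dt.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return utc_dt.astimezone(ZoneInfo(tz_name))


# Store the value once in UTC, then convert only for display.
stored = datetime(2022, 12, 23, 4, 54, 48, tzinfo=timezone.utc)
singapore = to_local(stored, "Asia/Singapore")
new_york = to_local(stored, "America/New_York")
```

Both results describe the same instant; only the displayed wall-clock time differs, which is why keeping UTC in storage and converting per user is a common design choice.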

5 More Replies

by Ruby8376 • Valued Contributor

12-22-2022 7:58:46 AM

3915 Views
5 replies
1 kudos

Resolved! Databricks authentication

Hi there! We are planning to use Databricks-Tableau on-prem integration for reporting. Data would reside in Delta Lake, and using the Tableau-Databricks connector, users would be able to generate reports from that data. The question is: a private end point wi...

Latest Reply

Aviral-Bhardwaj
Esteemed Contributor III

12-22-2022 7:36:13 PM

1 kudos

And make sure that you are using the Spark SQL connection, or else it will always fail.

4 More Replies

by Sharmila04 • New Contributor

12-20-2022 7:15:50 PM

5086 Views
3 replies
0 kudos

DBFS File Browser Error RESOURCE_DOES_NOT_EXIST:

Hi, I am new to Databricks and was trying to follow a tutorial to upload a file and move it to a different folder. I used the DBFS option. While trying to move/rename the file I am getting the error below. Can you please help me understand why I am g...

Latest Reply

Aviral-Bhardwaj
Esteemed Contributor III

12-20-2022 8:55:45 PM

0 kudos

Use these three commands and it will work:
dbutils.fs.ls('dbfs:/FileStore/vehicle_data.csv')
dbutils.fs.ls('/dbfs/FileStore/vehicle_data.csv')
dbutils.fs.ls('/dbfs/dbfs/FileStore/vehicle_data.csv')
Thanks,
Aviral

2 More Replies

by pashashiz • New Contributor III

10-16-2022 11:51:27 AM

3179 Views
3 replies
10 kudos

Does Databricks plan to release runtime with Scala 2.13 support?

Latest Reply

pashashiz
New Contributor III

12-22-2022 10:53:52 PM

10 kudos

Hi @Vidula Khanna, the new version of Databricks still supports only Scala 2.12.

2 More Replies
