Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

cmilligan
by Contributor II
  • 5227 Views
  • 3 replies
  • 2 kudos

Resolved! Orchestrate run of a folder

I'm needing to run the contents of a folder, which can change over time. Is there a way to set up a notebook that can orchestrate running all notebooks in a folder? My thought was if I could retrieve a list of the notebooks I could create a loop to ru...

Latest Reply
Hubert-Dudek
Databricks MVP
  • 2 kudos

List all notebooks by making an API call and then run them using dbutils.notebook.run:

import requests
ctx = dbutils.notebook.entry_point.getDbutils().notebook().getContext()
host_name = ctx.tags().get("browserHostName").get()
host_token = ctx.apiToke...

2 More Replies
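A fuller sketch of the approach in the accepted answer, assuming the Workspace API 2.0 list endpoint (objects carrying object_type and path fields) and that the code runs inside a Databricks notebook, where dbutils is defined. The helper and URL shape are illustrative, not taken verbatim from the thread:

```python
import json
import urllib.parse
import urllib.request

def notebook_paths(objects):
    """Keep only notebook paths from a Workspace API list response."""
    return [o["path"] for o in objects if o.get("object_type") == "NOTEBOOK"]

def run_folder(host, token, folder, timeout=3600):
    """List a workspace folder and run each notebook in it sequentially.
    Must be called inside a Databricks notebook, where `dbutils` exists."""
    url = (f"https://{host}/api/2.0/workspace/list?path="
           + urllib.parse.quote(folder))
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(req) as resp:
        objects = json.load(resp).get("objects", [])
    for path in notebook_paths(objects):
        dbutils.notebook.run(path, timeout)  # noqa: F821 (Databricks-provided)
```

Because the folder is listed on every run, notebooks added or removed later are picked up automatically, which is the point of the question.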
al_joe
by Contributor
  • 8049 Views
  • 5 replies
  • 5 kudos

Resolved! How do I clone a repo in Community Edition?

The e-learning videos on DBacademy say we should click on "Repos" and "Add Repo". I cannot find this in my Community Edition UI. I am a little frustrated that there are so many different versions of the UI and many videos show UI options that we cannot ...

Latest Reply
Psybelo
New Contributor II
  • 5 kudos

Hello, just import the .dbc file directly into your user workspace, as explained by Databricks here: https://www.databricks.training/step-by-step/importing-courseware-from-github/. That is the simplest way.

4 More Replies
Gim
by Contributor
  • 75725 Views
  • 3 replies
  • 9 kudos

Best practice for logging in Databricks notebooks?

What is the best practice for logging in Databricks notebooks? I have a bunch of notebooks that run in parallel through a workflow. I would like to keep track of everything that happens such as errors coming from a stream. I would like these logs to ...

Latest Reply
karthik_p
Databricks Partner
  • 9 kudos

@Gimwell Young​ As @Debayan Mukherjee​ mentioned, if you configure verbose logging at the workspace level, logs will be delivered to the storage bucket you provided during configuration. From there you can pull logs into any of your licensed log mo...

2 More Replies
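Beyond workspace-level verbose logging, a common pattern for notebook-level logs is one named logger per notebook appending to a shared file. A minimal sketch; writing the file to DBFS or mounted storage so parallel workflow runs can be traced afterwards is an assumption, not something prescribed in the thread:

```python
import logging

def get_notebook_logger(name, log_path):
    """Return a named logger that appends to log_path.
    One logger per notebook lets parallel runs be told apart later."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid stacking handlers on notebook re-runs
        handler = logging.FileHandler(log_path)
        handler.setFormatter(logging.Formatter(
            "%(asctime)s %(name)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger
```

Each notebook in the workflow would call this with its own name, so errors from a stream land in the file tagged with the notebook that raised them.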
Gopi0403
by Databricks Partner
  • 5981 Views
  • 7 replies
  • 0 kudos

Issue creating a new workspace: I cannot create a new workspace in Databricks using Quickstart. When I am creating the workspace I ge...

Issue creating a new workspace: I cannot create a new workspace in Databricks using Quickstart. When I am creating the workspace I get the Rollback failed error from AWS even though I have given all the required information. Kindly he...

Latest Reply
Prabakar
Databricks Employee
  • 0 kudos

Hi @Gopichandran N​, could you please add more information on the issue that you are facing? Could you please add a screenshot of the error?

6 More Replies
-werners-
by Esteemed Contributor III
  • 3604 Views
  • 2 replies
  • 17 kudos

Autoloader: how to avoid overlap in files

I'm thinking of using Autoloader to process files being put on our data lake. Let's say, e.g., every 15 minutes a parquet file is written. These files, however, contain overlapping data. Now, every 2 hours I want to process the new data (autoloader) and...

Latest Reply
Hubert-Dudek
Databricks MVP
  • 17 kudos

What about foreachBatch and then MERGE? Alternatively, run another process that will clean up the overlapping updates using a window function, as you said.

1 More Replies
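The window-function cleanup mentioned in the reply boils down to keeping one row per business key, ordered by a timestamp. In Spark that would be row_number() over a window partitioned by the key (or a MERGE inside foreachBatch); a pure-Python sketch of the same logic, with illustrative field names:

```python
def latest_per_key(rows, key, ts):
    """Deduplicate overlapping batches: keep the newest row per key.
    `rows` are dicts; `key` and `ts` name the id and timestamp fields."""
    best = {}
    for row in rows:
        k = row[key]
        if k not in best or row[ts] > best[k][ts]:
            best[k] = row
    return list(best.values())
```

Run over the union of overlapping files, this leaves exactly one record per key, which is what the 2-hourly batch needs before writing downstream.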
Data_Engineer3
by Contributor III
  • 4855 Views
  • 1 replies
  • 7 kudos

Move folder from dbfs location to user workspace directory in azure databricks

I need to move a group of files (Python or Scala files) or a folder from a DBFS location to the user workspace directory in Azure Databricks to do testing on the files. It's very difficult to upload each file one by one into the user workspace directory, so is it...

Latest Reply
-werners-
Esteemed Contributor III
  • 7 kudos

dbutils.fs.mv or dbutils.fs.cp can help you.

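To move a whole folder rather than one file at a time, dbutils.fs.cp with recurse=True can copy everything in one call; a small helper can map each DBFS path to its workspace counterpart. A sketch under stated assumptions: the helper is illustrative, and the example paths (including the file:/Workspace prefix for workspace files) are hypothetical, not from the thread:

```python
def workspace_destination(src_path, dbfs_root, workspace_root):
    """Map a DBFS path under dbfs_root to the matching path under
    workspace_root, preserving the relative layout."""
    if not src_path.startswith(dbfs_root):
        raise ValueError(f"{src_path} is not under {dbfs_root}")
    rel = src_path[len(dbfs_root):].lstrip("/")
    return workspace_root.rstrip("/") + "/" + rel

# Inside a Databricks notebook the whole folder can then be copied in one go,
# e.g. (hypothetical paths):
# dbutils.fs.cp("dbfs:/tmp/scripts", "file:/Workspace/Users/me/scripts", recurse=True)
```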
weldermartins
by Honored Contributor
  • 4504 Views
  • 3 replies
  • 13 kudos

Resolved! SCD type 2

Hey guys, I don't know if I'm just tired, but I ask for your help: I don't understand where the difference in the number of fields comes from. Thanks! I'm replicating SCD type 2 based on this documentation: https://docs.delta.io/latest/delta-update.html#slowly-chan...

SCD 2
Latest Reply
weldermartins
Honored Contributor
  • 13 kudos

@Werner Stinckens​ ?

2 More Replies
Chris_Konsur
by New Contributor III
  • 3665 Views
  • 2 replies
  • 3 kudos

Resolved! Configuring Autoloader in file notification mode to access premium Blob Storage

First, I tried to configure Autoloader in file notification mode to access the premium blob storage 'databrickspoc1' (PREMIUM, ADLS Gen2). I get this error: com.microsoft.azure.storage.StorageException. I checked my storage account -> N...

Latest Reply
Hubert-Dudek
Databricks MVP
  • 3 kudos

When you created a premium account, have you chosen "Premium account type" as "File shares"? It should be "Block blobs".

1 More Replies
Priya_Mani
by New Contributor II
  • 2865 Views
  • 3 replies
  • 4 kudos

Databricks Notebook dataframe loading duplicate data in SQL table

Hi, I am trying to load data from the data lake into a SQL table using a "SourceDataFrame.write" operation in a notebook using Apache Spark. This seems to be loading duplicates at random times. The logs don't give much information and I am not sure what else t...

Latest Reply
-werners-
Esteemed Contributor III
  • 4 kudos

Can you elaborate a bit more on this notebook? And also, what Databricks runtime version?

2 More Replies
User16844588229
by Databricks Employee
  • 14739 Views
  • 9 replies
  • 4 kudos

docs.databricks.com

Navigate and discover content more efficiently with Search in Databricks. Hi all, Justin Kim here, I'm the Databricks product manager responsible for content organization and navigation in our product, which includes Search. Great to see you on the Com...

Search bar Search modal
Latest Reply
karthik_p
Databricks Partner
  • 4 kudos

@Justin Kim​ Thank you for the quick reply. Usually "Last Modified" means recent changes, right (that can be the last 24 hrs or a cap limit that we add), whereas "Anytime" should show all notebooks or tables from the start. That is where I got confused.

8 More Replies
Sandy21
by New Contributor III
  • 1884 Views
  • 1 replies
  • 3 kudos

Questions on running the REST API command in Databricks to create a job

What happens when the jobs/create REST API command is run multiple times (say 3 times) with the same JSON configuration? Will 3 jobs be created with the same name, or only 1?

Latest Reply
Debayan
Databricks Employee
  • 3 kudos

Hi @Santhosh Raj​ , logically only one job should be created.

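If accidental duplicates from repeated jobs/create calls are a concern, the caller can check jobs/list for an existing name before creating. A minimal sketch: the jobs array shape with a nested settings.name follows the Jobs API list response, and the guard itself is an illustrative pattern, not something from the thread:

```python
def find_job_by_name(jobs, name):
    """Return the first job whose settings.name matches, else None.
    `jobs` is the `jobs` array from a jobs/list response."""
    for job in jobs:
        if job.get("settings", {}).get("name") == name:
            return job
    return None
```

A deployment script would call jobs/list, pass the result through this helper, and only POST to jobs/create (or call jobs/reset to update) when no match is found.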
Dicer
by Valued Contributor
  • 8471 Views
  • 2 replies
  • 1 kudos

Resolved! PARSE_SYNTAX_ERROR: Syntax error at or near 'VACUUM'

I tried to VACUUM a Delta table, but there is a syntax error. Here is the code:

%sql
set spark.databricks.delta.retentionDurationCheck.enabled = False
VACUUM test_deltatable

Latest Reply
Ravi
Databricks Employee
  • 1 kudos

@Cheuk Hin Christophe Poon​ Missing semi-colon at the end of line 2?

%sql
set spark.databricks.delta.retentionDurationCheck.enabled = False;
VACUUM test_deltatable

1 More Replies
a2_ish
by New Contributor II
  • 3395 Views
  • 2 replies
  • 2 kudos

How to write the delta files for a managed table? How can I define the sink?

I have tried the below code to write data to a Delta table and save the delta files to a sink. I tried using Azure Storage as the sink but I get a "not enough access" error. I can confirm that I have enough access to Azure Storage; however, I can run the below...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Ankit Kumar​, does @Hubert Dudek​'s response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? We'd love to hear from you. Thanks!

1 More Replies
pret
by New Contributor II
  • 5157 Views
  • 4 replies
  • 0 kudos

How can I run a scala command line in databricks?

I wish to run a Scala command, which I believe would normally be run from a Scala command line rather than from within a notebook. It happens to be:

scala [-cp scalatest-<version>.jar:...] org.scalatest.tools.Runner [arguments]

(scalatest_2.12__3.0.8.j...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @David Vardy​, hope all is well! Just wanted to check in if you were able to resolve your issue. Would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thanks...

3 More Replies