Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

by Arunsundar (New Contributor III)
  • 3284 Views
  • 4 replies
  • 4 kudos

The possibility of finding the workload dynamically and spinning up the cluster based on the workload

Hi Team, good morning. I would like to understand whether there is a possibility to determine the workload automatically through code (data load from a file to a table, determine the file size, a kind of benchmark that we can check), based on which we can ...

Latest Reply
pvignesh92
Honored Contributor
  • 4 kudos

Hi @Arunsundar Muthumanickam, when you say workload, I believe you might be handling various volumes of data between the Dev and Prod environments. If you are using a Databricks cluster and do not have much idea of how the volumes might turn out in differ...

3 More Replies
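A minimal sketch of one way to approach the question in this thread: measure the size of the incoming files and derive a cluster size from it. This assumes a Databricks notebook context where dbutils is available; the path, size thresholds, node type, and the idea of passing the resulting spec as a new_cluster block to the Databricks Jobs API are all hypothetical choices, not the poster's actual setup.

```python
# Sketch: pick a job-cluster size from the observed input data volume.
# The path, thresholds, and node type below are hypothetical placeholders.

input_path = "/mnt/landing/sales/"  # hypothetical source directory

# Sum the size of all files in the landing directory (bytes -> GB).
total_bytes = sum(f.size for f in dbutils.fs.ls(input_path))
total_gb = total_bytes / (1024 ** 3)

# Map the observed volume to a worker count (tune thresholds per workload).
if total_gb < 10:
    num_workers = 2
elif total_gb < 100:
    num_workers = 8
else:
    num_workers = 16

# A spec like this could then be supplied as the new_cluster block of a job
# definition instead of hard-coding a fixed cluster size.
new_cluster = {
    "spark_version": "13.3.x-scala2.12",
    "node_type_id": "i3.xlarge",
    "num_workers": num_workers,
}
print(new_cluster)
```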
by AK032716 (New Contributor)
  • 3380 Views
  • 2 replies
  • 2 kudos

Implement Auto Loader to ingest data into Delta Lake; I have 100 different tables with full load, append, and merge scenarios

I want to implement Auto Loader to ingest data into Delta Lake from 5 different source systems, and I have 100 different tables in each database. How do we dynamically address this by using Auto Loader with the trigger-once option - full load, append, merge sen...

Latest Reply
daniel_sahal
Esteemed Contributor
  • 2 kudos

You can create a generic notebook that will be parametrized with the table name/source system and then simply trigger the notebook with different parameters (for each table/source system). For parametrization you can use dbutils.widgets (https://docs...

1 More Replies
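A minimal sketch of the generic, widget-parametrized Auto Loader notebook described in the reply above. The widget names, paths, file format, and table naming scheme are hypothetical placeholders; the same notebook would be triggered once per source system/table with different parameter values.

```python
# Sketch: one generic notebook, parametrized per source system and table,
# that ingests files into a Delta table with Auto Loader (cloudFiles).
# Widget values would be supplied per run (e.g. from a job or a driver notebook).

dbutils.widgets.text("source_system", "erp")      # hypothetical default
dbutils.widgets.text("table_name", "customers")   # hypothetical default

source_system = dbutils.widgets.get("source_system")
table_name = dbutils.widgets.get("table_name")

# Hypothetical layout: landing files, schema tracking, and checkpoints per table.
source_path = f"/mnt/landing/{source_system}/{table_name}/"
schema_path = f"/mnt/autoloader/schemas/{source_system}/{table_name}/"
checkpoint_path = f"/mnt/autoloader/checkpoints/{source_system}/{table_name}/"
target_table = f"{source_system}_{table_name}"

df = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "parquet")            # source file format (assumed)
    .option("cloudFiles.schemaLocation", schema_path)  # where Auto Loader tracks schema
    .load(source_path)
)

# trigger(availableNow=True) processes the available files and stops,
# the newer equivalent of the "trigger once" pattern mentioned in the question.
(
    df.writeStream.format("delta")
    .option("checkpointLocation", checkpoint_path)
    .trigger(availableNow=True)
    .toTable(target_table)
)
```

Merge scenarios would follow the same pattern, swapping the plain append for a foreachBatch function that performs a MERGE into the target table.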
by Siddhesh2525 (New Contributor III)
  • 5714 Views
  • 2 replies
  • 6 kudos

How to pass dynamic values in Databricks

I have separate column values defined in 13 different notebooks, and I want to merge them into 1 Databricks notebook and pass dynamic parameters, so that I can run everything in a single Databricks notebook.

Latest Reply
Prabakar
Databricks Employee
  • 6 kudos

Hi @siddhesh Bhavar, you can use widgets with the %run command to achieve this: https://docs.databricks.com/notebooks/widgets.html#use-widgets-with-run
%run /path/to/notebook $X="10" $Y="1"

1 More Replies
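A minimal sketch of the pattern in the reply above: the called notebook declares widgets and reads the values passed in, while the calling notebook supplies them with %run. The notebook path and widget names are hypothetical.

```python
# Callee notebook (hypothetical path /Shared/load_table):
# declare widgets and read the values supplied by the caller.
dbutils.widgets.text("X", "")
dbutils.widgets.text("Y", "")

x = dbutils.widgets.get("X")
y = dbutils.widgets.get("Y")
print(f"Running with X={x}, Y={y}")

# Caller notebook cell (notebook magic, shown here as a comment):
# %run /Shared/load_table $X="10" $Y="1"
```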
by max522over (New Contributor II)
  • 16077 Views
  • 3 replies
  • 0 kudos

Resolved! I've set the partition mode to nonstrict in Hive but Spark is not seeing it

I've got a table I want to add some data to, and it's partitioned. I want to use dynamic partitioning, but I get this error: org.apache.spark.SparkException: Dynamic partition strict mode requires at least one static partition column. To turn this off ...

Latest Reply
max522over
New Contributor II
  • 0 kudos

I got it working. This was exactly what I needed. Thank you @Peyman Mohajerian

2 More Replies
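For readers hitting the same error, a minimal sketch of setting the dynamic-partition mode on the Spark session itself, which is typically what Spark needs rather than a change made only on the Hive side. The table and column names are hypothetical.

```python
# Sketch: enable non-strict dynamic partitioning for the current Spark session,
# then insert into a partitioned table without naming any static partition.
spark.conf.set("hive.exec.dynamic.partition", "true")
spark.conf.set("hive.exec.dynamic.partition.mode", "nonstrict")

# Hypothetical tables: the partition column (sale_date) is resolved dynamically.
spark.sql("""
    INSERT INTO TABLE sales_partitioned
    PARTITION (sale_date)
    SELECT id, amount, sale_date FROM staging_sales
""")
```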