Data Engineering

Forum Posts

Sorted by:

Start a conversation

by brickster_2018 • Databricks Employee

06-25-2021 3:33:57 PM

5255 Views
1 replies
0 kudos

What is the advantage of using the new Global init script compared to the legacy init script

Data Engineering

5255 Views
1 replies
0 kudos

06-25-2021 3:33:57 PM

View Replies

Latest Reply

aladda
Databricks Employee

06-25-2021 3:57:44 PM

0 kudos

Global: run on every cluster in the workspace. They can help you to enforce consistent cluster configurations across your workspace. Use them carefully because they can cause unanticipated impacts, like library conflicts. Only admin users can create ...

0 kudos

06-25-2021 3:57:44 PM

by brickster_2018 • Databricks Employee

06-25-2021 3:57:12 PM

2504 Views
0 replies
1 kudos

Why do I see HangingTaskDetector message in my executor logs

Data Engineering

2504 Views
0 replies
1 kudos

06-25-2021 3:57:12 PM

by MoJaMa • Databricks Employee

06-25-2021 3:56:09 PM

2383 Views
1 replies
0 kudos

Does Databricks support VPC migration for an E2 workspace for both Customer Managed VPC and Databricks Managed VPC?

Data Engineering

2383 Views
1 replies
0 kudos

06-25-2021 3:56:09 PM

View Replies

Latest Reply

MoJaMa
Databricks Employee

06-25-2021 3:56:59 PM

0 kudos

Yes.Currently supported paths are:BYO VPC -> New BYO VPCDatabricks-created VPC -> New Databricks-created VPC

0 kudos

06-25-2021 3:56:59 PM

by brickster_2018 • Databricks Employee

06-25-2021 3:48:30 PM

2231 Views
1 replies
0 kudos

Can I use a custom log4j file for the Databricks cluster

Data Engineering

2231 Views
1 replies
0 kudos

06-25-2021 3:48:30 PM

View Replies

Latest Reply

aladda
Databricks Employee

06-25-2021 3:56:32 PM

0 kudos

See this knowledge base article for details - https://kb.databricks.com/clusters/overwrite-log4j-logs.html

0 kudos

06-25-2021 3:56:32 PM

by brickster_2018 • Databricks Employee

06-25-2021 3:50:46 PM

1509 Views
1 replies
1 kudos

Can I run the Python2 Pyspark application in my Databricks GCP workspace

Data Engineering

1509 Views
1 replies
1 kudos

06-25-2021 3:50:46 PM

View Replies

Latest Reply

aladda
Databricks Employee

06-25-2021 3:55:46 PM

1 kudos

For Databricks Runtime 5.5 LTS, Spark jobs, Python notebook cells, and library installation all support both Python 2 and 3.The default Python version for clusters created using the UI is Python 3. In Databricks Runtime 5.5 LTS the default version fo...

1 kudos

06-25-2021 3:55:46 PM

by brickster_2018 • Databricks Employee

06-25-2021 3:54:52 PM

1126 Views
0 replies
0 kudos

What are real-time audit logs? How is it different from the usual Audit logs?

Data Engineering

1126 Views
0 replies
0 kudos

06-25-2021 3:54:52 PM

by MoJaMa • Databricks Employee

06-25-2021 3:54:16 PM

2015 Views
0 replies
0 kudos

Regarding DBFS encryption using CMK: Is there a way to patch existing workspaces so DBFS can be encrypted using a customer managed key? or is this only for new workspaces?

Data Engineering

2015 Views
0 replies
0 kudos

06-25-2021 3:54:16 PM

by brickster_2018 • Databricks Employee

06-25-2021 3:54:10 PM

2207 Views
0 replies
0 kudos

What are Closure and Serialization in Spark? How to avoid TaskNotSerailizable error

Data Engineering

2207 Views
0 replies
0 kudos

06-25-2021 3:54:10 PM

by brickster_2018 • Databricks Employee

06-25-2021 3:53:26 PM

1425 Views
0 replies
0 kudos

Is there a limit on the output generated in the stdout from a job cluster

Data Engineering

1425 Views
0 replies
0 kudos

06-25-2021 3:53:26 PM

by Satyadeepak • Databricks Employee

06-24-2021 7:21:43 AM

1964 Views
1 replies
1 kudos

In Databricks UI /Workspace and /Repos are in same level but while reading a CSV file in Repos Notebooks why do we need to give the path as /Workspace/Repos...?

Data Engineering

1964 Views
1 replies
1 kudos

06-24-2021 7:21:43 AM

View Replies

Latest Reply

aladda
Databricks Employee

06-25-2021 3:52:43 PM

1 kudos

Can you provide an example of what exactly do you mean? If the reference is to how "Repos" shows up in the UI, that's more for a Ux convenience. Repos as such are designed to be a container for version controlled notebooks that live in the Git reposi...

1 kudos

06-25-2021 3:52:43 PM

by brickster_2018 • Databricks Employee

06-25-2021 3:51:24 PM

1594 Views
0 replies
0 kudos

Is it safe to set ignoreMissingFiles to true on a Streaming workload

Data Engineering

1594 Views
0 replies
0 kudos

06-25-2021 3:51:24 PM

by User16790091296 • Databricks Employee

06-25-2021 3:38:16 PM

3833 Views
2 replies
0 kudos

What is the difference between delta lake (Databricks) and Delta Lake (Open Source -Maven) ?

Data Engineering

3833 Views
2 replies
0 kudos

06-25-2021 3:38:16 PM

View Replies

Latest Reply

aladda
Databricks Employee

06-25-2021 3:51:16 PM

0 kudos

Delta Lake on Databricks has added runtime optimizations of the Delta Engine that further enhance the performance and scale of the open source Delta Format. In additional you also get access to a whole host of capabilities available on the Databricks...

0 kudos

06-25-2021 3:51:16 PM

1 More Replies

by brickster_2018 • Databricks Employee

06-25-2021 3:48:52 PM

1453 Views
0 replies
0 kudos

Can I migrate the internal metastore to an external one

Data Engineering

1453 Views
0 replies
0 kudos

06-25-2021 3:48:52 PM

by MoJaMa • Databricks Employee

06-25-2021 3:46:57 PM

5202 Views
1 replies
1 kudos

Can ADF pass values/info back to ADF from the Notebook?

Data Engineering

5202 Views
1 replies
1 kudos

06-25-2021 3:46:57 PM

View Replies

Latest Reply

MoJaMa
Databricks Employee

06-25-2021 3:47:54 PM

1 kudos

Yes, you can pass parameters from ADF —> Azure Databricks.https://docs.microsoft.com/en-us/azure/data-factory/solution-template-databricks-notebook#how-to-use-this-templateYou can also pass values back from the Notebook --> ADF via the dbutils.notebo...

1 kudos

06-25-2021 3:47:54 PM

by brickster_2018 • Databricks Employee

06-25-2021 3:43:53 PM

2223 Views
1 replies
0 kudos

Maven library conflicts

The import statements work fine with maven libraries freshly installed on a running cluster and on restart of the same cluster, it fails to import the classes.

Data Engineering

2223 Views
1 replies
0 kudos

06-25-2021 3:43:53 PM

View Replies

Latest Reply

brickster_2018
Databricks Employee

06-25-2021 3:44:40 PM

0 kudos

When you install maven libraries one by one in running clusters, it is resolved and gets downloaded individually. The behavior would be different when you restart the clusters since the libraries are resolved together and can run into issues due to c...

0 kudos

06-25-2021 3:44:40 PM

Databricks Community

Forum Posts

What is the advantage of using the new Global init script compared to the legacy init script

Why do I see HangingTaskDetector message in my executor logs

Does Databricks support VPC migration for an E2 workspace for both Customer Managed VPC and Databricks Managed VPC?

Can I use a custom log4j file for the Databricks cluster

Can I run the Python2 Pyspark application in my Databricks GCP workspace

What are real-time audit logs? How is it different from the usual Audit logs?

Regarding DBFS encryption using CMK: Is there a way to patch existing workspaces so DBFS can be encrypted using a customer managed key? or is this only for new workspaces?

What are Closure and Serialization in Spark? How to avoid TaskNotSerailizable error

Is there a limit on the output generated in the stdout from a job cluster

In Databricks UI /Workspace and /Repos are in same level but while reading a CSV file in Repos Notebooks why do we need to give the path as /Workspace/Repos...?

Is it safe to set ignoreMissingFiles to true on a Streaming workload

What is the difference between delta lake (Databricks) and Delta Lake (Open Source -Maven) ?

Can I migrate the internal metastore to an external one

Can ADF pass values/info back to ADF from the Notebook?

Maven library conflicts

Genie code Customization

json file existing in volume but not showing in UI

Soumitra dutta : What are the essential concepts a...

Recurring Historical Data Modeling Patterns

How execute SET spark.sql.sources.partitionOverwri...