Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

User16826987838
by Databricks Employee
  • 2334 Views
  • 1 replies
  • 0 kudos
Latest Reply
brickster_2018
Databricks Employee
  • 0 kudos

Using cluster tags, we can get the cluster name: spark.conf.get("spark.databricks.clusterUsageTags.clusterName")
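For use in a notebook or job, the lookup can be wrapped so it does not fail when the tag is missing. This is a minimal sketch, assuming `spark` is the SparkSession that Databricks notebooks provide; the helper name `get_cluster_name` and the `"unknown"` fallback are hypothetical, only the configuration key comes from the reply.

```python
# Configuration key quoted in the reply above.
CLUSTER_NAME_KEY = "spark.databricks.clusterUsageTags.clusterName"


def get_cluster_name(spark, default="unknown"):
    """Return the cluster name from the usage tags, or `default` if the key is unset."""
    # spark.conf.get(key, default) returns the default instead of raising
    # when the key is missing, for example when running outside Databricks.
    return spark.conf.get(CLUSTER_NAME_KEY, default)
```

In a notebook this would be called as `get_cluster_name(spark)`.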

  • 0 kudos
brickster_2018
by Databricks Employee
  • 1318 Views
  • 1 replies
  • 0 kudos

Resolved! Best practices for DStream application in Databricks

I do not see any best practice guide for DStream applications in the Databricks docs. Are there any references?

Latest Reply
brickster_2018
Databricks Employee
  • 0 kudos

DStream is unsupported by Databricks. Databricks strongly recommends migrating DStream applications to Structured Streaming: https://kb.databricks.com/streaming/dstream-not-supported.html

  • 0 kudos
brickster_2018
by Databricks Employee
  • 1237 Views
  • 1 replies
  • 1 kudos

Optimize Command not performing the bin packing

I have a daily OPTIMIZE job running; however, the number of files in storage is not going down. It looks like OPTIMIZE is not helping to reduce the number of files.

Latest Reply
brickster_2018
Databricks Employee
  • 1 kudos

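The OPTIMIZE command does not physically remove files from storage. It writes new, compacted files and leaves the old ones in place, no longer referenced by the current table version. A VACUUM command has to be executed to delete them.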
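The two-step cleanup can be sketched as follows. This is an illustration rather than an official recipe: the helper name `compact_and_vacuum` and the table name are hypothetical, and the 168-hour default assumes the usual 7-day Delta retention threshold. VACUUM only deletes unreferenced files older than the retention window, so the file count will not drop immediately after OPTIMIZE.

```python
def compact_and_vacuum(spark, table_name, retain_hours=168):
    """Compact small files, then delete files the table no longer references."""
    statements = [
        # Bin-packing: rewrites small files into larger ones.
        # The replaced files stay on storage after this step.
        f"OPTIMIZE {table_name}",
        # Physically deletes unreferenced files older than the retention window.
        f"VACUUM {table_name} RETAIN {int(retain_hours)} HOURS",
    ]
    for statement in statements:
        spark.sql(statement)
    return statements
```

In a daily job this might be called as `compact_and_vacuum(spark, "my_db.events")`, where `my_db.events` stands in for the real table.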

  • 1 kudos
User16790091296
by Databricks Employee
  • 16338 Views
  • 1 replies
  • 0 kudos

How to run multiple Spark streaming applications on a Databricks cluster?

I started working on Databricks. I need to migrate a few streaming jobs from Ambari to Databricks. I deployed one job using a jar and it is working fine. But when I deployed the second job, I faced an error "multiple spark streaming context not allowed". ...

Latest Reply
sajith_appukutt
Databricks Employee
  • 0 kudos

You can run multiple streaming applications in Databricks clusters. By default, these would run in the same fair scheduling pool. To enable multiple streaming queries to execute jobs concurrently and to share the cluster efficiently, you can set the q...
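The reply is cut off in this listing, so what follows is not necessarily what it went on to say. One common way to place Structured Streaming queries in separate scheduler pools is to set the `spark.scheduler.pool` local property before starting each query. A minimal sketch, assuming `spark` is the notebook's SparkSession; the helper `start_in_pool`, the pool names, and the DataFrames in the usage comment are hypothetical.

```python
def start_in_pool(spark, pool_name, start_query):
    """Start a streaming query with its jobs assigned to a named scheduler pool."""
    # The pool is a thread-local property, so set it right before
    # the query is started from this thread.
    spark.sparkContext.setLocalProperty("spark.scheduler.pool", pool_name)
    return start_query()


# Hypothetical usage with two streaming DataFrames, df1 and df2:
# q1 = start_in_pool(spark, "pool1", lambda: df1.writeStream.format("delta").start(path1))
# q2 = start_in_pool(spark, "pool2", lambda: df2.writeStream.format("delta").start(path2))
```

Passing the start call as a function keeps the property assignment and the query start together, so a later query cannot accidentally inherit the previous pool.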

  • 0 kudos
MoJaMa
by Databricks Employee
  • 1330 Views
  • 1 replies
  • 0 kudos
Latest Reply
MoJaMa
Databricks Employee
  • 0 kudos

We still require a single user to be the owner. But you can give a group CAN_MANAGE, which unblocks most of the necessary updates. It is released in all Premium workspaces that have Jobs ACLs. The official OWNER is the user whose identity is used to crea...

  • 0 kudos
User16826992666
by Databricks Employee
  • 1456 Views
  • 0 replies
  • 0 kudos

If I write functionally equivalent code in PySpark and Koalas, will they end up evaluating to the same execution plan?

I am wondering how similar the backend execution of the two APIs is. If I have code that does the same operations written in both styles, is there any functional difference between them when it comes to the execution?

MoJaMa
by Databricks Employee
  • 1561 Views
  • 1 replies
  • 1 kudos
Latest Reply
MoJaMa
Databricks Employee
  • 1 kudos

Only HTTPS is supported right now. If SSH is required for your use case, please let your Databricks Rep know and reference the Idea DB-I-3697 so that it can be prioritized.

  • 1 kudos
MoJaMa
by Databricks Employee
  • 2026 Views
  • 1 replies
  • 0 kudos
Latest Reply
MoJaMa
Databricks Employee
  • 0 kudos

You can clone any repo; the security concern is usually around proprietary code exfiltration, whether intentional or accidental.

  • 0 kudos
MoJaMa
by Databricks Employee
  • 1541 Views
  • 1 replies
  • 0 kudos
Latest Reply
MoJaMa
Databricks Employee
  • 0 kudos

Feature table deletion is a potentially dangerous operation, since downstream consumers of feature tables (models, online stores, jobs, etc.) may break due to the deletion. We might support a safe way to do this in the future. In the meantime, we may be ...

  • 0 kudos
User15787040559
by Databricks Employee
  • 1784 Views
  • 1 replies
  • 1 kudos

How do you find out if the REST API calls are logged anywhere when you update an IP Access List?

In the example response at https://docs.databricks.com/security/network/ip-access-list.html { "ip_access_list": { "list_id": "<list-id>", "label": "office", "ip_addresses": [ "1.1.1.1", "2.2.2.2/21" ], "address_co...

Latest Reply
User16752239289
Databricks Employee
  • 1 kudos

The workspace audit logs should record all workspace configuration changes. Check for the service accountsManager and the action name createWorkspaceConfiguration or updateWorkspaceConfiguration.
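Once the audit logs are delivered, the matching events can be filtered out with a few lines of code. This is a rough sketch over already-parsed log records: the service and action names come from the reply above, while the field names `serviceName` and `actionName` are an assumption about the audit log schema, and the helper name and sample records are made up for illustration.

```python
# Action names quoted in the reply above.
CONF_ACTIONS = {"createWorkspaceConfiguration", "updateWorkspaceConfiguration"}


def workspace_conf_changes(records, service="accountsManager"):
    """Keep audit records that describe workspace configuration changes."""
    return [
        record
        for record in records
        # Assumed schema: each record is a dict with serviceName and actionName.
        if record.get("serviceName") == service
        and record.get("actionName") in CONF_ACTIONS
    ]


# Made-up sample records, for illustration only.
sample_records = [
    {"serviceName": "accountsManager", "actionName": "updateWorkspaceConfiguration"},
    {"serviceName": "accountsManager", "actionName": "login"},
    {"serviceName": "clusters", "actionName": "create"},
]
matches = workspace_conf_changes(sample_records)
```

Here `matches` holds only the first sample record. If the audit logs are loaded into a DataFrame instead, the same two conditions can be used as a filter.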

  • 1 kudos
