Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

brickster_2018
by Databricks Employee
  • 2555 Views
  • 1 reply
  • 1 kudos

Resolved! Cluster logs missing

On the Databricks cluster UI, when I click on the Driver logs, sometimes I see historic logs and sometimes I see logs for the last few hours. Why do we see this inconsistency?

Latest Reply
brickster_2018
Databricks Employee

This is by design and the expected behavior. When the cluster is in a terminated state, the logs are served by the Spark History Server hosted on the Databricks control plane. When the cluster is up and running, the logs are served by ...

  • 1 kudos
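If persistent access to driver logs matters, one related option is shipping them to storage at cluster creation via the Clusters API's cluster_log_conf setting. A minimal sketch, assuming a placeholder workspace URL and token (the dbfs:/cluster-logs path is illustrative):

import requests

payload = {
    "cluster_name": "logs-demo",
    "spark_version": "7.3.x-scala2.12",
    "node_type_id": "i3.xlarge",
    "num_workers": 1,
    # Driver and executor logs are delivered to this path periodically
    # and remain available after the cluster terminates
    "cluster_log_conf": {"dbfs": {"destination": "dbfs:/cluster-logs"}},
}
resp = requests.post(
    "https://<workspace-url>/api/2.0/clusters/create",
    headers={"Authorization": "Bearer <personal-access-token>"},
    json=payload,
)
print(resp.json())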
User16790091296
by Contributor II
  • 3437 Views
  • 2 replies
  • 1 kudos

Database within a Database in Databricks

Is it possible to have a folder or database within a database in Azure Databricks? I know you can use "create database if not exists xxx" to create a database, but I want to have folders within that database where I can put tables.

Latest Reply
brickster_2018
Databricks Employee

The default location of a database will be /user/hive/warehouse/<databasename>.db. Irrespective of the location of the database, the tables in the database can have different locations, and these can be specified at the time of creation. Databas...

  • 1 kudos
1 More Replies
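To make that reply concrete, a minimal sketch (database, table, and path names are illustrative, not from the original thread):

# The database lands in the default warehouse directory:
# /user/hive/warehouse/sales_db.db
spark.sql("CREATE DATABASE IF NOT EXISTS sales_db")

# A table in that database can still live at its own path
spark.sql("""
    CREATE TABLE IF NOT EXISTS sales_db.orders (id INT, amount DOUBLE)
    USING DELTA
    LOCATION '/mnt/data/sales/orders'
""")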
User16790091296
by Contributor II
  • 1750 Views
  • 1 reply
  • 1 kudos
Latest Reply
Ryan_Chynoweth
Esteemed Contributor

The open source Spark connector for Snowflake is available by default in the Databricks Runtime. To connect you can use the following code:

# Use secrets DBUtil to get Snowflake credentials.
user = dbutils.secrets.get("<scope>", "<secret key>")
passw...

  • 1 kudos
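A hedged completion of that truncated snippet, using the documented Snowflake connector options (the scope, secret key names, account, and object names below are placeholders):

user = dbutils.secrets.get("<scope>", "<snowflake-user-key>")
password = dbutils.secrets.get("<scope>", "<snowflake-password-key>")

# Read a Snowflake table into a Spark DataFrame
df = (spark.read
      .format("snowflake")
      .option("sfUrl", "<account>.snowflakecomputing.com")
      .option("sfUser", user)
      .option("sfPassword", password)
      .option("sfDatabase", "<database>")
      .option("sfSchema", "<schema>")
      .option("sfWarehouse", "<warehouse>")
      .option("dbtable", "<table>")
      .load())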
brickster_2018
by Databricks Employee
  • 1610 Views
  • 1 reply
  • 2 kudos

Few things you should not do in Databricks!

Latest Reply
brickster_2018
Databricks Employee

Compared to OSS Spark, these are a few things users don't have to worry about when running the same job on Databricks. Memory management: Databricks uses an internal formula to allocate the driver and executor heap based on the size of the instance....

  • 2 kudos
User16137833804
by Databricks Employee
  • 2144 Views
  • 1 reply
  • 1 kudos
Latest Reply
sajith_appukutt
Honored Contributor II

You could have the single-node cluster where the proxy is installed monitored by a tool like CloudWatch, Azure Monitor, or Datadog, and have it configured to send alerts on node failure.

  • 1 kudos
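As one hedged illustration of that setup, assuming the proxy runs on an EC2 instance and CloudWatch is the chosen tool (the instance ID and SNS topic ARN are placeholders):

import boto3

cloudwatch = boto3.client("cloudwatch")

# Alarm when the instance status check fails for two consecutive minutes
cloudwatch.put_metric_alarm(
    AlarmName="proxy-node-status-check",
    Namespace="AWS/EC2",
    MetricName="StatusCheckFailed",
    Dimensions=[{"Name": "InstanceId", "Value": "<proxy-instance-id>"}],
    Statistic="Maximum",
    Period=60,
    EvaluationPeriods=2,
    Threshold=1,
    ComparisonOperator="GreaterThanOrEqualToThreshold",
    AlarmActions=["<sns-topic-arn>"],  # e.g. a topic that pages the on-call
)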
User16783855534
by New Contributor III
  • 1816 Views
  • 1 reply
  • 1 kudos

Can I have a Databricks Cluster that is only 1 node?

Yes, you can create a "Single Node" cluster: https://docs.databricks.com/clusters/single-node.html. It is currently not recommended to use a "Single Node" cluster for streaming workloads.

Latest Reply
brickster_2018
Databricks Employee

Single Node clusters should not be used for production workloads involving streaming queries or complex computations. The intention here is to bring up the Spark cluster for all kinds of workloads.

  • 1 kudos
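For reference, a minimal sketch of a Single Node cluster spec for the Clusters API, following the single-node docs linked above (the name, runtime version, and instance type are placeholders):

single_node_cluster = {
    "cluster_name": "single-node-demo",
    "spark_version": "7.3.x-scala2.12",
    "node_type_id": "i3.xlarge",
    "num_workers": 0,  # driver only; all computation runs on the one node
    "spark_conf": {
        "spark.databricks.cluster.profile": "singleNode",
        "spark.master": "local[*]",
    },
    "custom_tags": {"ResourceClass": "SingleNode"},
}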
Anonymous
by Not applicable
  • 1914 Views
  • 2 replies
  • 1 kudos

What Databricks Runtime will I have to use if I want to leverage Python 2?

I have some code which is dependent on Python 2. I am not able to use Python 2 with Databricks Runtime 6.0.

Latest Reply
User16826994223
Honored Contributor III

When you create a Databricks Runtime 5.5 LTS cluster by using the workspace UI, the default is Python 3. You have the option to specify Python 2. If you use the Databricks REST API to create a cluster using Databricks Runtime 5.5 LTS, the default is ...

  • 1 kudos
1 More Replies
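A hedged sketch of pinning the Python version explicitly when creating a 5.5 LTS cluster through the REST API (the interpreter paths follow the 5.5 LTS docs; the rest of the spec is illustrative):

cluster_spec = {
    "cluster_name": "py2-legacy",
    "spark_version": "5.5.x-scala2.11",
    "node_type_id": "i3.xlarge",
    "num_workers": 2,
    "spark_env_vars": {
        # Python 2 interpreter on 5.5 LTS; use
        # "/databricks/python3/bin/python3" for Python 3 instead
        "PYSPARK_PYTHON": "/databricks/python/bin/python",
    },
}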
Anonymous
by Not applicable
  • 2814 Views
  • 2 replies
  • 1 kudos
Latest Reply
aladda
Databricks Employee

You can also use tags to set up a chargeback mechanism within your organization for distributed billing - https://docs.databricks.com/administration-guide/account-settings/usage-detail-tags-aws.html

  • 1 kudos
1 More Replies
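To illustrate, a small sketch of cluster tags that flow through to the usage data described in the linked docs (the tag keys and values are examples, not a prescribed scheme):

tagged_cluster = {
    "cluster_name": "team-etl",
    "spark_version": "7.3.x-scala2.12",
    "node_type_id": "i3.xlarge",
    "num_workers": 4,
    # These tags appear in the billable-usage logs, enabling chargeback
    "custom_tags": {
        "CostCenter": "1234",
        "Team": "data-engineering",
    },
}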
User16826994223
by Honored Contributor III
  • 1536 Views
  • 1 reply
  • 1 kudos

Resolved! IDE support in Databricks

Which IDEs are integrated with Databricks as of today?

Latest Reply
User16826994223
Honored Contributor III

Eclipse, IntelliJ, Jupyter, PyCharm, SBT, sparklyr and RStudio Desktop, SparkR and RStudio Desktop, and Visual Studio Code

  • 1 kudos
Anonymous
by Not applicable
  • 1829 Views
  • 1 reply
  • 1 kudos
Latest Reply
sajith_appukutt
Honored Contributor II

MLflow is an open source framework, and you could pip install mlflow on your laptop, for example. https://mlflow.org/docs/latest/quickstart.html

  • 1 kudos
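In the spirit of the linked quickstart, a self-contained sketch of local tracking after pip install mlflow (the parameter and metric names are made up):

import mlflow

# Log a parameter and a metric to a local ./mlruns directory
with mlflow.start_run():
    mlflow.log_param("alpha", 0.5)
    mlflow.log_metric("rmse", 0.78)

# Browse the logged runs afterwards with: mlflow ui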
r_van_niekerk
by Databricks Employee
  • 2900 Views
  • 2 replies
  • 1 kudos

I have a multi-part question about Databricks integration with Splunk

Use Case Background: We have an ongoing SecOps project going live here in 4 weeks. We have set up Splunk to monitor syslog logs and want to integrate this with Delta. Our forwarder collects the data from remote machines, then forwards the data to the inde...

Latest Reply
aladda
Databricks Employee

The Databricks Add-on for Splunk, built as part of Databricks Labs, can be leveraged for Splunk integration. It's a bi-directional framework that allows for in-place querying of data in Databricks from within Splunk by running queries, notebooks or jobs ...

  • 1 kudos
1 More Replies
aladda
by Databricks Employee
  • 12709 Views
  • 1 reply
  • 1 kudos
Latest Reply
aladda
Databricks Employee

The Databricks Add-on for Splunk, built as part of Databricks Labs, can be leveraged for Splunk integration. It's a bi-directional framework that allows for in-place querying of data in Databricks from within Splunk by running queries, notebooks or jobs ...

  • 1 kudos
User16826987838
by Contributor
  • 1401 Views
  • 1 reply
  • 1 kudos
Latest Reply
Mooune_DBU
Valued Contributor

With Koalas, which is a pandas API on top of Spark DataFrames, there should be minimal code changes required. Please refer to this blog for more info.

  • 1 kudos
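A small sketch of that minimal-change claim, assuming a hypothetical data.csv (the pandas line and its Koalas counterpart differ only in the import):

import pandas as pd
import databricks.koalas as ks

pdf = pd.read_csv("data.csv")   # pandas: runs on a single machine
kdf = ks.read_csv("data.csv")   # Koalas: same API, backed by Spark

# Identical downstream code against either DataFrame
print(pdf.groupby("id").count())
print(kdf.groupby("id").count())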
User16826987838
by Contributor
  • 2017 Views
  • 2 replies
  • 1 kudos

Prevent file downloads from /files/ URL

I would like to prevent file downloads via the /files/ URL. For example: https://customer.databricks.com/files/some-file-in-the-filestore.txt
Is there a way to do this?

Latest Reply
Mooune_DBU
Valued Contributor

Unfortunately this is not possible from the platform. You can, however, use an external Web Application Firewall (e.g. Akamai) to filter all web traffic to your workspaces. This can block web access used to download root bucket data.

  • 1 kudos
1 More Replies