Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

brickster_2018
by Databricks Employee
  • 2555 Views
  • 1 reply
  • 1 kudos

Resolved! Cluster logs missing

On the Databricks cluster UI, when I click on the Driver logs, sometimes I see historic logs and sometimes I see logs for the last few hours. Why do we see this inconsistency?

Latest Reply
brickster_2018
Databricks Employee

This is by design and the expected behavior. When the cluster is in a terminated state, the logs are served by the Spark History Server hosted on the Databricks control plane. When the cluster is up and running, the logs are served by ...

  • 1 kudos
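If persistent access to driver logs matters, one related option is shipping them to storage at cluster creation via the Clusters API's cluster_log_conf setting. A minimal sketch, assuming a placeholder workspace URL and token (the dbfs:/cluster-logs path is illustrative):

import requests

payload = {
    "cluster_name": "logs-demo",
    "spark_version": "7.3.x-scala2.12",
    "node_type_id": "i3.xlarge",
    "num_workers": 1,
    # Driver and executor logs are delivered to this path periodically
    # and remain available after the cluster terminates
    "cluster_log_conf": {"dbfs": {"destination": "dbfs:/cluster-logs"}},
}
resp = requests.post(
    "https://<workspace-url>/api/2.0/clusters/create",
    headers={"Authorization": "Bearer <personal-access-token>"},
    json=payload,
)
print(resp.json())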
User16790091296
by Contributor II
  • 3437 Views
  • 2 replies
  • 1 kudos

Database within a Database in Databricks

Is it possible to have a folder or database within a database in Azure Databricks? I know you can use "create database if not exists xxx" to create a database, but I want to have folders within that database where I can put tables.

Latest Reply
brickster_2018
Databricks Employee

The default location of a database will be /user/hive/warehouse/<databasename>.db. Irrespective of the location of the database, the tables in the database can have different locations, and these can be specified at the time of creation. Databas...

  • 1 kudos
1 More Replies
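To make that reply concrete, a minimal sketch (database, table, and path names are illustrative, not from the original thread):

# The database lands in the default warehouse directory:
# /user/hive/warehouse/sales_db.db
spark.sql("CREATE DATABASE IF NOT EXISTS sales_db")

# A table in that database can still live at its own path
spark.sql("""
    CREATE TABLE IF NOT EXISTS sales_db.orders (id INT, amount DOUBLE)
    USING DELTA
    LOCATION '/mnt/data/sales/orders'
""")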
User16790091296
by Contributor II
  • 1750 Views
  • 1 reply
  • 1 kudos
Latest Reply
Ryan_Chynoweth
Esteemed Contributor

The open source Spark connector for Snowflake is available by default in the Databricks Runtime. To connect you can use the following code:

# Use secrets DBUtil to get Snowflake credentials.
user = dbutils.secrets.get("<scope>", "<secret key>")
passw...

  • 1 kudos
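A hedged completion of that truncated snippet, using the documented Snowflake connector options (the scope, secret key names, account, and object names below are placeholders):

user = dbutils.secrets.get("<scope>", "<snowflake-user-key>")
password = dbutils.secrets.get("<scope>", "<snowflake-password-key>")

# Read a Snowflake table into a Spark DataFrame
df = (spark.read
      .format("snowflake")
      .option("sfUrl", "<account>.snowflakecomputing.com")
      .option("sfUser", user)
      .option("sfPassword", password)
      .option("sfDatabase", "<database>")
      .option("sfSchema", "<schema>")
      .option("sfWarehouse", "<warehouse>")
      .option("dbtable", "<table>")
      .load())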
brickster_2018
by Databricks Employee
  • 1610 Views
  • 1 reply
  • 2 kudos

Few things you should not do in Databricks!

Latest Reply
brickster_2018
Databricks Employee

Compared to OSS Spark, these are a few things users don't have to worry about when running the same job on Databricks. Memory management: Databricks uses an internal formula to allocate the driver and executor heap based on the size of the instance....

  • 2 kudos
User16137833804
by Databricks Employee
  • 2144 Views
  • 1 reply
  • 1 kudos
Latest Reply
sajith_appukutt
Honored Contributor II

You could have the single-node cluster where the proxy is installed monitored by a tool like CloudWatch, Azure Monitor, or Datadog, and have it configured to send alerts on node failure.

  • 1 kudos
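As one hedged illustration of that setup, assuming the proxy runs on an EC2 instance and CloudWatch is the chosen tool (the instance ID and SNS topic ARN are placeholders):

import boto3

cloudwatch = boto3.client("cloudwatch")

# Alarm when the instance status check fails for two consecutive minutes
cloudwatch.put_metric_alarm(
    AlarmName="proxy-node-status-check",
    Namespace="AWS/EC2",
    MetricName="StatusCheckFailed",
    Dimensions=[{"Name": "InstanceId", "Value": "<proxy-instance-id>"}],
    Statistic="Maximum",
    Period=60,
    EvaluationPeriods=2,
    Threshold=1,
    ComparisonOperator="GreaterThanOrEqualToThreshold",
    AlarmActions=["<sns-topic-arn>"],  # e.g. a topic that pages the on-call
)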
User16783855534
by New Contributor III
  • 1816 Views
  • 1 reply
  • 1 kudos

Can I have a Databricks Cluster that is only 1 node?

Yes, you can create a "Single Node" cluster: https://docs.databricks.com/clusters/single-node.html. It is currently not recommended to use a "Single Node" cluster for streaming workloads.

Latest Reply
brickster_2018
Databricks Employee

Single Node clusters should not be used for production workloads involving streaming queries or complex computations. The intention here is to bring up the Spark cluster for all kinds of workloads.

  • 1 kudos
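For reference, a minimal sketch of a Single Node cluster spec for the Clusters API, following the single-node docs linked above (the name, runtime version, and instance type are placeholders):

single_node_cluster = {
    "cluster_name": "single-node-demo",
    "spark_version": "7.3.x-scala2.12",
    "node_type_id": "i3.xlarge",
    "num_workers": 0,  # driver only; all computation runs on the one node
    "spark_conf": {
        "spark.databricks.cluster.profile": "singleNode",
        "spark.master": "local[*]",
    },
    "custom_tags": {"ResourceClass": "SingleNode"},
}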
Anonymous
by Not applicable
  • 1914 Views
  • 2 replies
  • 1 kudos

What Databricks Runtime will I have to use if I want to leverage Python 2?

I have some code which is dependent on Python 2. I am not able to use Python 2 with Databricks Runtime 6.0.

Latest Reply
User16826994223
Honored Contributor III

When you create a Databricks Runtime 5.5 LTS cluster by using the workspace UI, the default is Python 3. You have the option to specify Python 2. If you use the Databricks REST API to create a cluster using Databricks Runtime 5.5 LTS, the default is ...

  • 1 kudos
1 More Replies
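A hedged sketch of pinning the Python version explicitly when creating a 5.5 LTS cluster through the REST API (the interpreter paths follow the 5.5 LTS docs; the rest of the spec is illustrative):

cluster_spec = {
    "cluster_name": "py2-legacy",
    "spark_version": "5.5.x-scala2.11",
    "node_type_id": "i3.xlarge",
    "num_workers": 2,
    "spark_env_vars": {
        # Python 2 interpreter on 5.5 LTS; use
        # "/databricks/python3/bin/python3" for Python 3 instead
        "PYSPARK_PYTHON": "/databricks/python/bin/python",
    },
}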
Anonymous
by Not applicable
  • 2814 Views
  • 2 replies
  • 1 kudos
Latest Reply
aladda
Databricks Employee

You can also use tags to set up a chargeback mechanism within your organization for distributed billing - https://docs.databricks.com/administration-guide/account-settings/usage-detail-tags-aws.html

  • 1 kudos
1 More Replies
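To illustrate, a small sketch of cluster tags that flow through to the usage data described in the linked docs (the tag keys and values are examples, not a prescribed scheme):

tagged_cluster = {
    "cluster_name": "team-etl",
    "spark_version": "7.3.x-scala2.12",
    "node_type_id": "i3.xlarge",
    "num_workers": 4,
    # These tags appear in the billable-usage logs, enabling chargeback
    "custom_tags": {
        "CostCenter": "1234",
        "Team": "data-engineering",
    },
}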
User16826994223
by Honored Contributor III
  • 1536 Views
  • 1 reply
  • 1 kudos

Resolved! IDE support in Databricks

Which IDEs are integrated with Databricks as of today?

Latest Reply
User16826994223
Honored Contributor III

Eclipse, IntelliJ, Jupyter, PyCharm, SBT, sparklyr and RStudio Desktop, SparkR and RStudio Desktop, and Visual Studio Code

  • 1 kudos
Anonymous
by Not applicable
  • 1829 Views
  • 1 reply
  • 1 kudos
Latest Reply
sajith_appukutt
Honored Contributor II

MLflow is an open source framework, and you could pip install mlflow on your laptop, for example. https://mlflow.org/docs/latest/quickstart.html

  • 1 kudos
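In the spirit of the linked quickstart, a self-contained sketch of local tracking after pip install mlflow (the parameter and metric names are made up):

import mlflow

# Log a parameter and a metric to a local ./mlruns directory
with mlflow.start_run():
    mlflow.log_param("alpha", 0.5)
    mlflow.log_metric("rmse", 0.78)

# Browse the logged runs afterwards with: mlflow ui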
r_van_niekerk
by Databricks Employee
  • 2900 Views
  • 2 replies
  • 1 kudos

I have a multi-part question about Databricks integration with Splunk

Use Case Background: We have an ongoing SecOps project going live here in 4 weeks. We have set up Splunk to monitor syslog logs and want to integrate this with Delta. Our forwarder collects the data from remote machines, then forwards the data to the inde...

Latest Reply
aladda
Databricks Employee

The Databricks Add-on for Splunk, built as part of Databricks Labs, can be leveraged for Splunk integration. It's a bi-directional framework that allows for in-place querying of data in Databricks from within Splunk by running queries, notebooks or jobs ...

  • 1 kudos
1 More Replies
aladda
by Databricks Employee
  • 12709 Views
  • 1 reply
  • 1 kudos
Latest Reply
aladda
Databricks Employee

The Databricks Add-on for Splunk, built as part of Databricks Labs, can be leveraged for Splunk integration. It's a bi-directional framework that allows for in-place querying of data in Databricks from within Splunk by running queries, notebooks or jobs ...

  • 1 kudos
User16826987838
by Contributor
  • 1401 Views
  • 1 reply
  • 1 kudos
Latest Reply
Mooune_DBU
Valued Contributor

With Koalas, which is a pandas API on top of Spark DataFrames, there should be minimal code changes required. Please refer to this blog for more info.

  • 1 kudos
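A small sketch of that minimal-change claim, assuming a hypothetical data.csv (the pandas line and its Koalas counterpart differ only in the import):

import pandas as pd
import databricks.koalas as ks

pdf = pd.read_csv("data.csv")   # pandas: runs on a single machine
kdf = ks.read_csv("data.csv")   # Koalas: same API, backed by Spark

# Identical downstream code against either DataFrame
print(pdf.groupby("id").count())
print(kdf.groupby("id").count())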
User16826987838
by Contributor
  • 2017 Views
  • 2 replies
  • 1 kudos

Prevent file downloads from /files/ URL

I would like to prevent file downloads via the /files/ URL. For example: https://customer.databricks.com/files/some-file-in-the-filestore.txt
Is there a way to do this?

Latest Reply
Mooune_DBU
Valued Contributor

Unfortunately this is not possible from the platform. You can, however, use an external Web Application Firewall (e.g. Akamai) to filter all web traffic to your workspaces. This can block web access used to download root bucket data.

  • 1 kudos
1 More Replies