Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

MoJaMa
by Databricks Employee
  • 1222 Views
  • 1 reply
  • 0 kudos
Latest Reply
MoJaMa
Databricks Employee
  • 0 kudos

That’s only available on the Premium and Enterprise SKUs in AWS. See the "Enterprise Security" section here: https://databricks.com/product/aws-pricing

User16783853501
by Databricks Employee
  • 2156 Views
  • 1 reply
  • 0 kudos

What types of files does Auto Loader support for streaming ingestion? I see good support for CSV and JSON; how can I ingest files like XML, Avro, Parquet, etc.? Would XML rely on Spark-XML?

Latest Reply
sajith_appukutt
Honored Contributor II
  • 0 kudos

Please raise a feature request via the ideas portal for XML support in Auto Loader. As a workaround, you could look at reading the files with wholeTextFiles (which loads the data into a PairRDD with one record per input file) and parsing them with from_xml from ...
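A minimal sketch of that workaround, assuming a hypothetical /mnt/raw/xml/ landing folder and a simple <record> element schema; it parses each file with Python's xml.etree rather than from_xml, but the wholeTextFiles step is the same:

```python
import xml.etree.ElementTree as ET

# (path, full file content) pairs, one record per input file
raw = spark.sparkContext.wholeTextFiles("/mnt/raw/xml/*.xml")

def parse(content):
    root = ET.fromstring(content)
    return [(rec.findtext("id"), rec.findtext("name")) for rec in root.iter("record")]

# flatten the per-file record lists into one DataFrame
df = raw.flatMap(lambda kv: parse(kv[1])).toDF(["id", "name"])
df.show()
```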

User16790091296
by Contributor II
  • 2508 Views
  • 1 reply
  • 1 kudos

Using Databricks Connect (DBConnect)

I'd like to edit Databricks notebooks locally using my favorite editor, and then use Databricks Connect to run the notebook remotely on a Databricks cluster that I usually access via the web interface. I run "databricks-connect configure", as suggest...

Latest Reply
sajith_appukutt
Honored Contributor II
  • 1 kudos

Here is the link to the configuration properties: https://docs.databricks.com/dev-tools/databricks-connect.html#step-2-configure-connection-properties
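Once the connection properties are in place, a quick sanity check from a local Python session confirms the remote cluster is reachable (the client also ships a databricks-connect test command for this):

```python
from pyspark.sql import SparkSession

# Databricks Connect routes this session to the remote cluster
# configured by `databricks-connect configure`
spark = SparkSession.builder.getOrCreate()
print(spark.range(10).count())  # prints 10 if the cluster is reachable
```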

User16790091296
by Contributor II
  • 9282 Views
  • 1 reply
  • 0 kudos

Azure Databricks: How to add Spark configuration in Databricks cluster?

I am using a Spark Databricks cluster and want to add a customized Spark configuration. There is Databricks documentation on this, but I am not getting any clue about how and what changes I should make. Can someone please share an example to configure the Da...

Latest Reply
brickster_2018
Databricks Employee
  • 0 kudos

You can set the configurations on the Databricks cluster UI: https://docs.databricks.com/clusters/configure.html#spark-configuration
To see the default configuration, run the following code in a notebook:
%sql set;
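Cluster-level settings go in the Spark config box of that UI, one key-value pair per line. Session-level settings can also be changed from a notebook; the key and value below are just examples:

```python
# set and read back a session-level Spark configuration value
spark.conf.set("spark.sql.shuffle.partitions", "100")
print(spark.conf.get("spark.sql.shuffle.partitions"))
```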

User16790091296
by Contributor II
  • 10945 Views
  • 1 reply
  • 0 kudos

How to list the notebooks in a workspace - Databricks?

I want to list the notebooks in a folder in Databricks. I tried to use utilities like dbutils.fs.ls("/path") -> it shows the path of the storage folder. I also tried dbutils.notebook.help() - nothing useful. Let's say there is a fol...

Latest Reply
brickster_2018
Databricks Employee
  • 0 kudos

Notebooks are not stored in DBFS, so they cannot be listed directly from the file system. You should use the Databricks REST API to list them and get the details: https://docs.databricks.com/dev-tools/api/latest/workspace.html#list
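A sketch of that API call; the host, token, and folder path are placeholders you must supply:

```python
import requests

host = "https://<your-workspace>.cloud.databricks.com"  # placeholder
token = "<personal-access-token>"                        # placeholder

resp = requests.get(
    f"{host}/api/2.0/workspace/list",
    headers={"Authorization": f"Bearer {token}"},
    params={"path": "/Users/someone@example.com/my-folder"},  # folder to list
)
resp.raise_for_status()
for obj in resp.json().get("objects", []):
    if obj.get("object_type") == "NOTEBOOK":
        print(obj["path"])
```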

User16826992666
by Valued Contributor
  • 2406 Views
  • 1 reply
  • 0 kudos
Latest Reply
brickster_2018
Databricks Employee
  • 0 kudos

To time travel to a particular version, it's necessary to have the JSON file for that particular version. The JSON files in the _delta_log have a default retention of 30 days, so by default we can time travel only up to 30 days back. The retention of the D...
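For reference, reading an older version of a Delta table looks like this; the path and version number are hypothetical:

```python
# version-based time travel; timestampAsOf works the same way
df = (spark.read.format("delta")
      .option("versionAsOf", 5)
      .load("/mnt/delta/events"))
```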

User16826992666
by Valued Contributor
  • 5329 Views
  • 1 reply
  • 0 kudos

How do I choose which column to partition by?

I am in the process of building my data pipeline, but I am unsure of how to choose which fields in my data I should use for partitioning. What should I be considering when choosing a partitioning strategy?

Latest Reply
brickster_2018
Databricks Employee
  • 0 kudos

The important factors when deciding on partition columns are:
  • Even distribution of data.
  • Choose a column that is commonly or widely accessed or queried.
  • Do not create multiple levels of partitioning, as you can end up with a large number of small files.
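For example, writing a Delta table partitioned by one frequently filtered, evenly distributed column; the DataFrame, column, and path are hypothetical:

```python
(df.write.format("delta")
   .partitionBy("event_date")  # a single level of partitioning
   .mode("overwrite")
   .save("/mnt/delta/events"))
```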

User16826992666
by Valued Contributor
  • 1943 Views
  • 1 reply
  • 0 kudos

If I delete a table through the UI, does it also delete the underlying files?

I am using the UI in the workspace. I can use the Data tab to see my tables, then use the delete option through the UI. But I know there are underlying files that contain the table's data. Are these files also being deleted?

Latest Reply
brickster_2018
Databricks Employee
  • 0 kudos

If the table is external, the files are not deleted. For a managed table, the underlying files get deleted. Essentially, a "DROP TABLE" command is submitted under the hood.
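To illustrate the difference; the table names and S3 location are hypothetical:

```python
spark.sql("CREATE TABLE managed_demo (id INT)")
spark.sql("CREATE TABLE external_demo (id INT) LOCATION 's3://my-bucket/external_demo'")

spark.sql("DROP TABLE managed_demo")   # removes metadata AND the underlying files
spark.sql("DROP TABLE external_demo")  # removes metadata only; files at LOCATION remain
```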

Srikanth_Gupta_
by Databricks Employee
  • 1951 Views
  • 1 reply
  • 0 kudos
Latest Reply
Srikanth_Gupta_
Databricks Employee
  • 0 kudos

Presto and Athena support reading from external tables using a manifest file, which is a text file containing the list of data files to read when querying a table. This doc explains how to generate the manifest file: https://docs.databricks.com/delta/presto-...
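Generating the manifest with the Delta Lake Python API looks like this; the table path is hypothetical:

```python
from delta.tables import DeltaTable

delta_table = DeltaTable.forPath(spark, "s3://my-bucket/delta/events")
delta_table.generate("symlink_format_manifest")  # writes the manifest alongside the table
```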

User16790091296
by Contributor II
  • 4825 Views
  • 1 reply
  • 0 kudos
Latest Reply
sajith_appukutt
Honored Contributor II
  • 0 kudos

Partitioning is a way of distributing the data by keys so that you can restrict the amount of data scanned by each query and improve performance / avoid conflicts. General rules of thumb for choosing the right partition columns: Cardinality of a colu...
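Since cardinality is the first rule of thumb, a quick check before committing to a partition column helps; the DataFrame and column name are hypothetical:

```python
# very high cardinality usually means too many small files
distinct_count = df.select("country").distinct().count()
print(f"{distinct_count} distinct values in candidate partition column")
```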

Joseph_B
by Databricks Employee
  • 2531 Views
  • 1 reply
  • 0 kudos

How can I use Databricks to "automagically" distribute scikit-learn model training?

Is there a way to automatically distribute training and model tuning across a Spark cluster, if I want to keep using scikit-learn?

Latest Reply
Joseph_B
Databricks Employee
  • 0 kudos

It depends on what you mean by "automagically." If you want to keep using scikit-learn, there are ways to distribute parts of training and tuning with minimal effort. However, there is no "magic" way to distribute training of an individual model in scik...
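One such minimal-effort route is distributing the hyperparameter search across the cluster, sketched here under the assumption that the joblibspark package is installed:

```python
from joblib import parallel_backend
from joblibspark import register_spark
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

register_spark()  # register the "spark" joblib backend

X, y = load_iris(return_X_y=True)
search = GridSearchCV(SVC(), {"C": [0.1, 1, 10], "gamma": [0.01, 0.1]}, cv=3)

with parallel_backend("spark", n_jobs=6):
    search.fit(X, y)  # each cross-validation fit runs as a Spark task

print(search.best_params_)
```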

User16790091296
by Contributor II
  • 2185 Views
  • 1 reply
  • 0 kudos

How to read a Databricks table via the Databricks API in Python?

Using Python 3, I am trying to compare an Excel (xlsx) sheet to an identical Spark table in Databricks. I want to avoid doing the comparison in Databricks, so I am looking for a way to read the Spark table via the Databricks API. Is this possible? How c...

Latest Reply
sajith_appukutt
Honored Contributor II
  • 0 kudos

What is the format of the table? If it is Delta, you could use the Python bindings for the native Rust API, read the table from your Python code, and do the comparison there, bypassing the metastore.
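A sketch, assuming the reply refers to the deltalake package (the Python bindings for the delta-rs Rust implementation); the table path and file name are hypothetical:

```python
import pandas as pd
from deltalake import DeltaTable

# read the Delta table directly from storage, no metastore involved
table_df = DeltaTable("s3://my-bucket/delta/events").to_pandas()
excel_df = pd.read_excel("local_copy.xlsx")

# naive comparison: same columns and same sorted contents
cols = sorted(table_df.columns)
same = (table_df[cols].sort_values(cols).reset_index(drop=True)
        .equals(excel_df[cols].sort_values(cols).reset_index(drop=True)))
print(same)
```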

