Data Engineering

Forum Posts

Sorted by:

by enichante • New Contributor

02-15-2022 9:09:29 PM

1497 Views
4 replies
5 kudos

Resolved! Databricks: Report on SQL queries that are being executed

We have a SQL workspace with a cluster running that services a number of self service reports against a range of datasets. We want to be able to analyse and report on the queries our self service users are executing so we can get better visibility of...

Data Engineering

1497 Views
4 replies
5 kudos

02-15-2022 9:09:29 PM

View Replies

Latest Reply

Anonymous
Not applicable

02-22-2022 2:34:46 PM

5 kudos

Looks like the people have spoken: API is your best option! (thanks @Werner Stinckens @Chris Grabiel and @Bilal Aslam !) @eni chante Let us know if you have questions about the API! If not, please mark one of the replies above as the "best answ...

5 kudos

02-22-2022 2:34:46 PM

3 More Replies

by Juniper_AIML • New Contributor

02-15-2022 8:50:49 AM

1562 Views
2 replies
1 kudos

Resolved! How to setup Instance profile for initializing Databricks Cluster using Docker?

I was trying to start of the Databricks cluster through a docker image. I followed the setup instruction. Excluding the additional setup to setup the IAM role and instance profile as I was facing issues.The image is stored on AWS ECR in a public repo...

Data Engineering

1562 Views
2 replies
1 kudos

02-15-2022 8:50:49 AM

View Replies

Latest Reply

Kaniz
Community Manager

02-22-2022 1:02:41 PM

1 kudos

Hi @Aman Gaurav , Please check the below requirements to avail the Databricks Container Services.Note :-Databricks Runtime for Machine Learning and Databricks Runtime for Genomics does not support Databricks Container Services.Databricks Runtime 6.1...

1 kudos

02-22-2022 1:02:41 PM

1 More Replies

by alejandrofm • Valued Contributor

02-15-2022 11:26:04 AM

1571 Views
2 replies
0 kudos

Resolved! Can't enable CLI 2.1 on CI

Hi! this is my CI configuration, I added the databricks jobs configure --version=2.1 command but it stills showing this error, any idea of what can I be doing wrong?Error:Resetting Databricks Job with job_id 1036...WARN: Your CLI is configured to use...

Data Engineering

1571 Views
2 replies
0 kudos

02-15-2022 11:26:04 AM

View Replies

Latest Reply

Kaniz
Community Manager

02-22-2022 12:45:55 PM

0 kudos

Hi @Alejandro Martinez , To set up and use the Databricks jobs CLI (and the job runs CLI) to call the Jobs REST API 2.1, Update the CLI to version 0.16.0 or above.Run pip install databricks-cli --upgrade using the appropriate version of pip for your...

0 kudos

02-22-2022 12:45:55 PM

1 More Replies

by SailajaB • Valued Contributor III

02-08-2022 5:07:00 AM

2857 Views
4 replies
5 kudos

Resolved! Ways to validate final Dataframe schema against JSON schema config file

Hi Team,We have to validate transformed dataframe output schema with json schema config file.Here is the scenario Our input json schema and target json schema are different. Using Databricks we are doing the required schema changes. Now, we need to v...

Data Engineering

2857 Views
4 replies
5 kudos

02-08-2022 5:07:00 AM

View Replies

Latest Reply

Anonymous
Not applicable

02-09-2022 7:57:39 AM

5 kudos

@Sailaja B - Hi! My name is Piper, and I'm a moderator for the community. Thanks for your question. Please let us know how things go. If @welder martins' response answers your question, would you be happy to come back and mark their answer as best?...

5 kudos

02-09-2022 7:57:39 AM

3 More Replies

by cristianc • Contributor

02-21-2022 5:45:10 AM

2327 Views
3 replies
2 kudos

Resolved! Is VACUUM operation recorded in the history of the delta table?

Greetings,I have tried using Spark with DBR 9.1 LTS to run VACUUM on my delta table then DESCRIBE HISTORY to see the operation, but apparently the VACUUM operation was not in the history despite the things stated in the documentation from: https://do...

Data Engineering

2327 Views
3 replies
2 kudos

02-21-2022 5:45:10 AM

View Replies

Latest Reply

cristianc
Contributor

02-22-2022 12:44:46 AM

2 kudos

That makes sense, thanks for the reply!

2 kudos

02-22-2022 12:44:46 AM

2 More Replies

by adnanzak • New Contributor II

02-20-2022 5:43:29 PM

1779 Views
3 replies
0 kudos

Resolved! Deploy Databricks Machine Learing Models On Power BI

Hi Guys. I've implemented a Machine Learning model on Databricks and have registered it with a Model URL. I wanted to enquire if I could use this model on Power BI. Basically the model predicts industries based on client demographics. Ideally I would...

Data Engineering

1779 Views
3 replies
0 kudos

02-20-2022 5:43:29 PM

View Replies

Latest Reply

adnanzak
New Contributor II

02-21-2022 2:36:20 PM

0 kudos

Thank you @Werner Stinckens and @Joseph Kambourakis for your replies.

0 kudos

02-21-2022 2:36:20 PM

2 More Replies

by Anonymous • Not applicable

02-10-2022 4:12:03 AM

1230 Views
2 replies
3 kudos

Issue in creating workspace - Custom AWS Configuration

We have tried to create new workspace using "Custom AWS Configuration" and we have given our own VPC (Customer managed VPC) and tried but workspace failed to launch. We are getting below error which couldn't understand where the issue is in.Workspace...

Data Engineering

1230 Views
2 replies
3 kudos

02-10-2022 4:12:03 AM

View Replies

Latest Reply

Mitesh_Patel
New Contributor III

02-20-2022 9:03:25 PM

3 kudos

I'm also getting the same issue. I'm trying to create a E2 workspace using Terraform with Customer-managed VPC in us-east-1 (using private subnets for 1a and 1b). We have 1 network rule attached to our subnets that looks like this: Similar question ...

3 kudos

02-20-2022 9:03:25 PM

1 More Replies

by BasavarajAngadi • Contributor

02-13-2022 8:43:11 PM

1123 Views
7 replies
9 kudos

Resolved! Hi Experts , I am new to databricks. I want to know how to copy pyspark data into databricks SQL analytics ?

If we use two different clusters one for pyspark code for transformation and one for SQL analytics . how to make permenant tables derived from pyspark code make available for running queries in databricks SQL analytics

Data Engineering

1123 Views
7 replies
9 kudos

02-13-2022 8:43:11 PM

View Replies

Latest Reply

BasavarajAngadi
Contributor

02-18-2022 8:03:34 AM

9 kudos

@Aman Sehgal Can we write data from data engineering workspace to SQL end point in databricks?

9 kudos

02-18-2022 8:03:34 AM

6 More Replies

by sudhanshu1 • New Contributor III

02-16-2022 2:33:02 PM

2156 Views
2 replies
1 kudos

Resolved! Query to know all tables and columns name in delta lake

Hi all,Does anyone know how to write simple SQL query to get all tables and columns name. In oracle we do ,select * from all tab columns. Similarly in SQL server we do select * from information schema . columns.Do we have something like this in dat...

Data Engineering

2156 Views
2 replies
1 kudos

02-16-2022 2:33:02 PM

View Replies

Latest Reply

Kaniz
Community Manager

02-18-2022 4:59:32 AM

1 kudos

Hi @SUDHANSHU RAJ , Using Databricks, you do not get such a simplistic set of objects. What you have instead is:SHOW DATABASES command for viewing all databases/schemasSHOW TABLES command for viewing all tables within a databaseSHOW COLUMNS command ...

1 kudos

02-18-2022 4:59:32 AM

1 More Replies

by alejandrofm • Valued Contributor

02-12-2022 1:30:44 PM

1163 Views
3 replies
1 kudos

Resolved! Recommendations to execute OPTIMIZE on tables

Hi, have Databricks running on AWS, I'm looking for a way to know when is a good time to run optimize on partitioned tables. Taking into account that it's an expensive process, especially on big tables, how could I know if it's a good time to run it ...

Data Engineering

1163 Views
3 replies
1 kudos

02-12-2022 1:30:44 PM

View Replies

Latest Reply

Anonymous
Not applicable

02-17-2022 8:24:19 AM

1 kudos

@Alejandro Martinez - If Jose's answer resolved your question, would you be happy to mark his answer as best? That helps other members find the answer more quickly.

1 kudos

02-17-2022 8:24:19 AM

2 More Replies

by BenzDriver • New Contributor II

02-17-2022 8:06:08 AM

1246 Views
2 replies
1 kudos

Resolved! SQL command FSCK is not found

Hello there,I currently have the problem of deleted files still being in the transaction log when trying to call a delta table. What I found was this statement:%sql FSCK REPAIR TABLE table_name [DRY RUN]But using it returned following error:Error in ...

Data Engineering

1246 Views
2 replies
1 kudos

02-17-2022 8:06:08 AM

View Replies

Latest Reply

RKNutalapati
Valued Contributor

02-17-2022 8:14:37 AM

1 kudos

Remove square brackets and try executing the command%sqlFSCK REPAIR TABLE table_name DRY RUN

1 kudos

02-17-2022 8:14:37 AM

1 More Replies

by qyu • New Contributor II

02-11-2022 10:21:13 AM

7845 Views
4 replies
3 kudos

Resolved! Need help with this python import error.

I am using databricks runtime 9.1 LTS ML and I got this error when I tried to import Scikit Learn package. I got the following error message:TypeError Traceback (most recent call last) <command-181041> in <module> ...

Data Engineering

7845 Views
4 replies
3 kudos

02-11-2022 10:21:13 AM

View Replies

Latest Reply

qyu
New Contributor II

02-14-2022 3:43:17 PM

3 kudos

@Atanu Sarkar I am using databricks runtime 9.1ML LTS and python version is 3.8.10I am only just running import statementfrom sklearn.metrics import * from sklearn.preprocessing import LabelEncoder

3 kudos

02-14-2022 3:43:17 PM

3 More Replies

by RK_AV • New Contributor III

02-09-2022 12:05:30 PM

717 Views
2 replies
1 kudos

Resolved! Databricks to Databricks connection

How do you connect to Azure Databricks instance from another Databricks instance? I needed to access (database) Views created in a Databricks instance from a Pyspark notebook running in another Databricks instance. Appreciate if anyone has any sample...

Data Engineering

717 Views
2 replies
1 kudos

02-09-2022 12:05:30 PM

View Replies

Latest Reply

Kaniz
Community Manager

02-17-2022 5:15:57 AM

1 kudos

Hi @Venkata Ramakrishna Alvakonda , The two ways of executing a notebook within another notebook in Databricks are:-Method #[1: %run command] The first and the most straightforward way of executing another notebook is by using the %run command. Ex...

1 kudos

02-17-2022 5:15:57 AM

1 More Replies

by Databricks_Venk • New Contributor

02-09-2022 2:39:27 PM

5804 Views
2 replies
2 kudos

Resolved! Error while mounting to ADLS Gen2 - Error: Secret does not exist with scope "*" and Key : "**"

Data Engineering

5804 Views
2 replies
2 kudos

02-09-2022 2:39:27 PM

View Replies

Latest Reply

Kaniz
Community Manager

02-17-2022 4:40:12 AM

2 kudos

Hi @Venkata Vadapalli , You may follow the below steps to create a mount point using Azure Key-vault.You should have the following information:• Client ID (a.k.a. Application ID) => Key Name as ClientID = 06exxxxxxxxxxd60ef• Client Secret (a.k.a. Ap...

2 kudos

02-17-2022 4:40:12 AM

1 More Replies

by danielveraec • New Contributor III

02-14-2022 7:55:08 AM

5705 Views
3 replies
1 kudos

Resolved! Error writing a partitioned Delta Table from a multitasking job in azure databricks

I have a notebook that writes a delta table with a statement similar to the following:match = "current.country = updates.country and current.process_date = updates.process_date" deltaTable = DeltaTable.forPath(spark, silver_path) deltaTable.alias("cu...

Data Engineering

5705 Views
3 replies
1 kudos

02-14-2022 7:55:08 AM

View Replies

Latest Reply

danielveraec
New Contributor III

02-16-2022 11:20:26 AM

1 kudos

Initially, the affected table only had a date field as partition. So I partitioned it with country and date fields. This new partition created the country and date directories however the old directories of the date partition remained and were not de...

1 kudos

02-16-2022 11:20:26 AM

2 More Replies

User

Count

1601

736

343

284

246

Databricks

Forum Posts

Resolved! Databricks: Report on SQL queries that are being executed

Resolved! How to setup Instance profile for initializing Databricks Cluster using Docker?

Resolved! Can't enable CLI 2.1 on CI

Resolved! Ways to validate final Dataframe schema against JSON schema config file

Resolved! Is VACUUM operation recorded in the history of the delta table?

Resolved! Deploy Databricks Machine Learing Models On Power BI

Issue in creating workspace - Custom AWS Configuration

Resolved! Hi Experts , I am new to databricks. I want to know how to copy pyspark data into databricks SQL analytics ?

Resolved! Query to know all tables and columns name in delta lake

Resolved! Recommendations to execute OPTIMIZE on tables

Resolved! SQL command FSCK is not found

Resolved! Need help with this python import error.

Resolved! Databricks to Databricks connection

Resolved! Error while mounting to ADLS Gen2 - Error: Secret does not exist with scope "*" and Key : "**"

Resolved! Error writing a partitioned Delta Table from a multitasking job in azure databricks

DELTA_EXCEED_CHAR_VARCHAR_LIMIT

Not able to set run_as service_principal_name

Pyspark operations slowness in CLuster 14.3LTS as ...

[Databricks Assets Bundles] Workflow trigger on fi...

Addressing Pipeline Error Handling in Databricks b...

Databricks

Resolved! Databricks: Report on SQL queries that are being executed

Resolved! How to setup Instance profile for initializing Databricks Cluster using Docker?

Resolved! Can't enable CLI 2.1 on CI

Resolved! Ways to validate final Dataframe schema against JSON schema config file

Resolved! Is VACUUM operation recorded in the history of the delta table?

Resolved! Deploy Databricks Machine Learing Models On Power BI

Issue in creating workspace - Custom AWS Configuration

Resolved! Hi Experts , I am new to databricks. I want to know how to copy pyspark data into databricks SQL analytics ?

Resolved! Query to know all tables and columns name in delta lake

Resolved! Recommendations to execute OPTIMIZE on tables

Resolved! SQL command FSCK is not found

Resolved! Need help with this python import error.

Resolved! Databricks to Databricks connection

Resolved! Error while mounting to ADLS Gen2 - Error: Secret does not exist with scope "*****" and Key : "******"

Resolved! Error writing a partitioned Delta Table from a multitasking job in azure databricks

DELTA_EXCEED_CHAR_VARCHAR_LIMIT

Not able to set run_as service_principal_name

Pyspark operations slowness in CLuster 14.3LTS as ...

[Databricks Assets Bundles] Workflow trigger on fi...

Addressing Pipeline Error Handling in Databricks b...

Resolved! Error while mounting to ADLS Gen2 - Error: Secret does not exist with scope "*" and Key : "**"