We have a SQL workspace with a running cluster that serves a number of self-service reports against a range of datasets. We want to be able to analyse and report on the queries our self-service users are executing so we can get better visibility of...
Looks like the people have spoken: API is your best option! (thanks @Werner Stinckens​ @Chris Grabiel​ and @Bilal Aslam​ !) @eni chante​ Let us know if you have questions about the API! If not, please mark one of the replies above as the "best answ...
I was trying to start a Databricks cluster from a Docker image. I followed the setup instructions, excluding the additional steps to set up the IAM role and instance profile, as I was facing issues there. The image is stored in a public repo on AWS ECR...
Hi @Aman Gaurav​ , Please check the requirements below to use Databricks Container Services. Note: Databricks Runtime for Machine Learning and Databricks Runtime for Genomics do not support Databricks Container Services. Databricks Runtime 6.1...
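As a sketch of how a custom container is referenced once the requirements are met: the Clusters API accepts a `docker_image` block in the cluster spec. The helper below only assembles that JSON payload; the cluster name, runtime version, and node type are placeholder assumptions, and the image URL would be your ECR URI:

```python
import json

def build_cluster_spec(image_url: str) -> dict:
    """Assemble a Clusters API payload that runs a custom Docker image.

    The spark_version must be a runtime that supports Databricks Container
    Services (not the ML or Genomics runtimes).
    """
    return {
        "cluster_name": "container-demo",      # placeholder name
        "spark_version": "9.1.x-scala2.12",    # placeholder standard (non-ML) runtime
        "node_type_id": "i3.xlarge",           # placeholder AWS node type
        "num_workers": 1,
        "docker_image": {"url": image_url},    # e.g. an ECR image URI
    }

spec = build_cluster_spec("123456789012.dkr.ecr.us-west-2.amazonaws.com/demo:latest")
payload = json.dumps(spec)  # body for a cluster-create request
```

For a public ECR repo, the image URL alone is enough; private registries additionally need credentials alongside the URL.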
Hi! This is my CI configuration. I added the databricks jobs configure --version=2.1 command but it still shows this error; any idea what I might be doing wrong? Error: Resetting Databricks Job with job_id 1036...WARN: Your CLI is configured to use...
Hi @Alejandro Martinez​ , To set up and use the Databricks jobs CLI (and the job runs CLI) to call the Jobs REST API 2.1, update the CLI to version 0.16.0 or above. Run pip install databricks-cli --upgrade using the appropriate version of pip for your...
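The upgrade and configuration steps above can be sketched as follows, assuming `pip` points at the same Python environment your CLI runs from (e.g. the one used in the CI image):

```shell
# Upgrade the legacy Databricks CLI to 0.16.0+ so it supports Jobs API 2.1
pip install --upgrade databricks-cli

# Confirm the installed version is at least 0.16.0
databricks --version

# Point the jobs CLI (and job runs CLI) at Jobs API 2.1
databricks jobs configure --version=2.1
```

In CI, the configure step must run in the same job/step environment as the later `databricks jobs` calls, otherwise the setting will not be picked up.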
Hi Team, We have to validate a transformed DataFrame's output schema against a JSON schema config file. Here is the scenario: our input JSON schema and target JSON schema are different. Using Databricks we are applying the required schema changes. Now, we need to v...
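One way to sketch this kind of check, assuming the target schema config is stored as Spark-style schema JSON (the format `df.schema.json()` produces), is to compare the two schemas field by field in plain Python; the helper name below is illustrative:

```python
import json

def schema_mismatches(actual_schema_json: str, expected_schema_json: str) -> list:
    """Compare two Spark-style schema JSON strings field by field.

    Returns a list of human-readable differences; an empty list means the
    transformed schema matches the config file.
    """
    actual = {f["name"]: f["type"] for f in json.loads(actual_schema_json)["fields"]}
    expected = {f["name"]: f["type"] for f in json.loads(expected_schema_json)["fields"]}

    problems = []
    for name, dtype in expected.items():
        if name not in actual:
            problems.append(f"missing column: {name}")
        elif actual[name] != dtype:
            problems.append(f"type mismatch on {name}: {actual[name]} != {dtype}")
    for name in actual:
        if name not in expected:
            problems.append(f"unexpected column: {name}")
    return problems
```

In a notebook you would pass `df.schema.json()` as the actual schema and the contents of the config file as the expected one, and fail the pipeline if the returned list is non-empty.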
@Sailaja B​ - Hi! My name is Piper, and I'm a moderator for the community. Thanks for your question. Please let us know how things go. If @welder martins​' response answers your question, would you be happy to come back and mark their answer as best?...
Greetings, I have tried using Spark with DBR 9.1 LTS to run VACUUM on my Delta table and then DESCRIBE HISTORY to see the operation, but apparently the VACUUM operation was not in the history, despite what is stated in the documentation at: https://do...
Hi Guys. I've implemented a Machine Learning model on Databricks and have registered it with a Model URL. I wanted to enquire if I could use this model on Power BI. Basically the model predicts industries based on client demographics. Ideally I would...
We have tried to create a new workspace using "Custom AWS Configuration" with our own customer-managed VPC, but the workspace failed to launch. We are getting the error below and couldn't understand where the issue is. Workspace...
I'm also getting the same issue. I'm trying to create an E2 workspace using Terraform with a customer-managed VPC in us-east-1 (using private subnets in 1a and 1b). We have one network rule attached to our subnets that looks like this: Similar question ...
If we use two different clusters, one for PySpark transformation code and one for SQL analytics, how do we make the permanent tables derived from the PySpark code available for running queries in Databricks SQL analytics?
Hi all, does anyone know how to write a simple SQL query to get all table and column names? In Oracle we do select * from all_tab_columns. Similarly, in SQL Server we do select * from information_schema.columns. Do we have something like this in Dat...
Hi @SUDHANSHU RAJ​ , Using Databricks, you do not get such a simple set of objects. What you have instead is:
SHOW DATABASES - for viewing all databases/schemas
SHOW TABLES - for viewing all tables within a database
SHOW COLUMNS ...
Hi, we have Databricks running on AWS, and I'm looking for a way to know when is a good time to run OPTIMIZE on partitioned tables. Taking into account that it's an expensive process, especially on big tables, how could I know if it's a good time to run it ...
@Alejandro Martinez​ - If Jose's answer resolved your question, would you be happy to mark his answer as best? That helps other members find the answer more quickly.
Hello there, I currently have the problem of deleted files still being in the transaction log when trying to query a Delta table. What I found was this statement:
%sql
FSCK REPAIR TABLE table_name [DRY RUN]
But using it returned the following error: Error in ...
I am using Databricks Runtime 9.1 LTS ML and I got this error when I tried to import the scikit-learn package. I got the following error message:
TypeError Traceback (most recent call last)
<command-181041> in <module>
...
@Atanu Sarkar​ I am using Databricks Runtime 9.1 LTS ML and the Python version is 3.8.10. I am only running the import statements:
from sklearn.metrics import *
from sklearn.preprocessing import LabelEncoder
How do you connect to an Azure Databricks instance from another Databricks instance? I need to access (database) views created in one Databricks instance from a PySpark notebook running in another Databricks instance. I would appreciate it if anyone has any sample...
Hi @Venkata Ramakrishna Alvakonda​ , There are two ways of executing a notebook within another notebook in Databricks. Method #1: the %run command. The first and most straightforward way of executing another notebook is by using the %run command. Ex...
Hi @Venkata Vadapalli​ , You may follow the steps below to create a mount point using Azure Key Vault. You should have the following information:
• Client ID (a.k.a. Application ID) => Key Name as ClientID = 06exxxxxxxxxxd60ef
• Client Secret (a.k.a. Ap...
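As a sketch of what those values feed into: an ADLS Gen2 mount uses an OAuth config map built from the service principal details. The function below only assembles the map (the helper name and the tenant-ID parameter are assumptions on my part); the secret would normally come from `dbutils.secrets.get` against the Key Vault-backed scope rather than being hard-coded:

```python
def build_adls_oauth_configs(client_id: str, client_secret: str, tenant_id: str) -> dict:
    """Assemble the OAuth config map used when mounting ADLS Gen2.

    client_secret should be fetched at runtime, e.g. via
    dbutils.secrets.get(scope="<keyvault-scope>", key="ClientSecret"),
    rather than hard-coded in the notebook.
    """
    return {
        "fs.azure.account.auth.type": "OAuth",
        "fs.azure.account.oauth.provider.type":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        "fs.azure.account.oauth2.client.id": client_id,
        "fs.azure.account.oauth2.client.secret": client_secret,
        "fs.azure.account.oauth2.client.endpoint":
            f"https://login.microsoftonline.com/{tenant_id}/oauth2/token",
    }
```

The resulting map is then passed as `extra_configs` to `dbutils.fs.mount(...)` with an `abfss://<container>@<account>.dfs.core.windows.net/` source and a `/mnt/<name>` mount point.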
I have a notebook that writes a Delta table with a statement similar to the following:
match = "current.country = updates.country and current.process_date = updates.process_date"
deltaTable = DeltaTable.forPath(spark, silver_path)
deltaTable.alias("cu...
Initially, the affected table only had a date field as its partition. So I partitioned it by country and date fields. This new partitioning created the country and date directories; however, the old directories of the date partition remained and were not de...