Start your journey with Databricks by joining discussions on getting started guides, tutorials, and introductory topics. Connect with beginners and experts alike to kickstart your Databricks experience.
I am trying to access the following system tables to generate a DBU consumption report, but I am not seeing these tables in the system schema. Could you please help me understand how to access them? system.billing.inventory, system.billing.workspaces, system.billing...
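For reference, DBU consumption is commonly read from `system.billing.usage`, which only appears after an account admin has enabled the billing system schema for the metastore. A minimal sketch (column names follow the documented schema, but verify against your workspace):

```python
# Hedged sketch: summarize DBU usage from the billing system table.
# Assumes system.billing has been enabled for this metastore and that
# `spark` is the predefined SparkSession in a Databricks notebook.
usage = spark.sql("""
    SELECT workspace_id,
           sku_name,
           usage_date,
           SUM(usage_quantity) AS dbus
    FROM system.billing.usage
    GROUP BY workspace_id, sku_name, usage_date
    ORDER BY usage_date DESC
""")
display(usage)
```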
What information do you know about a share recipient when they access a table shared to them via Delta Sharing? Wondering if we might be able to utilize something along the lines of is_member, is_account_group_member, session_user, etc. for ROW and COL...
Now that I'm looking closer at the share credentials and the recipient entity, you would really need a way to know the bearer token and relate it back to various recipient properties - databricks.name and any custom recipient property tags you may h...
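For row-level filtering on shared data, Databricks exposes recipient properties through the `current_recipient()` SQL function (usable inside views shared via Delta Sharing) rather than the session-user functions mentioned above. A hedged sketch, where the view, table, and the `region` property are placeholders:

```python
# Hypothetical sketch: a view intended for Delta Sharing that filters
# rows by a custom recipient property. current_recipient('region')
# resolves to the accessing recipient's 'region' property at query time;
# object and property names here are assumptions for illustration.
spark.sql("""
    CREATE OR REPLACE VIEW main.sales.shared_orders AS
    SELECT *
    FROM main.sales.orders
    WHERE region = current_recipient('region')
""")
```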
Hi, I have a Spark streaming job which reads from Kafka, processes the data, and writes to Delta Lake. Number of Kafka partitions: 100. Number of executors: 2 (4 cores each). So we have 8 cores total reading from 100 partitions of a topic. I wanted to un...
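A minimal sketch of the job described above, assuming a Kafka source and a Delta sink (broker, topic, and paths are placeholders). Spark maps Kafka partitions 1:1 to tasks by default, so each micro-batch has 100 tasks scheduled 8 at a time on this cluster, unless the `minPartitions` option changes the split:

```python
# Assumes `spark` is the predefined SparkSession in a Databricks notebook.
stream = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # placeholder
    .option("subscribe", "my_topic")                   # placeholder
    .load()
)

query = (
    stream.writeStream
    .format("delta")
    .option("checkpointLocation", "/tmp/checkpoints/my_topic")
    .start("/tmp/delta/my_topic")
)
```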
I am able to use the VS Code extension + Databricks Connect to develop notebooks on my local computer and run them on my Databricks cluster. However, I cannot figure out how to develop the notebooks that have the `.py` file extension but are identified by Dat...
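For reference, what marks a `.py` file as a notebook is its first line: Python notebooks exported from Databricks start with a `# Databricks notebook source` header, with cells delimited by `# COMMAND ----------`. A minimal sketch of such a file:

```python
# Databricks notebook source
# The header line above makes Databricks (and the VS Code extension)
# treat this .py file as a notebook; the COMMAND marker below starts
# a new cell.
df = spark.range(10)

# COMMAND ----------

display(df)
```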
Hi All, I am facing an issue while running a new table in the bronze layer. Error - AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to alter table.com.databricks.backend.common.rpc.SparkDriverExceptions$SQLExecutionException: org.a...
Hello @Mirza1,
Could you please share the source code that is generating the exception, as well as the DBR version you are currently using? This will help me better understand the issue.
Hi Databricks team, I am trying to understand the internals of Spark's coalesce code (DefaultPartitionCoalescer) and am going through the Spark code for this. While I understood the coalesce function, I am not sure about the complete flow of the code, like where it gets call...
Hello @subham0611,
The coalesce operation triggered from user code can be initiated from either an RDD or a Dataset, with each having distinct codepaths:
RDD: https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/rdd/RDD...
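As a quick illustration of the two entry points (variable names below are placeholders):

```python
# RDD path: RDD.coalesce builds a CoalescedRDD, which is where
# DefaultPartitionCoalescer is invoked to group parent partitions.
rdd = spark.sparkContext.parallelize(range(1000), 100)
rdd_small = rdd.coalesce(10)

# Dataset path: Dataset.coalesce adds a Repartition node (shuffle=false)
# to the logical plan, which the planner lowers to a partition coalesce
# at the physical level.
df = spark.range(1000).repartition(100)
df_small = df.coalesce(10)
```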
@georgeyjy Try opening the CSV in a text editor. I bet that Excel is automatically trying to detect the schema of the CSV, and thus thinks the value is an integer.
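To check what is actually stored without any tool-side type inference, one option is to read every column as a string; a small sketch assuming pandas and a placeholder file name:

```python
import pandas as pd

# dtype=str disables type inference, so whatever prints here is the
# literal CSV content, independent of how Excel renders it.
df = pd.read_csv("export.csv", dtype=str)
print(df.head())
```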
Reading a file like this: Data = spark.sql("SELECT * FROM edge.inv.rm"). Getting this error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 10 in stage 441.0 failed 4 times, most recent failure: Lost task 10.3 in stage 441.0 (TID...
Assessment (the assessment job needs to be deployed using Terraform):
1. Install the latest version of UCX.
2. UCX will add the assessment job and queries to the workspace.
3. Run the assessment using a cluster.
How do I write the code for this using Terraform? Can anyone he...
I am trying to generate a PAT for a service principal. I am following the documentation as shown below: https://docs.databricks.com/en/dev-tools/auth/oauth-m2m.html#create-token-in-account. I have prepared the below curl command. I am getting the below error: Pl...
I was able to generate the workspace-level token using the Databricks CLI. I set the following details in the Databricks CLI profile (.databrickscfg) file:
host = https://myworksapce.azuredatabricks.net/
account_id = (my db account id)
client_id = ...
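For comparison, the workspace-level OAuth M2M token exchange from that docs page can also be done without the CLI; a minimal sketch in Python, assuming the standard workspace `/oidc/v1/token` endpoint and placeholder credentials:

```python
import requests

# Exchange service principal credentials for a workspace-level OAuth
# token (host, client id, and secret below are placeholders).
resp = requests.post(
    "https://myworkspace.azuredatabricks.net/oidc/v1/token",
    auth=("<client-id>", "<client-secret>"),
    data={"grant_type": "client_credentials", "scope": "all-apis"},
)
resp.raise_for_status()
token = resp.json()["access_token"]
```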
Hi Community Members, I have been using Databricks for a while, but I have only used Workflows. I have a question about the differences between Delta Live Tables and Workflows. Which one should we use in which scenario? Thanks,
Hi, Delta Live Tables focuses on the ingestion, transformation, and management of Delta tables using a declarative framework. Job Workflows are designed to orchestrate and schedule various data processing and analysis tasks, including SQL q...
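To make the declarative side concrete, a minimal sketch of a DLT pipeline (table names and the source path are placeholders):

```python
import dlt
from pyspark.sql.functions import col

# Each @dlt.table function declares a managed Delta table; DLT resolves
# the dependency graph and keeps the tables up to date.
@dlt.table(comment="Bronze: raw events loaded as-is")
def bronze_events():
    return spark.read.format("json").load("/data/raw/events")

@dlt.table(comment="Silver: cleaned events")
def silver_events():
    return dlt.read("bronze_events").where(col("id").isNotNull())
```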
Hello, good afternoon, great people. I was following the step-by-step instructions to enable or disable Databricks Assistant in my Databricks Community Edition to enable the AI assistant. However, I couldn't find the option and was unable to enable it...
Hi, we are trying to use the dbt-sql template provided for Databricks Asset Bundles but are getting an error as follows. It looks like it's regarding the default catalog configuration. Has anyone faced this previously, or can anyone help with the same?
I'm using DAB to deploy a "jobs" resource into Databricks, into two environments: "dev" and "prod". I pull the notebooks from a remote Git repository using "git_source", and have defined the default job to use a tag to find which version to pull. Ho...
I use target overrides to switch between branches and tags on different environments:

```yaml
resources:
  jobs:
    my_job:
      git_source:
        git_url: <REPO-URL>
        git_provider: gitHub

targets:
  staging:
    resources:
      jobs:
        my_j...
```
If you observe a hung job, thread dumps are crucial to determine the root cause. Hence, it would be a good idea to collect the thread dumps before cancelling the hung job.
Here are the instructions to collect a Spark driver/executor thread dump:
...
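In practice the dump is usually taken from the Spark UI (Executors tab > Thread Dump link per executor and driver). As a rough, assumption-laden sketch, a driver-side dump can also be captured programmatically through py4j using the standard JVM `ThreadMXBean` (driver threads only, and not an official Databricks API):

```python
# Hedged sketch: capture a thread dump of the Spark driver JVM via py4j.
# Uses java.lang.management from the driver's JVM; executors are not
# covered, so use the Spark UI for executor-side dumps.
jvm = spark.sparkContext._jvm
bean = jvm.java.lang.management.ManagementFactory.getThreadMXBean()

# dumpAllThreads(lockedMonitors, lockedSynchronizers) returns ThreadInfo[]
for info in bean.dumpAllThreads(True, True):
    print(info.toString())
```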