Community Discussions
Connect with fellow community members to discuss general topics related to the Databricks platform, industry trends, and best practices. Share experiences, ask questions, and foster collaboration within the community.

Forum Posts

Ramakrishnan83
by New Contributor III
  • 1597 Views
  • 1 reply
  • 0 kudos

Resolved! Understanding Spark Architecture during Table Creation

Team, I am trying to understand how the Parquet files and the JSON under the delta log folder store the data behind the scenes. Table creation:
from delta.tables import *
DeltaTable.create(spark) \
  .tableName("employee") \
  .addColumn("id", "INT") \
  .addColumn("na...

Latest Reply
shan_chandra
Esteemed Contributor
  • 0 kudos

@Ramakrishnan83 - Kindly go through the blog post https://www.databricks.com/blog/2019/08/21/diving-into-delta-lake-unpacking-the-transaction-log.html, which discusses Delta's transaction log in detail.
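
To see what the blog describes in practice, here is a minimal PySpark sketch that creates a small Delta table and lists the transaction log files. The second column name is an assumption (the original snippet is cut off at "na..."), and dbutils is only available on Databricks:
```
from delta.tables import DeltaTable

# Column "name" is an assumption; the original snippet is truncated.
DeltaTable.createIfNotExists(spark) \
    .tableName("employee") \
    .addColumn("id", "INT") \
    .addColumn("name", "STRING") \
    .execute()

# The Parquet files under the table location hold the rows; each commit
# adds a numbered JSON file under _delta_log describing what changed.
table_path = spark.sql("DESCRIBE DETAIL employee").first()["location"]
for f in dbutils.fs.ls(table_path + "/_delta_log"):
    print(f.path)  # e.g. .../_delta_log/00000000000000000000.json
```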

ivanychev
by Contributor
  • 1452 Views
  • 2 replies
  • 1 kudos

Corrupted Python installation on Python restart on DBR 13.3

Hey there, we're using DBR 13.3 (no Docker) as a general-purpose cluster and init the cluster using the following init script:
```
#!/usr/bin/env bash
export DEBIAN_FRONTEND=noninteractive
set -euxo pipefail
if [[ $DB_IS_DRIVER = "TRUE" ]]; then
echo "I am d...
```

Latest Reply
Kaniz_Fatma
Community Manager
  • 1 kudos

Hi @ivanychev, let me get some of our experts here at Databricks to answer your question. Please bear with us until then.

1 More Reply
arkiboys
by Contributor
  • 1360 Views
  • 1 reply
  • 0 kudos

Resolved! Cannot set permissions on a table

In a Databricks database table I was able to set permissions for groups, but now I get this error when using a cluster:
Error getting permissions, summary: SparkException: Trying to perform permission action on Hive Metastore /CATALOG/`hive_metastore`/DATAB...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @arkiboys, It seems you’re encountering an issue related to permissions and table access control in Databricks. Let’s troubleshoot this together. Table Access Control Not Enabled: The error message indicates that Table Access Control is not en...
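
For reference, once table access control is enabled on the cluster, legacy hive_metastore grants are plain SQL. A minimal sketch; the database, table, and group names are placeholders:
```
# Placeholders: replace mydb, mytable, and data_readers with your own
# database, table, and workspace group.
# On legacy table-ACL clusters, USAGE on the database is needed before
# object-level privileges take effect.
spark.sql("GRANT USAGE ON DATABASE mydb TO `data_readers`")
spark.sql("GRANT SELECT ON TABLE mydb.mytable TO `data_readers`")
```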

Faiçal_1979
by New Contributor
  • 1378 Views
  • 1 reply
  • 0 kudos

Databricks, Streamlit, and FastAPI combination

Hello friends! I have a project where I need Databricks to train and evaluate a model, then put it into production. I trained and evaluated the model in Databricks using MLflow, and everything is good. Now I have another two steps that I have zero clue how they should be done: usag...

Latest Reply
RafiKurlansik
New Contributor III
  • 0 kudos

This repo has examples that you can use in your Databricks workspace for FastAPI and Streamlit.  I recommend only using these for development or lightweight use cases. 
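
As a rough illustration of the serving step, here is a minimal FastAPI sketch that loads a registered MLflow model and scores requests. The model URI and the feature fields are assumptions, not part of the original question:
```
import mlflow.pyfunc
import pandas as pd
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

# Assumption: a model registered in MLflow as "my_model", version 1.
model = mlflow.pyfunc.load_model("models:/my_model/1")

class Features(BaseModel):
    # Hypothetical feature columns; match these to your training schema.
    f1: float
    f2: float

@app.post("/predict")
def predict(features: Features):
    df = pd.DataFrame([features.dict()])
    return {"prediction": model.predict(df).tolist()}
```
A Streamlit app would then call this endpoint (or load the model the same way) from its own script.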

johann_blake
by New Contributor
  • 1085 Views
  • 2 replies
  • 1 kudos

Databricks Repos

Hi everyone! I've set up an Azure cloud environment for the analytical team that I am part of, and everything is working wonderfully except Databricks Repos. Whenever we open Databricks, we find ourselves in the branch that the most recent person work...

Latest Reply
feiyun0112
Contributor III
  • 1 kudos

Use a separate Databricks Git folder mapped to the remote Git repo for each user who works in their own development branch. See Run Git operations on Databricks Repos | Databricks on AWS.
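
If you want to script that layout, a sketch using the Databricks SDK for Python is below; the repo URL and user list are placeholders, and you need permission to create Git folders under other users' paths:
```
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

# Placeholders: one checkout per user, each under that user's own folder,
# so nobody switches branches underneath anyone else.
for user in ["alice@example.com", "bob@example.com"]:
    w.repos.create(
        url="https://github.com/org/analytics-repo.git",  # placeholder URL
        provider="gitHub",
        path=f"/Repos/{user}/analytics-repo",
    )
```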

1 More Reply
valjas
by New Contributor III
  • 515 Views
  • 2 replies
  • 0 kudos

How do I create spark.sql.session.SparkSession?

When I create a session in Databricks it defaults to spark.sql.connect.session.SparkSession. How can I connect to Spark without Spark Connect?

Latest Reply
MichTalebzadeh
Contributor III
  • 0 kudos

The Spark Session is already created for you by the Databricks environment. However, you can create your own:
from pyspark.sql import SparkSession
# Initialize Spark session
myspark = SparkSession.builder.appName("YourAppName").getOrCreate()
# Create a sam...
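
Completing that truncated snippet, a minimal self-contained version might look like this (the app name and sample data are arbitrary):
```
from pyspark.sql import SparkSession

# On Databricks the session already exists as `spark`; building your own
# is mainly useful locally. On serverless/shared compute, what you get
# back may still be a Spark Connect session.
myspark = SparkSession.builder.appName("YourAppName").getOrCreate()

# Create a sample DataFrame to confirm the session works.
df = myspark.createDataFrame([(1, "a"), (2, "b")], ["id", "value"])
df.show()
print(type(myspark))  # reveals whether you got a Connect session
```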

1 More Reply
Krubug
by New Contributor
  • 372 Views
  • 1 reply
  • 0 kudos

Improve Query Performance

Hello, I have a query in one of my notebooks that took around 3.5 hours on a D12_V2 cluster with between 5 and 25 workers. Is there a way to write the query in a different way in order to improve performance and cost?
select /*+ BROADCAST(b) */ MD5(CONCAT(N...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @Krubug, Optimizing SQL queries can significantly improve performance and reduce costs. Let’s explore some techniques to enhance the query you’ve provided: Minimize Wildcard Characters: The use of wildcard characters (such as % and _) in SQL ...
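
On the broadcast side specifically, the hint in the query has a DataFrame equivalent; a short sketch, where the table and key names are placeholders for the two sides of the original join:
```
from pyspark.sql.functions import broadcast

# Placeholders: fact is the large table, dim the small one (the "b" in
# the original /*+ BROADCAST(b) */ hint).
fact = spark.table("mydb.fact_table")
dim = spark.table("mydb.dim_table")

# Broadcasting the small side ships it to every executor and avoids
# shuffling the large table.
joined = fact.join(broadcast(dim), on="join_key")
```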

Kaizen
by Contributor III
  • 603 Views
  • 1 reply
  • 1 kudos

Init script errors on 13.3 and 14.3 LTS: mesa package issues

Hi - we had a few issues with some of our init scripts recently. Investigating, I found that the mesa packages were throwing errors when trying to install. Posting this to help the community and to raise awareness so Databricks can fix it. I believe the image f...

Latest Reply
Kaniz_Fatma
Community Manager
  • 1 kudos

Hi @Kaizen, Thank you for sharing your experience with the community! It’s essential to raise awareness about issues like this. Let’s dive into some troubleshooting steps and potential workarounds for the mesa packages installation problem. Conf...

chemajar
by New Contributor III
  • 933 Views
  • 1 reply
  • 0 kudos

TASK_WRITE_FAILED when trying to write to a table, Databricks (Scala)

Hello, I have code on Databricks (Scala) that constructs a DataFrame and then writes it to a database table. It works fine for almost all of the tables, but there is one table with a problem. It says No module named 'delta.connect' - TASK_WRITE_FAILED. In...

Labels: Community Discussions, Databricks, Scala
Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @chemajar , It appears that you’re encountering a module import issue related to ‘delta.connect’ when writing data to a database table in Databricks using Scala. Let’s troubleshoot this together! The error message “ModuleNotFoundError: No modu...

jenshumrich
by New Contributor III
  • 706 Views
  • 4 replies
  • 1 kudos

Not loading csv files with ".c000.csv" in the name

Yesterday I created a ton of CSV files via:
joined_df.write.partitionBy("PartitionColumn").mode("overwrite").csv(output_path, header=True)
Today, when working with them, I realized that they were not loaded. Upon investigation I saw...

Latest Reply
jenshumrich
New Contributor III
  • 1 kudos

Then removing the "_committed_" file stops Spark from reading in the other files.
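
One way to keep the reader away from the commit-marker files is a glob filter; a minimal sketch, assuming the same output_path used in the original write:
```
# pathGlobFilter (Spark 3.0+) makes the reader consider only matching
# file names, skipping markers such as _committed_* and _started_*.
df = (
    spark.read
    .option("pathGlobFilter", "*.csv")
    .option("header", "true")
    .csv(output_path)  # output_path as in the question
)
```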

3 More Replies
databricksdev
by New Contributor II
  • 806 Views
  • 1 reply
  • 1 kudos

Resolved! Is it possible to get Azure Databricks cluster metrics using the REST API through PySpark code

I am trying to get Azure Databricks cluster metrics such as memory utilization, CPU utilization, memory swap utilization, and free file system space using the REST API from PySpark code. It always shows CPU utilization and memory usage as N/A, whereas data...

Latest Reply
Ajay-Pandey
Esteemed Contributor III
  • 1 kudos

Hi @databricksdev, you can use system tables for Azure Databricks cluster metrics. Please refer to the following for details: Compute system tables reference | Databricks on AWS.
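
To sketch what that looks like, a hedged example querying the node timeline system table; the table and column names follow the linked reference but should be verified in your workspace, and the cluster ID is a placeholder:
```
# Assumption: system tables are enabled for the workspace.
metrics = spark.sql("""
    SELECT start_time, cpu_user_percent, mem_used_percent
    FROM system.compute.node_timeline
    WHERE cluster_id = '<your-cluster-id>'   -- placeholder
    ORDER BY start_time DESC
    LIMIT 10
""")
metrics.show()
```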

385653
by New Contributor II
  • 5453 Views
  • 7 replies
  • 0 kudos

BigQuery in notebook failing on a Unity Catalog enabled cluster

BigQuery (reading data from Google Cloud) is failing on a Unity Catalog enabled cluster. The same works fine without Unity Catalog. Any help is appreciated! Thanks, Sai

Latest Reply
Srihasa_Akepati
New Contributor III
  • 0 kudos

Hi @385653, it works from single-user clusters using a DBFS path. On shared clusters, please set the Spark conf at the notebook level, converting the JSON content into a base64 string. This is a workaround, as shared clusters do not support d...
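
A sketch of that workaround, assuming a Google service-account JSON key; the key path and table name are placeholders:
```
import base64

# Placeholder path to a service-account key readable from the notebook.
with open("/dbfs/tmp/sa-key.json", "rb") as f:
    creds_b64 = base64.b64encode(f.read()).decode("utf-8")

# The Spark BigQuery connector accepts the key as a base64-encoded
# "credentials" option instead of a key file on disk.
df = (
    spark.read.format("bigquery")
    .option("credentials", creds_b64)
    .option("table", "my-project.my_dataset.my_table")
    .load()
)
```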

6 More Replies
Félix
by New Contributor II
  • 849 Views
  • 1 reply
  • 1 kudos

Resolved! Editor bug when escaping strings

When working in a notebook using %sql, escaping a quote messes up the editor colors (the attached screenshots showed how it is vs. how it should be). I won't open a ticket or send an email to support.

Latest Reply
Kaniz_Fatma
Community Manager
  • 1 kudos

Hi @Félix, Thank you for bringing this issue to our attention. We understand that the current behavior of the notebook editor when using %sql and escaping quotes can be frustrating. I will pass this feedback along to our development team so that they...

DE-cat
by New Contributor III
  • 398 Views
  • 1 reply
  • 1 kudos

This job uses a format which has been deprecated since 2016

After creating a Databricks job using CLI v0.214.0 from a JSON input, I see the following message in the UI: "This job uses a format which has been deprecated since 2016, update it to dependent libraries automatically or learn more". When I update it, I...

Latest Reply
Kaniz_Fatma
Community Manager
  • 1 kudos

Hi @DE-cat, The message you’re encountering about the deprecated format is related to the Databricks job configuration. Let’s break it down: Deprecated Format: The format used by your job has been considered outdated since 2016. Databricks recomm...

Madalian
by New Contributor III
  • 451 Views
  • 1 reply
  • 0 kudos

Adding a new field to a Delta Live Table

Hi Experts, I have all the Delta merge files (Parquet format) in the Bronze layer. I am converting these files into Delta Live Tables in the Silver layer. While doing so, I am unable to add a current timestamp column. Following is the script:
from pyspark.sql.functio...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @Madalian, Let’s address the issue you’re facing while converting your Parquet files from the Bronze layer to Delta live tables in the Silver layer. Column Addition: It appears that you’re trying to add a current timestamp column named SILVER_...
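
For reference, a minimal Delta Live Tables sketch that adds a current-timestamp column in the silver layer; the table names are placeholders, and the code must run inside a DLT pipeline where the dlt module is available:
```
import dlt
from pyspark.sql import functions as F

@dlt.table(name="silver_employee")  # placeholder table name
def silver_employee():
    # Placeholder source: the bronze table being promoted to silver.
    return (
        dlt.read("bronze_employee")
        .withColumn("load_ts", F.current_timestamp())
    )
```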
