Community Discussions
Connect with fellow community members to discuss general topics related to the Databricks platform, industry trends, and best practices. Share experiences, ask questions, and foster collaboration within the community.

Forum Posts

Ramakrishnan83
by New Contributor III
  • 1597 Views
  • 1 reply
  • 0 kudos

Resolved! Understanding Spark Architecture during Table Creation

Team, I am trying to understand how the Parquet files and the JSON under the delta log folder store the data behind the scenes. Table creation:
from delta.tables import *
DeltaTable.create(spark) \
  .tableName("employee") \
  .addColumn("id", "INT") \
  .addColumn("na...

Latest Reply
shan_chandra
Esteemed Contributor
  • 0 kudos

@Ramakrishnan83 - Kindly go through the blog post https://www.databricks.com/blog/2019/08/21/diving-into-delta-lake-unpacking-the-transaction-log.html, which discusses Delta's transaction log in detail.
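
To see what the blog describes in practice, here is a minimal PySpark sketch that creates a small Delta table and lists the transaction log files. The second column name is an assumption (the original snippet is cut off at "na..."), and dbutils is only available on Databricks:
```
from delta.tables import DeltaTable

# Column "name" is an assumption; the original snippet is truncated.
DeltaTable.createIfNotExists(spark) \
    .tableName("employee") \
    .addColumn("id", "INT") \
    .addColumn("name", "STRING") \
    .execute()

# The Parquet files under the table location hold the rows; each commit
# adds a numbered JSON file under _delta_log describing what changed.
table_path = spark.sql("DESCRIBE DETAIL employee").first()["location"]
for f in dbutils.fs.ls(table_path + "/_delta_log"):
    print(f.path)  # e.g. .../_delta_log/00000000000000000000.json
```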

ivanychev
by Contributor
  • 1452 Views
  • 2 replies
  • 1 kudos

Corrupted Python installation on Python restart on DBR 13.3

Hey there, we're using DBR 13.3 (no Docker) as a general-purpose cluster and init the cluster using the following init script:
```
#!/usr/bin/env bash
export DEBIAN_FRONTEND=noninteractive
set -euxo pipefail
if [[ $DB_IS_DRIVER = "TRUE" ]]; then
echo "I am d...
```

Latest Reply
Kaniz_Fatma
Community Manager
  • 1 kudos

Hi @ivanychev, let me get some of our experts here at Databricks to answer your question. Please bear with us until then.

1 More Reply
arkiboys
by Contributor
  • 1360 Views
  • 1 reply
  • 0 kudos

Resolved! Cannot set permissions on a table

In a Databricks database table I was able to set permissions for groups, but now I get this error when using a cluster:
Error getting permissions, summary: SparkException: Trying to perform permission action on Hive Metastore /CATALOG/`hive_metastore`/DATAB...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @arkiboys, It seems you’re encountering an issue related to permissions and table access control in Databricks. Let’s troubleshoot this together. Table Access Control Not Enabled: The error message indicates that Table Access Control is not en...
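
For reference, once table access control is enabled on the cluster, legacy hive_metastore grants are plain SQL. A minimal sketch; the database, table, and group names are placeholders:
```
# Placeholders: replace mydb, mytable, and data_readers with your own
# database, table, and workspace group.
# On legacy table-ACL clusters, USAGE on the database is needed before
# object-level privileges take effect.
spark.sql("GRANT USAGE ON DATABASE mydb TO `data_readers`")
spark.sql("GRANT SELECT ON TABLE mydb.mytable TO `data_readers`")
```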

Faiçal_1979
by New Contributor
  • 1378 Views
  • 1 reply
  • 0 kudos

Databricks, Streamlit, and FastAPI combination

Hello friends! I have a project where I need Databricks to train and evaluate a model, then put it into production. I trained and evaluated the model in Databricks using MLflow, and everything is good. Now I have another two steps that I have zero clue how they should be done: usag...

Latest Reply
RafiKurlansik
New Contributor III
  • 0 kudos

This repo has examples that you can use in your Databricks workspace for FastAPI and Streamlit.  I recommend only using these for development or lightweight use cases. 
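
As a rough illustration of the serving step, here is a minimal FastAPI sketch that loads a registered MLflow model and scores requests. The model URI and the feature fields are assumptions, not part of the original question:
```
import mlflow.pyfunc
import pandas as pd
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

# Assumption: a model registered in MLflow as "my_model", version 1.
model = mlflow.pyfunc.load_model("models:/my_model/1")

class Features(BaseModel):
    # Hypothetical feature columns; match these to your training schema.
    f1: float
    f2: float

@app.post("/predict")
def predict(features: Features):
    df = pd.DataFrame([features.dict()])
    return {"prediction": model.predict(df).tolist()}
```
A Streamlit app would then call this endpoint (or load the model the same way) from its own script.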

johann_blake
by New Contributor
  • 1085 Views
  • 2 replies
  • 1 kudos

Databricks Repos

Hi everyone! I've set up an Azure cloud environment for the analytical team that I am part of, and everything is working wonderfully except Databricks Repos. Whenever we open Databricks, we find ourselves in the branch that the most recent person work...

Latest Reply
feiyun0112
Contributor III
  • 1 kudos

Use a separate Databricks Git folder mapped to the remote Git repo for each user who works in their own development branch. See Run Git operations on Databricks Repos | Databricks on AWS.
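
If you want to script that layout, a sketch using the Databricks SDK for Python is below; the repo URL and user list are placeholders, and you need permission to create Git folders under other users' paths:
```
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

# Placeholders: one checkout per user, each under that user's own folder,
# so nobody switches branches underneath anyone else.
for user in ["alice@example.com", "bob@example.com"]:
    w.repos.create(
        url="https://github.com/org/analytics-repo.git",  # placeholder URL
        provider="gitHub",
        path=f"/Repos/{user}/analytics-repo",
    )
```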

1 More Reply
valjas
by New Contributor III
  • 515 Views
  • 2 replies
  • 0 kudos

How do I create spark.sql.session.SparkSession?

When I create a session in Databricks it defaults to spark.sql.connect.session.SparkSession. How can I connect to Spark without Spark Connect?

Latest Reply
MichTalebzadeh
Contributor III
  • 0 kudos

The Spark Session is already created for you by the Databricks environment. However, you can create your own:
from pyspark.sql import SparkSession
# Initialize Spark session
myspark = SparkSession.builder.appName("YourAppName").getOrCreate()
# Create a sam...
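
Completing that truncated snippet, a minimal self-contained version might look like this (the app name and sample data are arbitrary):
```
from pyspark.sql import SparkSession

# On Databricks the session already exists as `spark`; building your own
# is mainly useful locally. On serverless/shared compute, what you get
# back may still be a Spark Connect session.
myspark = SparkSession.builder.appName("YourAppName").getOrCreate()

# Create a sample DataFrame to confirm the session works.
df = myspark.createDataFrame([(1, "a"), (2, "b")], ["id", "value"])
df.show()
print(type(myspark))  # reveals whether you got a Connect session
```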

1 More Reply
Krubug
by New Contributor
  • 372 Views
  • 1 reply
  • 0 kudos

Improve Query Performance

Hello, I have a query in one of my notebooks that took around 3.5 hours on a D12_V2 cluster with between 5 and 25 workers. Is there a way to write the query in a different way in order to improve performance and cost?
select /*+ BROADCAST(b) */ MD5(CONCAT(N...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @Krubug, Optimizing SQL queries can significantly improve performance and reduce costs. Let’s explore some techniques to enhance the query you’ve provided: Minimize Wildcard Characters: The use of wildcard characters (such as % and _) in SQL ...
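
On the broadcast side specifically, the hint in the query has a DataFrame equivalent; a short sketch, where the table and key names are placeholders for the two sides of the original join:
```
from pyspark.sql.functions import broadcast

# Placeholders: fact is the large table, dim the small one (the "b" in
# the original /*+ BROADCAST(b) */ hint).
fact = spark.table("mydb.fact_table")
dim = spark.table("mydb.dim_table")

# Broadcasting the small side ships it to every executor and avoids
# shuffling the large table.
joined = fact.join(broadcast(dim), on="join_key")
```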

Kaizen
by Contributor III
  • 603 Views
  • 1 reply
  • 1 kudos

Init script errors on 13.3 and 14.3 LTS: mesa package issues

Hi - we had a few issues with some of our init scripts recently. Investigating, I found that the mesa packages were throwing errors when trying to install. Posting this to help the community and to raise awareness so Databricks can fix it. I believe the image f...

Latest Reply
Kaniz_Fatma
Community Manager
  • 1 kudos

Hi @Kaizen, Thank you for sharing your experience with the community! It’s essential to raise awareness about issues like this. Let’s dive into some troubleshooting steps and potential workarounds for the mesa packages installation problem. Conf...

chemajar
by New Contributor III
  • 933 Views
  • 1 reply
  • 0 kudos

TASK_WRITE_FAILED when trying to write to a table, Databricks (Scala)

Hello, I have code on Databricks (Scala) that constructs a DataFrame and then writes it to a database table. It works fine for almost all of the tables, but there is one table with a problem. It says No module named 'delta.connect' - TASK_WRITE_FAILED. In...

Labels: Community Discussions, Databricks, Scala
Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @chemajar , It appears that you’re encountering a module import issue related to ‘delta.connect’ when writing data to a database table in Databricks using Scala. Let’s troubleshoot this together! The error message “ModuleNotFoundError: No modu...

jenshumrich
by New Contributor III
  • 706 Views
  • 4 replies
  • 1 kudos

Not loading csv files with ".c000.csv" in the name

Yesterday I created a ton of CSV files via:
joined_df.write.partitionBy("PartitionColumn").mode("overwrite").csv(output_path, header=True)
Today, when working with them, I realized that they were not loaded. Upon investigation I saw...

Latest Reply
jenshumrich
New Contributor III
  • 1 kudos

Then removing the "_committed_" file stops Spark from reading in the other files.
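
One way to keep the reader away from the commit-marker files is a glob filter; a minimal sketch, assuming the same output_path used in the original write:
```
# pathGlobFilter (Spark 3.0+) makes the reader consider only matching
# file names, skipping markers such as _committed_* and _started_*.
df = (
    spark.read
    .option("pathGlobFilter", "*.csv")
    .option("header", "true")
    .csv(output_path)  # output_path as in the question
)
```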

3 More Replies
databricksdev
by New Contributor II
  • 806 Views
  • 1 reply
  • 1 kudos

Resolved! Is it possible to get Azure Databricks cluster metrics using the REST API through PySpark code

I am trying to get Azure Databricks cluster metrics such as memory utilization, CPU utilization, memory swap utilization, and free file system space using the REST API from PySpark code. It always shows CPU utilization and memory usage as N/A, whereas data...

Latest Reply
Ajay-Pandey
Esteemed Contributor III
  • 1 kudos

Hi @databricksdev, you can use system tables for Azure Databricks cluster metrics. Please refer to the following for details: Compute system tables reference | Databricks on AWS.
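
To sketch what that looks like, a hedged example querying the node timeline system table; the table and column names follow the linked reference but should be verified in your workspace, and the cluster ID is a placeholder:
```
# Assumption: system tables are enabled for the workspace.
metrics = spark.sql("""
    SELECT start_time, cpu_user_percent, mem_used_percent
    FROM system.compute.node_timeline
    WHERE cluster_id = '<your-cluster-id>'   -- placeholder
    ORDER BY start_time DESC
    LIMIT 10
""")
metrics.show()
```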

385653
by New Contributor II
  • 5453 Views
  • 7 replies
  • 0 kudos

BigQuery in notebook failing on a Unity Catalog enabled cluster

BigQuery (reading data from Google Cloud) is failing on a Unity Catalog enabled cluster. The same works fine without Unity Catalog. Any help is appreciated! Thanks, Sai

Latest Reply
Srihasa_Akepati
New Contributor III
  • 0 kudos

Hi @385653, it works from single-user clusters using a DBFS path. On shared clusters, please set the Spark conf at the notebook level, converting the JSON content into a base64 string. This is a workaround, as shared clusters do not support d...
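
A sketch of that workaround, assuming a Google service-account JSON key; the key path and table name are placeholders:
```
import base64

# Placeholder path to a service-account key readable from the notebook.
with open("/dbfs/tmp/sa-key.json", "rb") as f:
    creds_b64 = base64.b64encode(f.read()).decode("utf-8")

# The Spark BigQuery connector accepts the key as a base64-encoded
# "credentials" option instead of a key file on disk.
df = (
    spark.read.format("bigquery")
    .option("credentials", creds_b64)
    .option("table", "my-project.my_dataset.my_table")
    .load()
)
```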

6 More Replies
Félix
by New Contributor II
  • 849 Views
  • 1 reply
  • 1 kudos

Resolved! Editor bug when escaping strings

When working in a notebook using %sql, escaping a quote messes up the editor colors (the attached screenshots showed how it is vs. how it should be). I won't open a ticket or send an email to support.

Latest Reply
Kaniz_Fatma
Community Manager
  • 1 kudos

Hi @Félix, Thank you for bringing this issue to our attention. We understand that the current behavior of the notebook editor when using %sql and escaping quotes can be frustrating. I will pass this feedback along to our development team so that they...

DE-cat
by New Contributor III
  • 398 Views
  • 1 reply
  • 1 kudos

This job uses a format which has been deprecated since 2016

After creating a Databricks job using CLI v0.214.0 from a JSON input, I see the following message in the UI: "This job uses a format which has been deprecated since 2016, update it to dependent libraries automatically or learn more". When I update it, I...

Latest Reply
Kaniz_Fatma
Community Manager
  • 1 kudos

Hi @DE-cat, The message you’re encountering about the deprecated format is related to the Databricks job configuration. Let’s break it down: Deprecated Format: The format used by your job has been considered outdated since 2016. Databricks recomm...

Madalian
by New Contributor III
  • 451 Views
  • 1 reply
  • 0 kudos

Adding a new field to a Delta Live Table

Hi Experts, I have all the Delta merge files (Parquet format) in the Bronze layer. I am converting these files into Delta Live Tables in the Silver layer. While doing so, I am unable to add a current timestamp column. Following is the script:
from pyspark.sql.functio...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @Madalian, Let’s address the issue you’re facing while converting your Parquet files from the Bronze layer to Delta live tables in the Silver layer. Column Addition: It appears that you’re trying to add a current timestamp column named SILVER_...
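
For reference, a minimal Delta Live Tables sketch that adds a current-timestamp column in the silver layer; the table names are placeholders, and the code must run inside a DLT pipeline where the dlt module is available:
```
import dlt
from pyspark.sql import functions as F

@dlt.table(name="silver_employee")  # placeholder table name
def silver_employee():
    # Placeholder source: the bronze table being promoted to silver.
    return (
        dlt.read("bronze_employee")
        .withColumn("load_ts", F.current_timestamp())
    )
```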
