Community Platform Discussions
Connect with fellow community members to discuss general topics related to the Databricks platform, industry trends, and best practices. Share experiences, ask questions, and foster collaboration within the community.

Forum Posts

Faiçal_1979
by New Contributor
  • 3443 Views
  • 1 replies
  • 0 kudos

Databricks and streamlit and fast API combination

Hello friends! I have a project where I need Databricks to train and evaluate a model and then put it into production. I trained and evaluated the model in Databricks using MLflow, and everything is good. Now I have another two steps that I have zero clue how they should be done: usag...

Latest Reply
RafiKurlansik
New Contributor III
  • 0 kudos

This repo has examples that you can use in your Databricks workspace for FastAPI and Streamlit.  I recommend only using these for development or lightweight use cases. 

  • 0 kudos
johann_blake
by New Contributor
  • 1477 Views
  • 2 replies
  • 1 kudos

Databricks Repos

Hi everyone! I've set up an Azure cloud environment for the analytical team that I am part of, and everything is working wonderfully except Databricks Repos. Whenever we open Databricks, we find ourselves in the branch that the most recent person work...

Latest Reply
feiyun0112
Honored Contributor
  • 1 kudos

Use a separate Databricks Git folder, mapped to a remote Git repo, for each user who works in their own development branch. See: Run Git operations on Databricks Repos | Databricks on AWS
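
As a rough sketch of the one-folder-per-user setup, the snippet below builds one request body per team member for the Repos REST API (`POST /api/2.0/repos`, which takes `url`, `provider`, and `path`). The user emails, repo URL, and repo name are hypothetical placeholders, and sending the requests (with authentication) is left out.

```python
import json


def repo_create_payload(user_email, repo_url, provider, repo_name):
    """Build the JSON body for one Git folder under a user's own /Repos directory."""
    return {
        "url": repo_url,
        "provider": provider,
        # Each user gets a separate checkout, so switching branches
        # in one folder does not affect anyone else.
        "path": f"/Repos/{user_email}/{repo_name}",
    }


# Hypothetical team members and remote repo, for illustration only.
team = ["alice@example.com", "bob@example.com"]
repo_url = "https://dev.azure.com/example-org/analytics/_git/analytics"

payloads = [
    repo_create_payload(user, repo_url, "azureDevOpsServices", "analytics")
    for user in team
]

for payload in payloads:
    # Each payload would be sent as the body of POST /api/2.0/repos.
    print(json.dumps(payload))
```

Each user then checks out their own development branch inside their own folder.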

  • 1 kudos
1 More Replies
Krubug
by New Contributor
  • 739 Views
  • 1 replies
  • 0 kudos

Improve Query Performance

Hello, I have a query in one of my notebooks that took around 3.5 hours on a D12_V2 cluster with between 5 and 25 workers. Is there a way to write the query in a different way in order to improve performance and cost: select /*+ BROADCAST(b) */ MD5(CONCAT(N...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @Krubug, Optimizing SQL queries can significantly improve performance and reduce costs. Let’s explore some techniques to enhance the query you’ve provided: Minimize Wildcard Characters: The use of wildcard characters (such as % and _) in SQL ...

  • 0 kudos
Kaizen
by Valued Contributor
  • 884 Views
  • 1 replies
  • 1 kudos

Init script error 13.3 and 14.3 LTS issues mesa

Hi - we had a few issues with some of our init scripts recently. Investigating, I found that mesa packages were throwing errors when trying to install. Posting this to help the community and to raise awareness so that Databricks can fix it. I believe the image f...

Latest Reply
Kaniz_Fatma
Community Manager
  • 1 kudos

Hi @Kaizen, Thank you for sharing your experience with the community! It’s essential to raise awareness about issues like this. Let’s dive into some troubleshooting steps and potential workarounds for the mesa packages installation problem. Conf...

  • 1 kudos
chemajar
by New Contributor III
  • 1917 Views
  • 1 replies
  • 0 kudos

TASK_WRITE_FAILED when trying to write on the table, Databricks (Scala)

Hello, I have code on Databricks (Scala) that constructs a DataFrame and then writes it to a database table. It works fine for almost all of the tables, but there is one table with a problem. It says No module named 'delta.connect' - TASK_WRITE_FAILED. In...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @chemajar, It appears that you’re encountering a module import issue related to ‘delta.connect’ when writing data to a database table in Databricks using Scala. Let’s troubleshoot this together! The error message “ModuleNotFoundError: No modu...

  • 0 kudos
jenshumrich
by Contributor
  • 1197 Views
  • 4 replies
  • 1 kudos

Not loading csv files with ".c000.csv" in the name

Yesterday I created a ton of CSV files via joined_df.write.partitionBy("PartitionColumn").mode("overwrite").csv(output_path, header=True). Today, when working with them, I realized that they were not loaded. Upon investigation I saw...

Latest Reply
jenshumrich
Contributor
  • 1 kudos

Then removing the "_commited_" file stops Spark from reading in the other files.

  • 1 kudos
3 More Replies
databricksdev
by New Contributor II
  • 1318 Views
  • 1 replies
  • 1 kudos

Resolved! Is it possible to get Azure Databricks cluster metrics using REST API thru pyspark code

I am trying to get Azure Databricks cluster metrics such as memory utilization, CPU utilization, memory swap utilization, and free file system space using the REST API from PySpark code. It always shows CPU utilization and memory usage as N/A, whereas data...

Latest Reply
Ajay-Pandey
Esteemed Contributor III
  • 1 kudos

Hi @databricksdev, you can use system tables for Azure Databricks cluster metrics. Please refer to the following page: Compute system tables reference | Databricks on AWS
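
A minimal sketch of what querying the system tables might look like from a notebook. It assumes the `system.compute.node_timeline` table and the column names `cpu_user_percent`, `cpu_system_percent`, `mem_used_percent`, and `mem_swap_percent` as listed in the compute system tables reference; verify them against your workspace, since system tables must be enabled and the schema can change. The helper name and the cluster ID are hypothetical.

```python
import re


def node_metrics_query(cluster_id: str, hours: int = 24) -> str:
    """Build a SQL query for recent per-node CPU and memory utilization."""
    # Reject anything that does not look like a plain ID, since the
    # value is interpolated into the SQL text.
    if not re.fullmatch(r"[A-Za-z0-9-]+", cluster_id):
        raise ValueError(f"unexpected cluster id: {cluster_id!r}")
    if hours <= 0:
        raise ValueError("hours must be positive")
    return f"""
        SELECT start_time,
               instance_id,
               driver,
               cpu_user_percent + cpu_system_percent AS cpu_percent,
               mem_used_percent,
               mem_swap_percent
        FROM system.compute.node_timeline
        WHERE cluster_id = '{cluster_id}'
          AND start_time >= current_timestamp() - INTERVAL {int(hours)} HOURS
        ORDER BY start_time
    """


# Hypothetical cluster ID, for illustration only.
query = node_metrics_query("0123-456789-abcdefgh", hours=6)
print(query)
# In a notebook, the query could then be run with: spark.sql(query)
```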

  • 1 kudos
385653
by New Contributor II
  • 10611 Views
  • 7 replies
  • 0 kudos

bigquery in notebook failing with unity catalog enabled cluster

Reading data from Google Cloud BigQuery fails on a Unity Catalog enabled cluster. The same code works fine on a cluster without Unity Catalog. Any help is appreciated! Thanks, Sai

Latest Reply
Srihasa_Akepati
New Contributor III
  • 0 kudos

Hi @385653, it works from single user clusters using a DBFS path. On shared clusters, please set the Spark conf at the notebook level, where you would convert the JSON content into a base64 string. This is a workaround as shared clusters do not support d...
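
The base64 conversion mentioned above could look roughly like this. The key content is a placeholder, not a real service account key, and the helper name is hypothetical. The reply is truncated before it names the conf key, so the `credentials` key in the final comment is an assumption based on the spark-bigquery connector's option of that name.

```python
import base64
import json


def encode_service_account_key(key_json: str) -> str:
    """Return the service account JSON as a base64 string."""
    # Fail early if the text is not valid JSON.
    json.loads(key_json)
    return base64.b64encode(key_json.encode("utf-8")).decode("ascii")


# Placeholder content for illustration; a real key has more fields.
sample_key = json.dumps({"type": "service_account", "project_id": "example-project"})
encoded = encode_service_account_key(sample_key)
print(encoded)

# Assumed notebook-level usage (conf key not confirmed by the reply):
# spark.conf.set("credentials", encoded)
```

In practice the key JSON would come from a secret scope rather than being pasted into the notebook.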

  • 0 kudos
6 More Replies
Félix
by New Contributor II
  • 1244 Views
  • 1 replies
  • 1 kudos

Resolved! Editor bug when escaping strings

When working in a notebook using %sql, the editor colors get messed up when you escape a quote. How it is: [screenshot]. How it should be: [screenshot]. I won't open a ticket or send an email to support.

Latest Reply
Kaniz_Fatma
Community Manager
  • 1 kudos

Hi @Félix, Thank you for bringing this issue to our attention. We understand that the current behavior of the notebook editor when using %sql and escaping quotes can be frustrating. I will pass this feedback along to our development team so that they...

  • 1 kudos
DE-cat
by New Contributor III
  • 616 Views
  • 1 replies
  • 1 kudos

This job uses a format which has been deprecated since 2016

After creating a Databricks job using CLI v0.214.0 from a JSON input, I see the following message in the UI: "This job uses a format which has been deprecated since 2016, update it to dependent libraries automatically or learn more". When I update it, I...

Latest Reply
Kaniz_Fatma
Community Manager
  • 1 kudos

Hi @DE-cat, The message you’re encountering about the deprecated format is related to the Databricks job configuration. Let’s break it down: Deprecated Format: The format used by your job has been considered outdated since 2016. Databricks recomm...

  • 1 kudos
Madalian
by New Contributor III
  • 1050 Views
  • 1 replies
  • 0 kudos

Adding new field to Delta live Table

Hi Experts, I have all Delta merge files in Parquet format on the Bronze layer. I am converting these files into Delta Live Tables in the Silver layer. While doing so, I am unable to add a current timestamp column. Following is the script: from pyspark.sql.functio...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @Madalian, Let’s address the issue you’re facing while converting your Parquet files from the Bronze layer to Delta live tables in the Silver layer. Column Addition: It appears that you’re trying to add a current timestamp column named SILVER_...

  • 0 kudos
nachog99
by New Contributor II
  • 863 Views
  • 1 replies
  • 0 kudos

Read VCF files using latest runtime version

Hello everyone! I was reading VCF files using the Glow library (Maven: io.projectglow:glow-spark3_2.12:1.2.1). The latest version of this library only works with Spark version 3.3.2, so if I need to use a newer runtime with a more recent Spark versi...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @nachog99, Working with VCF (Variant Call Format) files in Spark can be done using various approaches. Let’s explore some options: Glow Library: You’ve already been using the Glow library (Maven: io.projectglow:glow-spark3_2.12:1.2.1) to read ...

  • 0 kudos
JonLaRose
by New Contributor III
  • 807 Views
  • 1 replies
  • 0 kudos

Unity Catalog external table: Delta Lake table comment

Hi there, When creating an external table in Unity Catalog using an existing Delta Lake table with a comment on the table itself, the comment isn't imported to the `Comment` key's value in the Unity Catalog table. Could you explain why?

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @JonLaRose, When you create an external table in Unity Catalog by referencing an existing Delta Lake table, the behavior you’ve observed is indeed expected. Let’s delve into the reasons behind this: Unity Catalog and Delta Lake: Unity Catalog...

  • 0 kudos
LiLO
by New Contributor
  • 1182 Views
  • 1 replies
  • 0 kudos

Transform a file into a bytes format without using BlobServiceClient.

Hello everyone, I would like to know whether it is possible to transform, with PySpark, a flat file stored in a directory in Azure Blob storage into bytes format to be able to parse it, while using the connection already integrated into the cluster betwee...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @LiLO, You can achieve this using PySpark and the integrated connection between Databricks and Azure Blob storage. Let’s break down the steps: Read the Flat File from Azure Blob Storage: Use the integrated connection to read the flat file dire...
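
One possible way to get a file's raw bytes through the cluster's existing storage connection is Spark's built-in `binaryFile` data source, which returns the file content in a `content` column. The sketch below is an assumption about how the truncated steps might be implemented, not the reply's actual code; the helper name and the example path are hypothetical.

```python
def read_file_bytes(spark, path: str) -> bytes:
    """Read one file through Spark's binaryFile source and return its bytes."""
    row = (
        spark.read.format("binaryFile")
        .load(path)
        .select("content")
        .head()
    )
    if row is None:
        raise FileNotFoundError(f"no file matched {path}")
    # The content column holds the whole file as binary data.
    return bytes(row["content"])


# Assumed notebook usage with a hypothetical path:
# raw = read_file_bytes(spark, "abfss://container@account.dfs.core.windows.net/in/file.txt")
# text = raw.decode("utf-8")
```

Because `binaryFile` loads each file into a single row, this approach suits files small enough to fit in memory.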

  • 0 kudos
manoj_2355ca
by New Contributor III
  • 475 Views
  • 1 replies
  • 0 kudos

Security Analysis Tool - pat token extension or changes

Hi Team, I have set up the SAT for my workspace. Do I need to change the PAT token every time it expires, or is there another workaround? How do I change the PAT token for an already established SAT?

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @manoj_2355ca, To manage your Personal Access Tokens (PATs) effectively, here are some key points: Expiration of PATs: PATs have an expiration date, but the good news is that if you regenerate or remove a PAT, your deployment groups will conti...

  • 0 kudos
