Community Platform Discussions
Connect with fellow community members to discuss general topics related to the Databricks platform, industry trends, and best practices. Share experiences, ask questions, and foster collaboration within the community.

Forum Posts

pernilak
by New Contributor III
  • 1522 Views
  • 0 replies
  • 0 kudos

Best practices for working with external locations where many files arrive constantly

I have an Azure Function that receives files (not volumes) and dumps them to cloud storage. Roughly one to five files arrive per second. I want to create a partitioned table in Databricks to work with. How should I do this? E.g.: register the cont...

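A common approach for this kind of high-arrival-rate landing zone (a sketch, not from the thread itself) is Auto Loader: register the container as an external location, then ingest incrementally into a partitioned Delta table. All paths, the table name, and the partition column below are placeholders.

```python
# Sketch: Auto Loader options for a path that receives 1-5 small files per second.
def autoloader_options(fmt: str, schema_path: str) -> dict:
    """Options for an incremental Auto Loader ingest of a high-arrival-rate path."""
    return {
        "cloudFiles.format": fmt,
        "cloudFiles.schemaLocation": schema_path,   # where schema tracking/evolution state lives
        "cloudFiles.maxFilesPerTrigger": "1000",    # batch many small files per micro-batch
    }

# Usage inside Databricks (not runnable outside a workspace; paths are placeholders):
# stream = (spark.readStream.format("cloudFiles")
#           .options(**autoloader_options("json", "abfss://.../_schemas"))
#           .load("abfss://landing@myaccount.dfs.core.windows.net/incoming"))
# (stream.writeStream
#  .partitionBy("ingest_date")
#  .option("checkpointLocation", "abfss://.../_checkpoints")
#  .trigger(availableNow=True)
#  .toTable("main.bronze.events"))
```

Partitioning the target Delta table (rather than trying to query the raw files directly) keeps the constant file arrivals decoupled from readers.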
sanjay
by Valued Contributor II
  • 6420 Views
  • 9 replies
  • 0 kudos

Performance issue while calling mlflow endpoint

Hi, I have a PySpark dataframe and a PySpark UDF which calls an MLflow model for each row, but its performance is too slow. Here is sample code: def myfunc(input_text): result = mlflowmodel.predict(input_text); return result. myfuncUDF = udf(myfunc, StringType(...

Latest Reply
Isabeente
New Contributor II
  • 0 kudos

So good

8 More Replies
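The usual fix for a slow row-at-a-time UDF (a sketch, not taken from the thread's replies) is to score in batches so the model is invoked once per chunk instead of once per row. The batching idea is shown as plain Python; the commented pandas UDF is the Databricks equivalent, with the model URI and column name as placeholders.

```python
# Sketch: call predict on whole batches instead of one row at a time.
def score_in_batches(rows, predict, batch_size=1000):
    """Apply a batch-capable predict function chunk by chunk."""
    out = []
    for i in range(0, len(rows), batch_size):
        out.extend(predict(rows[i:i + batch_size]))
    return out

# On Databricks the same idea is a pandas UDF (model URI is a placeholder):
# import pandas as pd, mlflow.pyfunc
# from pyspark.sql.functions import pandas_udf
#
# @pandas_udf("string")
# def predict_udf(texts: pd.Series) -> pd.Series:
#     model = mlflow.pyfunc.load_model("models:/my_model/Production")
#     return pd.Series(model.predict(texts.to_frame("input_text")))
#
# df = df.withColumn("prediction", predict_udf("input_text"))
```

Loading the model inside the pandas UDF amortizes the load over thousands of rows per batch, rather than paying a per-row endpoint or predict call.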
Ramakrishnan83
by New Contributor III
  • 2399 Views
  • 1 replies
  • 0 kudos

Resolved! Understanding Spark Architecture during Table Creation

Team, I am trying to understand how the parquet files and the JSON under the _delta_log folder store the data behind the scenes. Table creation: from delta.tables import *; DeltaTable.create(spark) \.tableName("employee") \.addColumn("id", "INT") \.addColumn("na...

Latest Reply
shan_chandra
Databricks Employee
  • 0 kudos

@Ramakrishnan83 - Kindly go through the blog post https://www.databricks.com/blog/2019/08/21/diving-into-delta-lake-unpacking-the-transaction-log.html, which discusses Delta's transaction log in detail.

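To make the blog post's point concrete: each commit to a Delta table writes a numbered JSON file under `_delta_log/` (e.g. `00000000000000000000.json`) containing one action per line, such as `protocol`, `metaData`, `add`, and `remove`. A small sketch that counts the action types in one commit file (the sample lines are abbreviated, illustrative JSON, not real table output):

```python
import json

def summarize_commit(lines):
    """Count action types in one _delta_log commit file (a list of JSON lines)."""
    counts = {}
    for line in lines:
        action = next(iter(json.loads(line)))  # each line has a single top-level action key
        counts[action] = counts.get(action, 0) + 1
    return counts

# Abbreviated example: a table creation commit writes protocol + metaData,
# and a first write adds a parquet part file.
sample = [
    '{"protocol":{"minReaderVersion":1,"minWriterVersion":2}}',
    '{"metaData":{"id":"1","schemaString":"...","partitionColumns":[]}}',
    '{"add":{"path":"part-00000.parquet","size":1024,"dataChange":true}}',
]
```

The parquet files hold the rows; the JSON log is what defines which of those files are part of the table at each version.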
pernilak
by New Contributor III
  • 2654 Views
  • 0 replies
  • 0 kudos

How to use external locations

Hi, I am struggling to truly understand how to work with external locations. As far as I can read, you have: 1) managed catalogs, 2) managed schemas, 3) managed tables/volumes, etc., 4) external locations that contain external tables and/or volum...

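For orientation, the usual Unity Catalog setup (a sketch; the location name, URL, credential, and table name below are all placeholders) is: a storage credential grants access, an external location binds that credential to a cloud path, and external tables then point at paths under it.

```python
# Sketch of the SQL involved; run via spark.sql() or a SQL editor in Databricks.
create_location = """
CREATE EXTERNAL LOCATION IF NOT EXISTS landing_zone
URL 'abfss://landing@myaccount.dfs.core.windows.net/'
WITH (STORAGE CREDENTIAL my_credential)
"""

create_external_table = """
CREATE TABLE main.bronze.events
LOCATION 'abfss://landing@myaccount.dfs.core.windows.net/events'
"""

# spark.sql(create_location)
# spark.sql(create_external_table)
```

Managed tables, by contrast, skip the explicit LOCATION clause and live in the catalog's or schema's managed storage.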
Faiçal_1979
by New Contributor
  • 5860 Views
  • 1 replies
  • 0 kudos

Databricks and streamlit and fast API combination

Hello friends! I have a project where I need Databricks to train and evaluate a model, then put it into production. I trained and evaluated the model in Databricks using MLflow, and everything is good. Now I have two more steps that I have zero clue how to do: usag...

Latest Reply
RafiKurlansik
Databricks Employee
  • 0 kudos

This repo has examples that you can use in your Databricks workspace for FastAPI and Streamlit.  I recommend only using these for development or lightweight use cases. 

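One common pattern alongside that repo (a sketch, not the repo's code): serve the MLflow model on a Databricks model serving endpoint and have the FastAPI or Streamlit app call it over HTTP. The payload builder below uses the serving API's `dataframe_split` format; the endpoint URL, token, and column name are placeholders.

```python
# Sketch: build the request body a Databricks model serving endpoint accepts.
def serving_payload(columns, rows):
    """Wrap columns/rows in the 'dataframe_split' format of the serving API."""
    return {"dataframe_split": {"columns": columns, "data": rows}}

# A FastAPI route (or Streamlit callback) might forward it with requests:
# import requests
# resp = requests.post(
#     "https://<workspace-url>/serving-endpoints/my-model/invocations",
#     headers={"Authorization": f"Bearer {token}"},
#     json=serving_payload(["input_text"], [["hello world"]]),
# )
# predictions = resp.json()
```

Keeping the model behind a serving endpoint means the app layer stays stateless and lightweight, which fits the "development or lightweight use cases" caveat above.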
arkiboys
by Contributor
  • 2943 Views
  • 0 replies
  • 0 kudos

can not set permission in table

In a Databricks database table I was able to set permissions for groups, but now I get this error when using a cluster: Error getting permissions. Summary: SparkException: Trying to perform permission action on Hive Metastore /CATALOG/`hive_metastore`/DATAB...

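This error typically depends on the cluster's access mode and whether table access control is enabled; that part is environment-specific. For reference, on a cluster that supports it, legacy Hive metastore permissions are set with SQL GRANT statements (a sketch; the schema, table, and group names are placeholders):

```python
# Sketch: build a GRANT statement for table access control / Unity Catalog.
def grant_sql(privilege, securable, principal):
    return f"GRANT {privilege} ON {securable} TO `{principal}`"

stmt = grant_sql("SELECT", "TABLE hive_metastore.mydb.mytable", "data-readers")
# spark.sql(stmt)
```

If the same statement works from a SQL warehouse but fails on the cluster, the cluster's access mode is usually the thing to check.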
johann_blake
by New Contributor
  • 2045 Views
  • 2 replies
  • 1 kudos

Databricks Repos

Hi everyone! I've set up an Azure cloud environment for the analytical team that I am part of, and everything is working wonderfully except Databricks Repos. Whenever we open Databricks, we find ourselves in the branch that the most recent person work...

Latest Reply
feiyun0112
Honored Contributor
  • 1 kudos

Use a separate Databricks Git folder, mapped to the remote Git repo, for each user who works in their own development branch. See: Run Git operations on Databricks Repos | Databricks on AWS

1 More Replies
jenshumrich
by Contributor
  • 2027 Views
  • 4 replies
  • 1 kudos

Not loading csv files with ".c000.csv" in the name

Yesterday I created a ton of CSV files via joined_df.write.partitionBy("PartitionColumn").mode("overwrite").csv(output_path, header=True). Today, when working with them, I realized that they were not loaded. Upon investigation I saw...

Latest Reply
jenshumrich
Contributor
  • 1 kudos

Then removing the "_commited_" file stops Spark from reading in the other files.

3 More Replies
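Background for this thread (a sketch, not from the replies): Spark's commit protocol on Databricks leaves marker files such as `_started_<id>`, `_committed_<id>`, and `_SUCCESS` next to the part files, and other readers can trip over them. When listing files yourself, filter to the data files; when reading with Spark, a glob filter does the same job.

```python
# Sketch: keep only the actual CSV part files from a Spark output directory.
def is_data_file(name: str) -> bool:
    return name.endswith(".csv") and not name.startswith("_")

files = ["part-00000-abc.c000.csv", "_committed_123", "_started_123", "_SUCCESS"]
data_files = [f for f in files if is_data_file(f)]

# Or let Spark do the filtering on read (output_path is a placeholder):
# spark.read.option("pathGlobFilter", "*.csv").csv(output_path, header=True)
```

Deleting the commit markers by hand, as the reply above notes, changes how Spark interprets the directory, so filtering on read is the safer option.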
databricksdev
by New Contributor II
  • 2135 Views
  • 1 replies
  • 1 kudos

Resolved! Is it possible to get Azure Databricks cluster metrics using REST API thru pyspark code

I am trying to get Azure Databricks cluster metrics such as memory utilization, CPU utilization, memory swap utilization, and free file system using the REST API, by writing PySpark code. It always shows CPU utilization and memory usage as N/A, whereas data...

Latest Reply
Ajay-Pandey
Esteemed Contributor III
  • 1 kudos

Hi @databricksdev, you can use system tables for Azure Databricks cluster metrics. Please refer to the following for the same: Compute system tables reference | Databricks on AWS

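To illustrate the system-tables route (a sketch; it assumes your workspace has access to the `system.compute` schema, and the exact column set may differ by release):

```python
# Sketch: average per-cluster CPU and memory utilization over the last day,
# from the compute system tables.
query = """
SELECT cluster_id,
       avg(cpu_user_percent + cpu_system_percent) AS avg_cpu_pct,
       avg(mem_used_percent) AS avg_mem_pct
FROM system.compute.node_timeline
WHERE start_time >= current_date() - INTERVAL 1 DAY
GROUP BY cluster_id
"""

# display(spark.sql(query))
```

Unlike the clusters REST API, which reports configuration and state, the `node_timeline` table records sampled utilization, which is what the question was after.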
chemajar
by New Contributor III
  • 3229 Views
  • 0 replies
  • 0 kudos

TASK_WRITE_FAILED when trying to write on the table, Databricks (Scala)

Hello, I have code on Databricks (Scala) that constructs a df and then writes it to a database table. It works fine for almost all of the tables, but there is one table with a problem. It says No module named 'delta.connect' - TASK_WRITE_FAILED. In...

arkiboys
by Contributor
  • 2742 Views
  • 3 replies
  • 2 kudos

Resolved! reading workflow items

Hello, in Databricks I have created workflows. In the cmd prompt I can get a list of the workflows, which look like the ones in the dev environment. How can I get the list of workflows in the test Databricks environment? This is the command I use: databricks jobs lis...

Latest Reply
feiyun0112
Honored Contributor
  • 2 kudos

You need to configure which host the Databricks CLI connects to: https://learn.microsoft.com/en-us/azure/databricks/archive/dev-tools/cli/#set-up-authentication-using-a-databricks-personal-access-token

2 More Replies
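Concretely, the CLI reads host/token pairs from named profiles in `~/.databrickscfg`, so one profile per environment does this (a sketch; the hosts and tokens below are placeholders):

```ini
[DEFAULT]
host  = https://adb-dev-1234567890123456.7.azuredatabricks.net
token = <dev-personal-access-token>

[test]
host  = https://adb-test-1234567890123456.7.azuredatabricks.net
token = <test-personal-access-token>
```

Then `databricks jobs list --profile test` targets the test workspace, while the bare command keeps using `[DEFAULT]`.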
M1NE
by New Contributor
  • 1980 Views
  • 2 replies
  • 0 kudos

Unable to create cluster in community edition

Hello, since yesterday it has been impossible to start a cluster in the Community Edition of Databricks. I have tried deleting it and creating a new one... Also, from what I see, it is an error that is happening to many people. Bootstrap Timeout: Node daemon pi...

Latest Reply
zach
New Contributor III
  • 0 kudos

I have the same issue in the Community Edition. Has there been any response?

1 More Replies
Blasti
by New Contributor II
  • 1224 Views
  • 1 replies
  • 0 kudos

Access AWS Resource In Another Account without STS

The EC2 instance profile I set up in the master AWS account can assume an S3/Dynamo access role in another AWS account. How do I set things up in Databricks/AWS so that I can use Python Boto3 to access S3 and Dynamo without using STS to assume the role?

Latest Reply
Blasti
New Contributor II
  • 0 kudos

Hey Kaniz, I am sorry about the confusion; I should have made my question clearer. I mean accessing the resources without using an IAM assume role or access keys, as if I were accessing resources within the same AWS account.

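For S3 this is possible without STS (a sketch, not a reply from the thread): attach a resource-based bucket policy in the other account granting the instance-profile role direct access, and give the role matching `s3:*` permissions on its own side. Boto3 then just uses the instance profile's default credentials. The ARNs and bucket name below are placeholders; note that DynamoDB has traditionally required an assumed role for cross-account access, so check whether your setup can use a resource-based policy there.

```python
import json

def s3_cross_account_policy(role_arn, bucket):
    """Bucket policy allowing a role from another account direct S3 access."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": role_arn},
            "Action": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
            "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"],
        }],
    }

policy_json = json.dumps(s3_cross_account_policy(
    "arn:aws:iam::111122223333:role/databricks-instance-profile", "my-bucket"))

# With the bucket policy in place, no STS call is needed:
# import boto3
# boto3.client("s3").list_objects_v2(Bucket="my-bucket")
```

Both halves are required: the bucket policy in the bucket's account and an IAM policy on the instance-profile role allowing the same actions.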
DjtheDE
by New Contributor
  • 1002 Views
  • 0 replies
  • 0 kudos

Queries Upgradation from HMS to UC

I am currently upgrading queries from HMS to Unity Catalog. I would like to understand a few best practices for updating the queries and for using the three-level namespace with the existing query structure. Please guide me!

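The mechanical part of the upgrade (a sketch; the default catalog name `main` is an assumption, substitute your own) is turning two-level `schema.table` references into three-level `catalog.schema.table` ones:

```python
# Sketch: qualify a two-level HMS table reference with a Unity Catalog catalog.
def to_uc_name(two_level: str, catalog: str = "main") -> str:
    """'schema.table' -> 'catalog.schema.table' (catalog name is a placeholder)."""
    schema, table = two_level.split(".")
    return f"{catalog}.{schema}.{table}"

# e.g. rewrite in queries:  FROM sales.orders  ->  FROM main.sales.orders
```

Alternatively, setting the session's default catalog (e.g. `USE CATALOG main`) lets existing two-level references resolve without rewriting every query.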

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group