Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

raghunathr
by New Contributor III
  • 17736 Views
  • 2 replies
  • 4 kudos

Resolved! Benefits of Databricks Views vs Tables

Do we have any explicit benefits with Databricks Views when the view is going to be a simple select of a table? Does it improve performance to use views over tables? What about giving access to views vs. tables?

Latest Reply
youssefmrini
Databricks Employee
  • 4 kudos

There can be several benefits to using Databricks views, even when the view is a simple select of a table: Improved query readability and maintainability: By encapsulating queries in views, you can simplify complex queries, making them more readable an...
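
A minimal sketch of the access-control angle raised in the question, assuming Unity Catalog; the table, view, and group names are placeholders, and the exact GRANT securable keyword may differ on legacy table ACLs:

```python
# Create a view that is a simple select of an underlying table (placeholder names).
spark.sql("""
    CREATE VIEW IF NOT EXISTS main.sales.orders_v AS
    SELECT * FROM main.sales.orders
""")

# Grant read access on the view only, so consumers never touch the base table directly.
spark.sql("GRANT SELECT ON VIEW main.sales.orders_v TO `analysts`")
```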

1 More Reply
ashish577
by New Contributor III
  • 3793 Views
  • 3 replies
  • 1 kudos

Any way to access unity catalog location through python/dbutils

I have a table created in Unity Catalog that was dropped; the files are not deleted due to the 30-day soft delete. Is there any way to copy the files to a different location? When I try to use dbutils.fs.cp I get a location overlap error with Unity Cata...

Latest Reply
youssefmrini
Databricks Employee
  • 1 kudos

You can use the dbutils.fs.mv command to move the files from the deleted table to a new location. Here's an example of how to do it (Python): # Define the paths source_path = "dbfs:/mnt/<unity-catalog-location>/<database-name>/<table-name>" target_path =...
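
For reference, a minimal sketch of the command the reply describes; both paths are placeholders, and whether the move succeeds depends on your permissions on the underlying storage location:

```python
# Placeholder paths; adjust to your storage layout.
source_path = "dbfs:/mnt/<unity-catalog-location>/<database-name>/<table-name>"
target_path = "dbfs:/mnt/<backup-location>/<table-name>"

# Recursively move the dropped table's files to the new location
# (use dbutils.fs.cp instead if you want to keep the originals in place).
dbutils.fs.mv(source_path, target_path, recurse=True)
```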

2 More Replies
sarnendude
by New Contributor II
  • 4792 Views
  • 3 replies
  • 2 kudos

Unable to enable Databricks Assistant

Databricks Assistant is currently in Public Preview. As per the documentation below, I have clicked the 'Account Console' link to log in and enable Databricks Assistant, but I am not getting the "Settings" option on the left side of the admin console. Once I log in using Azu...

Labels: Data Engineering, databricksassistant
Latest Reply
youssefmrini
Databricks Employee
  • 2 kudos

To enable Databricks Assistant, you need to navigate to the Admin Console in your Databricks workspace and follow these steps: Log in to your Databricks workspace using an account with workspace admin privileges. Click on the "Admin Console" icon in th...

2 More Replies
User16783853501
by Databricks Employee
  • 4566 Views
  • 2 replies
  • 2 kudos

Using Delta Time Travel, what is the scalability limit for using the feature, and at what point does time travel become infeasible?

Using Delta Time Travel, what is the scalability limit for using the feature, and at what point does time travel become infeasible?

Latest Reply
youssefmrini
Databricks Employee
  • 2 kudos

The scalability limit for using Delta Time Travel depends on several factors, including the size of your Delta tables, the frequency of changes to the tables, and the retention periods for the Delta versions. In general, Delta Time Travel can become i...
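
A minimal sketch of the retention settings that bound how far back time travel can go, plus a query against an older version; the table name, retention values, and version number are examples only:

```python
# Example retention properties: the transaction log and the data files removed by
# VACUUM must both still cover the version you want to read.
spark.sql("""
    ALTER TABLE main.sales.orders SET TBLPROPERTIES (
        'delta.logRetentionDuration' = 'interval 30 days',
        'delta.deletedFileRetentionDuration' = 'interval 7 days'
    )
""")

# Time travel to an earlier version; this fails once the version falls outside retention.
df_v5 = spark.sql("SELECT * FROM main.sales.orders VERSION AS OF 5")
```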

1 More Reply
ravi28
by New Contributor III
  • 27542 Views
  • 7 replies
  • 8 kudos

How to setup Job notifications using Microsoft Teams webhook ?

A couple of things I tried: 1. I created a webhook connector in Microsoft Teams and copied it into Notification destinations via the Admin page -> New destination -> from the dropdown I selected Microsoft Teams -> added the webhook URL and saved it. Outcome: I don't get the ...

Latest Reply
youssefmrini
Databricks Employee
  • 8 kudos

You can set up job notifications for Databricks jobs using Microsoft Teams webhooks by following these steps: Set up a Microsoft Teams webhook: Go to the channel where you want to receive notifications in Microsoft Teams. Click on the "..." icon next to...
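
As a rough sketch of where the destination ends up being referenced once it exists, a job's settings can point at it through webhook notifications; this assumes the Jobs API 2.1 webhook_notifications field, and the workspace URL, token, job ID, and destination ID are all placeholders:

```python
import requests

host = "https://<workspace-url>"   # placeholder
token = "<personal-access-token>"  # placeholder

payload = {
    "job_id": 123,  # placeholder job ID
    "new_settings": {
        "webhook_notifications": {
            # ID of the Microsoft Teams notification destination created in the admin settings.
            "on_failure": [{"id": "<notification-destination-id>"}],
            "on_success": [{"id": "<notification-destination-id>"}],
        }
    },
}

resp = requests.post(
    f"{host}/api/2.1/jobs/update",
    headers={"Authorization": f"Bearer {token}"},
    json=payload,
)
resp.raise_for_status()
```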

6 More Replies
bzh
by New Contributor
  • 6030 Views
  • 3 replies
  • 3 kudos

Large Data ingestion issue using auto loader

The goal of this project is to ingest 1000+ files (100 MB per file) from S3 into Databricks. Since these will be incremental changes, we are using Auto Loader for continued ingestion and transformation using a cluster (i3.xlarge). The current process i...

Latest Reply
youssefmrini
Databricks Employee
  • 3 kudos

There are several possible ways to improve the performance of your Spark streaming job for ingesting a large volume of S3 files. Here are a few suggestions: Tune the spark.sql.shuffle.partitions config parameter: By default, the number of shuffle part...
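
A minimal sketch of the shuffle-partition tuning mentioned above alongside a basic Auto Loader stream; the paths, table name, file format, and partition count are placeholders to size against your own data volume:

```python
# Example value; size this roughly to the cluster's total cores and the data volume per batch.
spark.conf.set("spark.sql.shuffle.partitions", "200")

df = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")                             # adjust to your files
    .option("cloudFiles.schemaLocation", "s3://<bucket>/_schemas/")  # placeholder path
    .load("s3://<bucket>/incoming/")                                 # placeholder path
)

(
    df.writeStream
    .option("checkpointLocation", "s3://<bucket>/_checkpoints/ingest/")  # placeholder path
    .trigger(availableNow=True)  # or processingTime="..." for a continuously running stream
    .toTable("main.raw.ingested_files")                                  # placeholder table
)
```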

2 More Replies
elifa
by New Contributor II
  • 4004 Views
  • 3 replies
  • 1 kudos

DLT cloudfiles trigger interval not working

I have the following streaming table definition using the cloudFiles format and the pipelines.trigger.interval setting to reduce file discovery costs, but the query is triggering every 12 seconds instead of every 5 minutes. Is there another configuration I am ...

Labels: Data Engineering, autloader, cloudFiles, dlt, trigger
Latest Reply
Tharun-Kumar
Databricks Employee
  • 1 kudos

@elifa Could you check for this message in the log file? "INFO EnzymePlanner: Planning for flow: s3_data" According to the config pipelines.trigger.interval, the planning should happen once every 5 minutes.
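
For reference, a minimal sketch of scoping pipelines.trigger.interval to a single flow via spark_conf on the table definition (it can also be set pipeline-wide in the pipeline's configuration); the table name, file format, and source path are placeholders:

```python
import dlt

@dlt.table(
    name="s3_data",
    spark_conf={"pipelines.trigger.interval": "5 minutes"},
)
def s3_data():
    # Incremental file discovery with Auto Loader (placeholder path and format).
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("s3://<bucket>/incoming/")
    )
```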

2 More Replies
pinaki1
by New Contributor III
  • 3672 Views
  • 3 replies
  • 2 kudos

databricks dashboard

How to download a chart directly from a Databricks dashboard (not a SQL dashboard)? The download option is not available there; a chart can only be downloaded from a notebook.

Latest Reply
Priyag1
Honored Contributor II
  • 2 kudos

What exactly is your requirement?

2 More Replies
Zoumana
by New Contributor II
  • 21955 Views
  • 5 replies
  • 6 kudos

Resolved! How to get probability score for each prediction from mlflow

I trained my model and was able to get the batch prediction from that model as specified below. But I want to also get the probability scores for each prediction. Do you have any idea? Thank you! logged_model = path_to_model # Load model as a PyFuncMod...

Latest Reply
OndrejHavlicek
New Contributor III
  • 6 kudos

Now you can log the model using this parameter: mlflow.sklearn.log_model(..., pyfunc_predict_fn="predict_proba") (where "..." stands for the usual params), which will apparently return probabilities for the first class when using the model for inference (e.g. when...
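
A minimal sketch of that workflow, assuming an MLflow version that supports pyfunc_predict_fn; the fitted model and the batch DataFrame are placeholders:

```python
import mlflow
import mlflow.sklearn

# Log the fitted sklearn classifier so the PyFunc flavor routes predict() to predict_proba().
with mlflow.start_run() as run:
    mlflow.sklearn.log_model(
        sk_model=model,                  # placeholder: a fitted sklearn classifier
        artifact_path="model",
        pyfunc_predict_fn="predict_proba",
    )

# Load it back as a PyFunc model; predict() now returns class probabilities.
loaded = mlflow.pyfunc.load_model(f"runs:/{run.info.run_id}/model")
probabilities = loaded.predict(X_batch)  # placeholder: the batch to score
```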

4 More Replies
Chaitanya_Raju
by Honored Contributor
  • 6479 Views
  • 7 replies
  • 0 kudos
Latest Reply
Vartika
Databricks Employee
  • 0 kudos

Hi @Ratna Chaitanya Raju Bandaru, just wanted to check in to see if you were able to resolve your issue. If yes, would you be happy to mark an answer as best? If not, please tell us so we can help you. Thanks!

6 More Replies
CrisCampos
by New Contributor II
  • 6092 Views
  • 1 reply
  • 1 kudos

How to load a "pickle/joblib" file on Databricks

Hi Community, I am trying to load a joblib file on Databricks, but it doesn't seem to be working. I'm getting an error message: "Incompatible format detected". Any idea how to load this type of file on Databricks? Thanks!

Latest Reply
tapash-db
Databricks Employee
  • 1 kudos

You can import the joblib/joblibspark package to load joblib files.
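
A minimal sketch, assuming the file lives on DBFS (the path is a placeholder); joblib reads from the local filesystem, so the /dbfs fuse path is used rather than the dbfs:/ URI:

```python
import joblib

# Load the serialized model from the /dbfs fuse mount (placeholder path).
model = joblib.load("/dbfs/FileStore/models/my_model.joblib")

predictions = model.predict(X)  # placeholder: a feature matrix matching the training schema
```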

FerArribas
by Contributor
  • 5870 Views
  • 3 replies
  • 0 kudos

Resolved! Azure Databricks - Difference between protecting the WEB UI with IP Access list or disabling public access?

Hi, I am thoroughly investigating the best security practices for accessing the Databricks WEB UI. I have doubts about the difference between protecting the WEB UI with (1) an IP Access list (https://learn.microsoft.com/en-us/azure/databricks/security/networ...

Latest Reply
Rik
New Contributor III
  • 0 kudos

"In short, would it be the same to configure only the IP of the private endpoint in the IP access list vs disable public access?"The access list doesn't apply to private IPs, only to public IP (internet). Relevant part from the docs:"If you use Priva...

2 More Replies
mbhakta
by New Contributor II
  • 1748 Views
  • 1 reply
  • 0 kudos

Dashboard - get value from table on user click

I'm building a dashboard via Python notebook and trying to allow the end user to click a value on a table, and use the selected value in another query / panel. This somewhat works using widget dropdowns for a user to select which value, but I'd reall...

Latest Reply
Henrymartin
New Contributor II
  • 0 kudos

@mbhakta wrote: I'm building a dashboard via Python notebook and trying to allow the end user to click a value on a table, and use the selected value in another query / panel. This somewhat works using widget dropdowns for a user to select which value...

dream
by Contributor
  • 10488 Views
  • 1 reply
  • 2 kudos

Comparing schemas of two dataframes

So I was comparing the schemas of two different dataframes using this code: >>> df1.schema == df2.schema Out: False But the thing is, both schemas are completely equal. When digging deeper I realized that some of the StructFields() that should have bee...

Latest Reply
Ajay-Pandey
Databricks MVP
  • 2 kudos

Hi @dream, in this case you can go with dataframe.dtypes for comparing the schemas or datatypes of two dataframes. Metadata stores information about column properties.
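
A minimal sketch of the two comparisons; schema equality is sensitive to per-field metadata (and nullability), while dtypes only compares column names and data types:

```python
# Strict comparison: includes nullability and field metadata, so logically
# identical schemas can still compare unequal.
print(df1.schema == df2.schema)

# Looser comparison: just (column name, data type string) pairs.
print(df1.dtypes == df2.dtypes)

# To pinpoint the difference, inspect the fields side by side.
for f1, f2 in zip(df1.schema.fields, df2.schema.fields):
    if f1 != f2:
        print(f1.name, f1.metadata, f1.nullable, "vs", f2.metadata, f2.nullable)
```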

