Start your journey with Databricks by joining discussions on getting started guides, tutorials, and introductory topics. Connect with beginners and experts alike to kickstart your Databricks experience.
Hi, I have a PySpark dataframe which calls a PySpark UDF, which in turn calls a SageMaker endpoint. But when the dataframe has more rows, the endpoint starts failing, and it takes longer to process. Please suggest how to call a SageMaker endpoint from PySpark. Regards, S...
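A common fix for this pattern is to create the SageMaker client once per partition and stream rows through it, instead of invoking the endpoint row by row from a UDF. A minimal sketch, assuming a JSON endpoint named "my-endpoint", a region, and a dataframe df (all hypothetical, not details from the original post):

    import json
    import boto3

    def score_partition(rows):
        # One client per partition, reused for every row in that partition
        client = boto3.client("sagemaker-runtime", region_name="us-east-1")  # hypothetical region
        for row in rows:
            resp = client.invoke_endpoint(
                EndpointName="my-endpoint",          # hypothetical endpoint name
                ContentType="application/json",
                Body=json.dumps(row.asDict()),
            )
            yield json.loads(resp["Body"].read())

    preds = df.rdd.mapPartitions(score_partition)

Batching several rows into each invoke_endpoint call (if the model supports it) reduces the request count further and usually addresses both the failures under load and the latency.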
Sometimes we get this kind of error: "org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 12224.0 failed 4 times, most recent failure: Lost task 1.5 in stage 12224.0 (TID ) (12.xxx.x.xxx executor 1): com.datab...
I created a policy for users to use when they create their own Job clusters. When I'm editing the policy, I don't have the UI options for adding a library (I can only see the Definitions and Permissions tabs). I need to add, via JSON, the option that allows th...
@adrianhernandez are you an admin on the workspace? If not, you might be missing permissions; if you have policies enabled, an admin can allow you: https://docs.databricks.com/en/administration-guide/clusters/policies.html#libraries If your workspace is Unity cat...
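If the Libraries tab is missing, one workaround is to set the policy's libraries through the Cluster Policies API rather than the UI. A minimal sketch, assuming a workspace URL, token, policy ID, and package name (all hypothetical), and assuming the workspace supports the libraries field on policies:

    import requests

    host = "https://<workspace-url>"                     # hypothetical workspace URL
    headers = {"Authorization": "Bearer <token>"}        # hypothetical PAT token

    payload = {
        "policy_id": "ABC123",                           # hypothetical policy ID
        "name": "user-job-cluster-policy",
        "definition": "{}",                              # existing policy definition JSON goes here
        "libraries": [
            {"pypi": {"package": "requests==2.31.0"}},   # library to pre-install (placeholder)
        ],
    }
    resp = requests.post(f"{host}/api/2.0/policies/clusters/edit", headers=headers, json=payload)
    resp.raise_for_status()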
Hi, I have come across this piece of documentation: "Databricks does not support running Spark jobs from the web terminal. In addition, Databricks web terminal is not available in the following cluster types: Job clusters; Clusters launched with the DISAB...
Hello, I have created a sample Java UDF which masks a few characters of a string. However, I'm facing a couple of issues when uploading and using it. First, I could only import it, which for now is OK. But when I do the following: create function udf_mask as 'ba...
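For reference, the Hive-style registration that the truncated statement appears to be attempting looks like the following; the class name and JAR path are placeholders, since the original values are cut off:

    # Register a permanent Java UDF from a JAR (class name and path are placeholders)
    spark.sql("""
        CREATE FUNCTION udf_mask
        AS 'com.example.MaskUdf'
        USING JAR 'dbfs:/FileStore/jars/mask-udf.jar'
    """)
    spark.sql("SELECT udf_mask('1234-5678-9012')").show()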
Hi! I'm experiencing different behaviours between two DBX workspaces when trying to list file contents from an abfss: location. In workspace A, running len(dbutils.fs.ls('abfss://~~@~~~~.dfs.core.windows.net/~~/')) results in "Out[1]: 1551", while runni...
Join your peers at the Data + AI World Tour 2023! Explore the latest advancements, hear real-world case studies and discover best practices that deliver data and AI transformation. From the Databricks Lakehouse Platform to open source technologies in...
I am testing Databricks with non-AWS S3 object storage. I can access the non-AWS S3 bucket by setting these parameters:
sc._jsc.hadoopConfiguration().set("fs.s3a.access.key", "XXXXXXXXXXXXXXXXXXXX")
sc._jsc.hadoopConfiguration().set("fs.s3a.secret.key...
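For an S3-compatible store, the endpoint (and usually path-style access) also needs to be set. A minimal sketch; the endpoint URL, bucket, and path below are placeholders, not values from the original post:

    sc._jsc.hadoopConfiguration().set("fs.s3a.access.key", "XXXXXXXXXXXXXXXXXXXX")
    sc._jsc.hadoopConfiguration().set("fs.s3a.secret.key", "<secret-key>")
    sc._jsc.hadoopConfiguration().set("fs.s3a.endpoint", "https://s3.example-vendor.com")  # non-AWS endpoint (placeholder)
    sc._jsc.hadoopConfiguration().set("fs.s3a.path.style.access", "true")  # many S3-compatible stores require path-style URLs

    df = spark.read.parquet("s3a://my-bucket/some/path/")  # hypothetical bucket and path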
Hello, I am trying to use the getArgument() function in a spark.sql query. It works fine if I run the notebook via an interactive cluster, but it gives an error when executed via a job run on an instance pool. Query: OPTIMIZE <table> where date = replace(re...
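A workaround that behaves the same on interactive clusters and job runs is to read the parameter in Python and substitute it into the SQL string. A minimal sketch; the parameter name and table are placeholders:

    # dbutils.widgets.get reads both interactive widget values and job base parameters
    run_date = dbutils.widgets.get("date")                     # hypothetical parameter name
    spark.sql(f"OPTIMIZE my_table WHERE date = '{run_date}'")  # my_table is a placeholder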
Background info:
1. We have Unity Catalog enabled.
2. All of our jobs are run by a Service Principal that has all the necessary access it needs.
Issue: One of the jobs checks existing schemas against the ones it is supposed to create in that given run, and if ...
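A minimal sketch of that check-and-create pattern under Unity Catalog, with a hypothetical catalog name and schema list:

    expected = {"bronze", "silver", "gold"}  # hypothetical schema list
    existing = {row[0] for row in spark.sql("SHOW SCHEMAS IN my_catalog").collect()}
    for schema in sorted(expected - existing):
        spark.sql(f"CREATE SCHEMA IF NOT EXISTS my_catalog.{schema}")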
@AH that depends on the use case. If your implementation involves data lake, ML, or data engineering tasks, it's better to go with Databricks, as it has a good UI, good governance for your data lake via Unity Catalog, and good consumer tool su...
Hello experts, Could someone please explain what exactly is contained in the usage column of the system.billing.usage table? We ran specific queries in a cluster trying to calculate the cost, and we observe that the DBUs shown in the system table are ...
@elgeo both should be the same, unless we somehow fail to pick the proper plan's DBU price. The usage column has complete information, including the SKU name, DBU units, etc. If you use the Azure Databricks calculator and compare, you should see similar results.
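One way to cross-check the numbers is to join usage against the published list prices. A minimal sketch, assuming the system.billing.list_prices system table is enabled in the workspace:

    spark.sql("""
        SELECT u.sku_name,
               SUM(u.usage_quantity)                      AS dbus,
               SUM(u.usage_quantity * lp.pricing.default) AS estimated_cost
        FROM system.billing.usage u
        JOIN system.billing.list_prices lp
          ON u.sku_name = lp.sku_name
         AND u.usage_start_time >= lp.price_start_time
         AND (lp.price_end_time IS NULL OR u.usage_start_time < lp.price_end_time)
        GROUP BY u.sku_name
    """).show()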
We are currently starting to build certain data pipelines using Databricks. For this we use Jobs, and the steps in these Jobs are implemented in Python wheels. We are able to retrieve the Job ID, Job Run ID and Task Run ID in our Python wheels from the ...
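If the IDs are needed inside the wheel, one pattern is to pass them in explicitly through dynamic value references in the task's parameters (e.g. ["--job-id", "{{job.id}}", "--run-id", "{{job.run_id}}"]) and parse them in the entry point. A minimal sketch; the flag names are assumptions:

    import argparse

    def main():
        parser = argparse.ArgumentParser()
        parser.add_argument("--job-id")  # populated by {{job.id}} at run time
        parser.add_argument("--run-id")  # populated by {{job.run_id}} at run time
        args = parser.parse_args()
        print(f"job={args.job_id}, run={args.run_id}")

    if __name__ == "__main__":
        main()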