Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Hi all, I started using Azure Spot VMs by switching on the spot option when creating a cluster. However, in the Azure billing dashboard, after some months of using spot instances, I only see the OnDemand PurchaseType. Can anyone guess what could be happ...
staticDataFrame = spark.read.format("csv") \
    .option("header", "true") \
    .option("inferSchema", "true") \
    .load("/FileStore/tables/Consumption_2019/*.csv")
Given the above, I need an option to skip, say, the first 4 lines of each CSV file. How do I do that?
The option... .option("skipRows", <number of rows to skip>) ...works for me as well. However, I am surprised that the official Spark doc does not list it as a CSV Data Source Option: https://spark.apache.org/docs/latest/sql-data-sources-csv.html#data...
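Since "skipRows" appears to be a Databricks Runtime extension of the CSV reader rather than a documented OSS Spark option, a portable fallback is to drop the leading lines yourself before parsing. A minimal sketch, assuming a Databricks `spark` session for the commented read (the helper itself is plain Python):

```python
# Portable way to drop the first N lines of a CSV file's contents before
# parsing, for engines where the Databricks-only "skipRows" option is absent.

def skip_leading_rows(lines, n):
    """Return the list of lines with the first n removed."""
    return lines[n:]

# On Databricks Runtime the built-in option works directly:
# df = (spark.read.format("csv")
#       .option("header", "true")
#       .option("inferSchema", "true")
#       .option("skipRows", 4)   # Databricks Runtime extension, not in OSS docs
#       .load("/FileStore/tables/Consumption_2019/*.csv"))
```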
Hi, we are exploring the use of the Databricks Statement Execution API for sharing data through an API with different consumer applications. However, we have a security requirement to configure TLS mutual authentication to limit the consumer application t...
Hello, I am trying to create a permanent UDF from a Python file with dependencies that are not part of the standard Python library. How do I make use of CREATE FUNCTION (External) [1] to create a permanent function in Databricks, using a Python file th...
I have an external location set up, "auth_kafka", which is mapped to an abfss URL: abfss://{container}@{account}.dfs.core.windows.net/auth/kafka and, critically, is marked as read-only. Using dbutils.fs I can successfully read the files (i.e. the ls and hea...
I have a single-user cluster and have created a workflow that reads an Excel file from an Azure storage account. For reading the Excel file I am using the com.crealytics:spark-excel_2.13:3.4.1_0.19.0 library on a single-user all-purpose cluster. I have alread...
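One thing worth checking in the coordinate above: Databricks Runtime clusters are built against Scala 2.12, so a `_2.13` artifact such as `spark-excel_2.13` can fail at runtime with linkage errors; the `_2.12` build of the same version is usually the one to install. A small sketch of that compatibility check (the commented read call is hypothetical and cluster-only):

```python
# Check that a Maven coordinate's Scala suffix matches the cluster's Scala
# version; Databricks Runtime currently ships Scala 2.12.

def scala_compatible(coordinate: str, cluster_scala: str = "2.12") -> bool:
    """True if the artifact's _<scala> suffix matches the cluster's version."""
    return f"_{cluster_scala}:" in coordinate

# Hypothetical Excel read once the matching jar is installed:
# df = (spark.read.format("com.crealytics.spark.excel")
#       .option("header", "true")
#       .load("abfss://container@account.dfs.core.windows.net/path/file.xlsx"))
```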
Hi @Retired_mod, can you elaborate on a few more things:
1. When spark-shell installs a Maven package, what is the default location where it downloads the jar file?
2. As far as I know, the default location for jars is "/databricks/jars/", from where Spark pic...
The only permission I can see is SELECT, and that gives access to the data, which is very much unwanted. I only want users to see the metadata, such as table/view/column names, descriptions/comments, location and so on, but not to see any data.
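If the workspace is on Unity Catalog, the BROWSE privilege may be the closest fit: it is intended to expose an object's metadata without granting access to the data itself. A hedged sketch under that assumption (the catalog and group names are placeholders; check your runtime's privilege list before relying on this):

```python
# Build a GRANT statement for Unity Catalog's metadata-only BROWSE privilege
# (placeholder names; run the result with spark.sql on a UC-enabled cluster).

def grant_browse(principal: str, catalog: str) -> str:
    """Return a GRANT BROWSE statement for the given principal and catalog."""
    return f"GRANT BROWSE ON CATALOG {catalog} TO `{principal}`"

# spark.sql(grant_browse("metadata_readers", "main"))
```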
@Uma Maheswara Rao Desula, @Geeta Sai Boddu and @S S, thank you for the responses. I have gotten an answer from Databricks: it seems this is not currently possible, and it is being investigated as a capability.
I am facing an error when trying to read data from any MongoDB collection using MongoDB Spark Connector v10.x on Databricks Runtime 13.x. The error below appears to originate at line #113 of the MongoDB Spark Connector library (v10.2.0): java.lang.NoSuchMethod...
I had the same issue, thanks for the info! Apparently it's also possible to fix it by removing all the actual notifications in the interface (the bugged one is not displayed, but if you remove everything, for some reason that removes the bugged one too)....
Hi all, can someone please help me with the Python code to connect an Azure SQL Database to Databricks using a service principal instead of directly passing a username and password? I'm using the above code but getting the above error (refer to Screenshot 2). Please hel...
First, you need to create a service principal in Azure and grant it the necessary permissions to access your Azure SQL Database. You can do this using the Azure CLI or the Azure Portal. Ensure that your Databricks cluster ha...
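A sketch of the token-based JDBC connection under those assumptions: the azure-identity package supplies the credential, and the Spark JDBC reader accepts the token via the accessToken option. All names below (server, database, table, tenant/client IDs) are placeholders, not values from the thread:

```python
# Connect Spark to Azure SQL with a service-principal access token instead of
# a username/password. The URL builder is plain Python; the credential and
# read calls (commented) require azure-identity and a Databricks cluster.

def jdbc_url(server: str, database: str) -> str:
    """Build a JDBC URL for an Azure SQL database (encrypted connection)."""
    return (
        f"jdbc:sqlserver://{server}.database.windows.net:1433;"
        f"database={database};encrypt=true;trustServerCertificate=false"
    )

# from azure.identity import ClientSecretCredential
# cred = ClientSecretCredential(tenant_id, client_id, client_secret)
# token = cred.get_token("https://database.windows.net/.default").token
# df = (spark.read.format("jdbc")
#       .option("url", jdbc_url("myserver", "mydb"))
#       .option("dbtable", "dbo.mytable")
#       .option("accessToken", token)
#       .load())
```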
Hi DB Support, I took the Databricks Certified Associate Engineering exam today but missed passing by just over one percent: I got 68.88% and the pass mark is 70%. I am planning to reattempt this exam in the coming days and was hoping you could help. Could you kindly give me anot...
I recommend reaching out to Databricks directly or checking their official certification website for information on retake policies, voucher availability, and any discounts or promotions they may offer for reattempts.
In an R notebook I am running: install.packages('fpp3', dependencies = TRUE) and getting back errors: ERROR: dependency ‘vctrs’ is not available for package ‘slider’. I then install 'vctrs' and it again generates a similar error that some package is...
Hi, how do I find the AWS-side costs associated with my Databricks SQL warehouse usage? I tried using tags, but they didn't show up in the AWS Cost Explorer. My use case is that I am running some dbt-Databricks jobs and I want to find the cost for certain jobs....
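If system tables are enabled in the account, one way to attribute cost is to query system.billing.usage and filter on the custom tags attached to the warehouse or job clusters. A sketch under that assumption (the tag key/value are placeholders), with the query built as a string so the shape is visible:

```python
# Build a query against the system.billing.usage system table that sums DBU
# usage for resources carrying a given custom tag (placeholder tag names).

def usage_by_tag(tag_key: str, tag_value: str) -> str:
    """Return SQL summing DBUs per day and SKU for one custom tag."""
    return (
        "SELECT usage_date, sku_name, SUM(usage_quantity) AS dbus "
        "FROM system.billing.usage "
        f"WHERE custom_tags['{tag_key}'] = '{tag_value}' "
        "GROUP BY usage_date, sku_name"
    )

# spark.sql(usage_by_tag("job", "dbt_nightly")).show()
```

DBU quantities still need to be multiplied by your contracted rate (and combined with AWS EC2/EBS costs) to get a dollar figure.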
Hello, I tried running a Python UDF in a Delta Live Tables workflow in Advanced mode, but it did not run and gave the "Python UDF is not supported in your environment" error. Can I get a clear picture of whether Python external UDFs are supported or not?
Hi @Retired_mod, I ran this SQL query in my catalog (I'm using Unity Catalog):

CREATE OR REPLACE FUNCTION cat_projint_dev.silver.GetEditor(prompt STRING)
RETURNS STRING
LANGUAGE PYTHON
AS $$
print(prompt)
$$

Then I ran a Delta Live Table workflow using Uni...
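Separately from the environment error, note that the body above calls print(prompt), which returns None (NULL in SQL); a Python UDF declared RETURNS STRING needs an explicit return. A local sketch of the corrected body logic, with the fixed DDL in comments:

```python
# The posted UDF body prints rather than returns, so it would yield NULL.
# Equivalent local function showing the corrected body:

def get_editor(prompt: str) -> str:
    """Return the prompt unchanged, as the SQL UDF body should."""
    return prompt

# Corrected DDL (run via spark.sql on a UC-enabled cluster):
# CREATE OR REPLACE FUNCTION cat_projint_dev.silver.GetEditor(prompt STRING)
# RETURNS STRING
# LANGUAGE PYTHON
# AS $$
#   return prompt
# $$
```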
Hi! We have recently upgraded our cluster from Databricks Runtime 10.4 LTS, which includes Apache Spark 3.2.1, to Databricks Runtime 13.3 LTS, which includes a newer Apache Spark, and noticed that one of our jobs' runtime has dramat...