Hi there, I need some help creating a new cluster policy that uses a single spot-instance server to complete a job. I want to set this up as job compute to reduce costs and also use one spot instance. The jobs I need to ETL are very short and c...
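A minimal sketch of what such a policy could look like, assuming AWS and the Cluster Policies API; the workspace URL, token, and policy name are placeholders, and the single-node settings follow the documented fixed-value pattern:

    import json
    import requests

    # Hedged sketch: pin the policy to single-node job compute on spot instances.
    # "fixed" makes each setting mandatory for clusters created under this policy.
    policy_definition = {
        "cluster_type": {"type": "fixed", "value": "job"},
        "num_workers": {"type": "fixed", "value": 0},
        "spark_conf.spark.master": {"type": "fixed", "value": "local[*]"},
        "spark_conf.spark.databricks.cluster.profile": {"type": "fixed", "value": "singleNode"},
        "custom_tags.ResourceClass": {"type": "fixed", "value": "SingleNode"},
        "aws_attributes.availability": {"type": "fixed", "value": "SPOT"},
    }

    resp = requests.post(
        "https://<workspace-url>/api/2.0/policies/clusters/create",  # placeholder URL
        headers={"Authorization": "Bearer <token>"},                  # placeholder token
        json={"name": "single-node-spot-jobs", "definition": json.dumps(policy_definition)},
    )
    resp.raise_for_status()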
Hi @Avkash Kana, just a friendly follow-up. Did any of the responses help you resolve your question? If one did, please mark it as best. Otherwise, please let us know if you still need help.
I am loading 1 billion rows from a Spark DataFrame into a target table. On a 7.3 cluster this takes 3 hours to complete, but after migrating to a 10.4 cluster it takes 8 hours. How can I reduce the duration?
Hi @Mohammed sadamusean, could you provide more details on what you are doing? What types of transformations/actions are you running? What are your source and sink? Batch or streaming? All that information will help.
I am trying to apply RLS to the solution, but Power BI only connects to Databricks (DB) using a token, which can't be used in DB groups. Is there no other way to apply row-level security using Power BI?
Hi @Daniel Gomes, just a friendly follow-up. Did any of the responses help you resolve your question? If one did, please mark it as best. Otherwise, please let us know if you still need help.
It gives the error [RequestId=5e57b66f-b69f-4e8b-8706-3fe5baeb77a0 ErrorClass=METASTORE_DOES_NOT_EXIST] No metastore assigned for the current workspace, using the following code:

    spark.conf.set( "fs.azure.account.key.mystorageaccount.dfs.core.windows.net", ...
Hi @Farooq ur rehman, just a friendly follow-up. Did any of the responses help you resolve your question? If one did, please mark it as best. Otherwise, please let us know if you still need help.
What would be the best plan for an independent course creator?

Hi folks! I want to use Databricks Community Edition as the platform to teach online courses. As you may know, for Community Edition you need to create a new cluster when the old one terminat...
I have a requirement where I need to apply an inverse DQ rule on a table to track the invalid data, for which I can use the following approach:

    import dlt

    rules = {}
    quarantine_rules = {}
    rules["valid_website"] = "(Website IS NOT NULL)"
    rules["valid_locatio...
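For reference, a minimal sketch of this inverse-rule quarantine pattern, assuming a hypothetical upstream table raw_data and negating the conjunction of all rules so the quarantine table keeps only failing rows:

    import dlt

    rules = {
        "valid_website": "(Website IS NOT NULL)",
        "valid_location": "(Location IS NOT NULL)",
    }
    # A row lands in quarantine if it violates at least one expectation.
    quarantine_rules = {"invalid_record": "NOT({})".format(" AND ".join(rules.values()))}

    @dlt.table
    @dlt.expect_all_or_drop(rules)
    def clean_data():
        return dlt.read("raw_data")  # hypothetical source table

    @dlt.table
    @dlt.expect_all_or_drop(quarantine_rules)
    def quarantine_data():
        return dlt.read("raw_data")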
You can get additional info from the DLT event log, which is stored in Delta, so you can load it as a table: https://docs.databricks.com/workflows/delta-live-tables/delta-live-tables-event-log.html#data-quality
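A short sketch of reading the event log, assuming <pipeline-storage> stands in for your pipeline's configured storage location:

    # Data-quality results are recorded on flow_progress events.
    event_log = spark.read.format("delta").load("<pipeline-storage>/system/events")
    event_log.createOrReplaceTempView("event_log_raw")

    display(spark.sql("""
      SELECT details:flow_progress.data_quality.expectations
      FROM event_log_raw
      WHERE event_type = 'flow_progress'
    """))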
Since Databricks Runtime 12.1, "WHEN NOT MATCHED BY SOURCE" has been part of the MERGE syntax. For example, using that option we can quickly delete all target rows that don't match any source row.
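A minimal sketch, with hypothetical table names target and source joined on id:

    spark.sql("""
      MERGE INTO target t
      USING source s
      ON t.id = s.id
      WHEN MATCHED THEN UPDATE SET *
      WHEN NOT MATCHED THEN INSERT *
      WHEN NOT MATCHED BY SOURCE THEN DELETE
    """)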
I'm trying to build a gold-level streaming live table based on two streaming silver live tables with a left join. This attempt fails with the following error: "Append mode error: Stream-stream LeftOuter join between two streaming DataFrame/Datasets is not suppo...
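In plain Structured Streaming, stream-stream outer joins generally require watermarks on both inputs plus a time-bound join condition; a sketch with hypothetical table and column names (whether this maps directly onto DLT live tables is a separate question):

    from pyspark.sql import functions as F

    left = (spark.readStream.table("silver_left")
            .withWatermark("event_time", "1 hour").alias("l"))
    right = (spark.readStream.table("silver_right")
             .withWatermark("event_time", "1 hour").alias("r"))

    # The join condition must bound how far apart matching events can be,
    # so state for unmatched left rows can eventually be emitted and dropped.
    joined = left.join(
        right,
        F.expr("l.id = r.id AND "
               "r.event_time BETWEEN l.event_time AND l.event_time + INTERVAL 1 HOUR"),
        "leftOuter",
    )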
Hello, I am currently attempting to set up a Git repo within Azure DevOps to use in my Azure Databricks workspace environment for various notebooks. I went through the process of creating a Personal Access Token (PAT) in DevOps, and have inputted the t...
Hi @Christopher Ackerman, this error message usually occurs when there is an issue with authentication between Azure Databricks and Azure DevOps. One possible reason for this error is that the token was not granted the necessary permissions to acces...
Hi @peter dhaeseleer, as of February 2023, I am unaware of any official announcement from Databricks regarding the availability of DLT Unity Catalog integration in preview.
We have a Databricks job running with a main class and a JAR file. Our JAR codebase is in Scala. When the job starts running, we need to log the Job ID and Run ID into a database for future reference. How can we achieve this?
Here is a blog with code and examples on how to achieve this: https://medium.com/@canadiandataguy/how-to-get-the-job-id-and-run-id-for-a-databricks-job-b0da484e66f5
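One common approach is a sketch under the assumption that your job definition can pass the built-in {{job_id}} and {{run_id}} parameter variables to the task, so the IDs arrive in the entry point as ordinary arguments; shown here in Python, but reading args in a Scala main works the same way:

    # Task definition fragment (Jobs API, names hypothetical):
    #   "spark_jar_task": {
    #       "main_class_name": "com.example.Main",
    #       "parameters": ["{{job_id}}", "{{run_id}}"]
    #   }
    import sys

    job_id, run_id = sys.argv[1], sys.argv[2]
    # Replace this print with an INSERT into your tracking database.
    print(f"job_id={job_id} run_id={run_id}")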
Hello, I need to add custom metadata to an Avro file; the Avro file contains data. We have tried to use "option" within the write function, but it is ignored without generating any error:

    df.write.format("avro").option("avro.codec", "snappy").option...
Hi @zakaria belamri, you can add custom metadata to an Avro file in PySpark by creating an Avro schema with the custom metadata fields and passing it to the DataFrameWriter as an option. Here's an example code snippet that demonstrates how to do thi...
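Since the snippet above is cut off, here is a minimal sketch of that idea, assuming spark-avro's avroSchema option and a hypothetical myCustomMetadata attribute embedded in the schema JSON (how downstream readers surface such attributes depends on their Avro tooling):

    import json

    avro_schema = json.dumps({
        "type": "record",
        "name": "Event",
        "myCustomMetadata": "pipeline-v2",  # custom schema-level attribute (assumption)
        "fields": [
            {"name": "id", "type": "long"},
            {"name": "value", "type": ["null", "string"], "default": None},
        ],
    })

    # df is the DataFrame you are writing (from the question above).
    (df.write.format("avro")
        .option("avro.codec", "snappy")
        .option("avroSchema", avro_schema)  # write with the annotated schema
        .save("/tmp/events_avro"))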
We have just started working with Databricks in one of my university modules, and the lecturers gave us a set of commands to practice saving data in the FileStore. One of the commands was the following:

    dbutils.fs.cp("/databricks-datasets/weathh...
Hi, I am not able to list the metastores in the Databricks CLI using the command below:

    databricks unity-catalog metastores list

It returns {}. But when I try databricks unity-catalog metastores get-summary, I am able to get the metastore info. Can anyone help me ...
Hi @Kaniz Fatma, Unity Catalog is enabled in my workspace, and I have been assigned metastore admin and account admin as well.

    databricks unity-catalog metastores list --debug
    HTTP debugging enabled
    send: b'GET /api/2.1/unity-catalog/metastores HTTP/1.1\r...