Data Engineering

Forum Posts

Kash
by Contributor III
  • 1227 Views
  • 4 replies
  • 0 kudos

Creating a spot only single-node job compute cluster policy

Hi there, I need some help creating a new cluster policy that uses a single spot-instance server to complete a job. I want to set this up as job compute to reduce costs and also use 1 spot instance. The jobs I need to ETL are very short and c...
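
A minimal sketch of what such a policy definition might look like, written as a Python dict that can be dumped to JSON and pasted into the policy editor. It assumes an AWS workspace (hence the `aws_attributes.*` keys); the helper name `single_node_spot_job_policy` and the node type `m5.large` are hypothetical, and the attribute paths should be checked against the cluster policy reference for your cloud before use.

```python
import json


def single_node_spot_job_policy(node_type_id: str) -> dict:
    """Build a policy definition that pins job clusters to one spot node.

    Every attribute is "fixed", so users of the policy cannot override it.
    """
    return {
        # Only allow job compute, not all-purpose clusters.
        "cluster_type": {"type": "fixed", "value": "job"},
        # Single-node mode: zero workers, the driver runs Spark locally.
        "num_workers": {"type": "fixed", "value": 0},
        "spark_conf.spark.databricks.cluster.profile": {
            "type": "fixed",
            "value": "singleNode",
        },
        "spark_conf.spark.master": {"type": "fixed", "value": "local[*]"},
        "custom_tags.ResourceClass": {"type": "fixed", "value": "SingleNode"},
        "node_type_id": {"type": "fixed", "value": node_type_id},
        # Assumption: AWS. Request spot capacity, and set first_on_demand
        # to 0 so that the driver is not placed on an on-demand instance.
        "aws_attributes.availability": {"type": "fixed", "value": "SPOT"},
        "aws_attributes.first_on_demand": {"type": "fixed", "value": 0},
    }


# "m5.large" is only an illustrative node type.
policy = single_node_spot_job_policy("m5.large")
print(json.dumps(policy, indent=2))
```

With `SPOT` availability and no fallback, a job may fail to start if no spot capacity is available, which is usually acceptable for short, retryable ETL runs.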

Latest Reply
jose_gonzalez
Moderator
  • 0 kudos

Hi @Avkash Kana, just a friendly follow-up. Did any of the responses help you resolve your question? If so, please mark it as best. Otherwise, please let us know if you still need help.

  • 0 kudos
3 More Replies
databicky
by Contributor II
  • 1063 Views
  • 4 replies
  • 0 kudos

How to optimize the runtime on a 10.4 cluster

I am loading 1 billion rows from a Spark DataFrame into a target table. On the 7.3 cluster it takes 3 hours to complete, but after migrating to a 10.4 cluster it takes 8 hours. How can I reduce the run time?

Latest Reply
jose_gonzalez
Moderator
  • 0 kudos

Hi @Mohammed sadamusean, could you provide more details on what you are doing? What type of transformations/actions are you running? What are your source and sink? Is it batch or streaming? All of that information will help.

  • 0 kudos
3 More Replies
RafikiT97
by New Contributor
  • 2107 Views
  • 3 replies
  • 0 kudos

Query Databricks from Power BI with Row Level Security

I am trying to apply RLS to the solution, but Power BI only connects to Databricks (DB) using a token, which can't be used in DB groups. Is there no other way to apply row-level security using Power BI?

Latest Reply
jose_gonzalez
Moderator
  • 0 kudos

Hi @Daniel Gomes, just a friendly follow-up. Did any of the responses help you resolve your question? If so, please mark it as best. Otherwise, please let us know if you still need help.

  • 0 kudos
2 More Replies
farooqurrehman
by New Contributor
  • 1152 Views
  • 3 replies
  • 2 kudos

Unable to connect/read files from ADLS Gen2 using account key

It gives the error [RequestId=5e57b66f-b69f-4e8b-8706-3fe5baeb77a0 ErrorClass=METASTORE_DOES_NOT_EXIST] No metastore assigned for the current workspace. when using the following code: spark.conf.set("fs.azure.account.key.mystorageaccount.dfs.core.windows.net", ...

Latest Reply
jose_gonzalez
Moderator
  • 2 kudos

Hi @Farooq ur rehman, just a friendly follow-up. Did any of the responses help you resolve your question? If so, please mark it as best. Otherwise, please let us know if you still need help.

  • 2 kudos
2 More Replies
Lizhi_Dong
by New Contributor II
  • 759 Views
  • 3 replies
  • 0 kudos

Tables disappear when I re-start a new cluster on Community Edition

What would be the best plan for an independent course creator? Hi folks! I want to use Databricks Community Edition as the platform to teach online courses. As you may know, for Community Edition, you need to create a new cluster when the old one terminat...

Latest Reply
jose_gonzalez
Moderator
  • 0 kudos

Hi @Lizhi Dong, this might be a limitation of Community Edition. When your cluster is terminated, all your tables will be removed.

  • 0 kudos
2 More Replies
SRK
by Contributor III
  • 1460 Views
  • 5 replies
  • 0 kudos

Delta Live Tables data quality rules application.

I have a requirement where I need to apply an inverse DQ rule on a table to track the invalid data. For this I can use the following approach:
import dlt
rules = {}
quarantine_rules = {}
rules["valid_website"] = "(Website IS NOT NULL)"
rules["valid_locatio...
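
For illustration, the inverse rule can be derived from the valid-record rules with plain string handling. This is only a sketch: `valid_website` comes from the post, while `valid_id` and the helper `inverse_rule` are hypothetical stand-ins, since the rest of the original snippet is cut off.

```python
# Valid-record expectations. "valid_website" is taken from the post;
# "valid_id" is a hypothetical second rule added for illustration.
rules = {
    "valid_website": "(Website IS NOT NULL)",
    "valid_id": "(Id IS NOT NULL)",
}


def inverse_rule(valid_rules: dict) -> str:
    """Return one SQL expression that is true when any valid rule fails."""
    return "NOT({})".format(" AND ".join(valid_rules.values()))


# A single quarantine expectation that matches every invalid row.
quarantine_rules = {"invalid_record": inverse_rule(rules)}

print(quarantine_rules["invalid_record"])
# NOT((Website IS NOT NULL) AND (Id IS NOT NULL))

# In a DLT pipeline, these dicts would typically be passed to decorators
# such as @dlt.expect_all(rules) or @dlt.expect_all_or_drop(quarantine_rules).
```

Keeping the valid rules in one dict means the quarantine condition stays in sync automatically whenever a rule is added or changed.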

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 0 kudos

You can get additional info from the DLT event log, which is stored in Delta format, so you can load it as a table: https://docs.databricks.com/workflows/delta-live-tables/delta-live-tables-event-log.html#data-quality

  • 0 kudos
4 More Replies
Hubert-Dudek
by Esteemed Contributor III
  • 419 Views
  • 1 replies
  • 10 kudos

Since Databricks Runtime 12.1, "WHEN NOT MATCHED BY SOURCE" has been available in MERGE syntax. For example, using that option, we can quickly delete ...

Since Databricks Runtime 12.1, "WHEN NOT MATCHED BY SOURCE" has been available in MERGE syntax. For example, using that option, we can quickly delete all target rows that don't match any source row.
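
A small sketch of what such a statement could look like, built as a string in Python so it can be passed to `spark.sql` in a notebook. The table names `customers` and `customers_updates`, the key `customer_id`, and the helper `build_sync_merge` are hypothetical.

```python
def build_sync_merge(target: str, source: str, key: str) -> str:
    """Build a MERGE statement that makes `target` mirror `source` on `key`."""
    return (
        f"MERGE INTO {target} AS t\n"
        f"USING {source} AS s\n"
        f"ON t.{key} = s.{key}\n"
        # Rows present on both sides are updated from the source.
        "WHEN MATCHED THEN UPDATE SET *\n"
        # Rows only in the source are inserted.
        "WHEN NOT MATCHED THEN INSERT *\n"
        # Rows only in the target are deleted (the clause from this post).
        "WHEN NOT MATCHED BY SOURCE THEN DELETE"
    )


# Hypothetical table and key names.
statement = build_sync_merge("customers", "customers_updates", "customer_id")
print(statement)

# On a cluster running Databricks Runtime 12.1 or above:
# spark.sql(statement)
```

Without the last clause, target rows that no longer exist in the source would have to be removed with a separate DELETE and an anti-join.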

Latest Reply
jose_gonzalez
Moderator
  • 10 kudos

Thank you for sharing @Hubert Dudek​ 

  • 10 kudos
Daba
by New Contributor III
  • 3794 Views
  • 5 replies
  • 5 kudos

DLT streaming table and LEFT JOIN

I'm trying to build a gold-level streaming live table based on two streaming silver live tables with a left join. This attempt fails with the following error: "Append mode error: Stream-stream LeftOuter join between two streaming DataFrame/Datasets is not suppo...

Latest Reply
Daba
New Contributor III
  • 5 kudos

Thanks Fatma, I do understand the need for watermarks, but I'm just wondering whether this is supported by SQL syntax?

  • 5 kudos
4 More Replies
ackerman_chris
by New Contributor III
  • 1704 Views
  • 1 replies
  • 1 kudos

Resolved! Azure DevOps Git sync failed in Azure Databricks

Hello, I am currently attempting to set up a Git repo within Azure DevOps to use on my Azure Databricks workspace for various notebooks. I went through the process of creating a Personal Access Token (PAT) on DevOps, and have entered the t...

Latest Reply
Kaniz
Community Manager
  • 1 kudos

Hi @Christopher Ackerman​, This error message usually occurs when there is an issue with authentication between Azure Databricks and Azure DevOps. One possible reason for this error is that the token was not granted the necessary permissions to acces...

  • 1 kudos
phaezel
by New Contributor
  • 592 Views
  • 1 replies
  • 0 kudos
Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @peter dhaeseleer​, As of my current knowledge in February 2023, I am unaware of any official announcement from Databricks regarding the availability of DLT Unity Catalog integration in preview.

  • 0 kudos
Mohit_m
by Valued Contributor II
  • 13908 Views
  • 2 replies
  • 4 kudos

Resolved! How to get the Job ID and Run ID and save into a database

We have a Databricks job running with a main class and a JAR file. Our JAR code base is in Scala. When the job starts running, we need to log the Job ID and Run ID into the database for future use. How can we achieve this?
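
One common way to do this is to pass the IDs into the job as task parameters and read them from the argument list. The sketch below is in Python for brevity and is not taken from the thread; the same pattern applies to `args` in a Scala `main`. It assumes the task parameters are configured as `["--job-id", "{{job_id}}", "--run-id", "{{run_id}}"]`, which Databricks substitutes at run time. The flag names and the helper `parse_run_context` are hypothetical.

```python
import argparse


def parse_run_context(argv: list) -> dict:
    """Extract the job and run IDs that were passed in as task parameters."""
    parser = argparse.ArgumentParser(description="Read Databricks run context")
    parser.add_argument("--job-id", required=True)
    parser.add_argument("--run-id", required=True)
    # parse_known_args ignores any other parameters the job may receive.
    args, _unknown = parser.parse_known_args(argv)
    return {"job_id": args.job_id, "run_id": args.run_id}


# Simulated arguments, as they would look after Databricks has replaced
# the {{job_id}} and {{run_id}} placeholders with real values.
context = parse_run_context(["--job-id", "123", "--run-id", "456"])
print(context)

# The resulting dict can then be written to the logging table,
# for example through a JDBC insert.
```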

Latest Reply
User16783853961
New Contributor II
  • 4 kudos

Here is a blog with code and examples on how to achieve this https://medium.com/@canadiandataguy/how-to-get-the-job-id-and-run-id-for-a-databricks-job-b0da484e66f5

  • 4 kudos
1 More Replies
zak
by New Contributor II
  • 2496 Views
  • 4 replies
  • 4 kudos

Resolved! Add custom metadata to an Avro file with PySpark

Hello, I need to add custom metadata to an Avro file. The Avro file contains data. We have tried to use "option" within the write function, but it is ignored without generating any error. df.write.format("avro").option("avro.codec", "snappy").option...

Latest Reply
Kaniz
Community Manager
  • 4 kudos

Hi @zakaria belamri​, You can add custom metadata to an Avro file in PySpark by creating an Avro schema with the custom metadata fields and passing it to the DataFrameWriter as an option. Here's an example code snippet that demonstrates how to do thi...

  • 4 kudos
3 More Replies
JordanYaker
by Contributor
  • 963 Views
  • 3 replies
  • 3 kudos

Resolved! What is the maximum number of workspaces per account using Databricks on AWS?

I've been looking through the documentation and I swear this used to be listed somewhere, but for the life of me I can't find it anymore.

Latest Reply
JordanYaker
Contributor
  • 3 kudos

Thanks @Kaniz Fatma​ 

  • 3 kudos
2 More Replies
kkawka1
by New Contributor III
  • 3314 Views
  • 8 replies
  • 10 kudos

Resolved! Removing files saved in the root FileStore

We have just started working with Databricks in one of my university modules, and the lecturers gave us a set of commands to practice saving data in the FileStore. One of the commands was the following: dbutils.fs.cp("/databricks-datasets/weathh...

Latest Reply
Kaniz
Community Manager
  • 10 kudos

Hi @Konrad Kawka​ , Are you using the Community Edition?

  • 10 kudos
7 More Replies
RamaTeja
by New Contributor II
  • 1068 Views
  • 3 replies
  • 2 kudos

Unity Catalog metastore list is showing empty

Hi, I am not able to list the metastores in the Databricks CLI using the command below: databricks unity-catalog metastores list, which returns {}. But when I tried databricks unity-catalog metastores get-summary, I am able to get the metastore info. Can anyone help me ...

Latest Reply
RamaTeja
New Contributor II
  • 2 kudos

Hi @Kaniz Fatma, Unity Catalog is enabled in my workspace, and I have also been assigned metastore admin and account admin. databricks unity-catalog metastores list --debug HTTP debugging enabled send: b'GET /api/2.1/unity-catalog/metastores HTTP/1.1\r...

  • 2 kudos
2 More Replies