cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Forum Posts

Chris_Shehu
by Valued Contributor III
  • 6244 Views
  • 1 replies
  • 1 kudos

Are there Visio stencils for Databricks?

I need to create some Visio's to show different processes in Databricks. Microsoft offers a Azure pack but it doesn't include Databricks.

  • 6244 Views
  • 1 replies
  • 1 kudos
Latest Reply
Anonymous
Not applicable
  • 1 kudos

@Christopher Shehu​ - I am so sorry. I don't know how your question slid under my radar. Hang in there. I'll try to find someone to answer your question as quickly as possible. Thank you for your patience!

  • 1 kudos
aksharamaham
by New Contributor
  • 1754 Views
  • 1 replies
  • 0 kudos

Delta Live Table - How to get details of which records were excluded in Quality Checks?

I've been experimenting with DLT and it works well. I'd like to understand where can I see details of which records didn't meet the quality critera?

  • 1754 Views
  • 1 replies
  • 0 kudos
Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hello, @Paresh J​! Welcome and thank you for asking! My name is Piper, and I'm a moderator for Databricks.Let's give the community some time to help before we circle back to you. Thanks in advance for your patience.

  • 0 kudos
Situs_UG300_Off
by New Contributor
  • 434 Views
  • 0 replies
  • 0 kudos

res.cloudinary.com

Link UG300 ada menyediakan depo tipe e- wallet yang dapat dipakai unyuk dapat melaksanakan pembelian ataupun top up saldo ke e- wallet tujuan yang telah ada di dalam web. Adanya berita gembira buat kalian yang tidak mempunyai rekening bank, Jika kali...

  • 434 Views
  • 0 replies
  • 0 kudos
Ravi1979
by New Contributor
  • 2308 Views
  • 1 replies
  • 0 kudos
  • 2308 Views
  • 1 replies
  • 0 kudos
Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hello, @Ravi Param​ - My name is Piper, and I'm one of the moderators here. Thank you for your question! Let's give the community a chance to respond and then we'll circle back if necessary.

  • 0 kudos
frank26364
by New Contributor III
  • 12855 Views
  • 4 replies
  • 0 kudos

Resolved! Command prompt won't let me type the Databricks token

Hi, I am trying to set up Databricks CLI using the command prompt on my computer. I downloaded the Python 3.9 app and successfully ran the command pip install databricks-cliWhen I try to set up the Databricks token, I am able to type my Databricks Ho...

  • 12855 Views
  • 4 replies
  • 0 kudos
Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hey there! You're on a roll today! Thanks for letting us know.

  • 0 kudos
3 More Replies
frank26364
by New Contributor III
  • 35835 Views
  • 5 replies
  • 4 kudos

Resolved! Export Databricks results to Blob in a csv file

Hello everyone,I want to export my data from Databricks to the blob. My Databricks commands select some pdf from my blob, run Form Recognizer and export the output results in my blob. Here is the code: %pip install azure.storage.blob %pip install...

  • 35835 Views
  • 5 replies
  • 4 kudos
Latest Reply
Anonymous
Not applicable
  • 4 kudos

@Francis Bouliane​ - Thank you for sharing the solution.

  • 4 kudos
4 More Replies
William_Scardua
by Valued Contributor
  • 4959 Views
  • 2 replies
  • 0 kudos

Resolved! how to Intercept Spark Listener with Pyspark ?

hi guys,​It`s possible to intercept Spark Listener with Pyspark to collect indicator like shuffle, skew ratio, etc ?

  • 4959 Views
  • 2 replies
  • 0 kudos
Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

interesting question.I know that you can use the SparkListener to collect info, f.e. here.Mind that the class is written in Scala, so my first thought was that it is not possible in python/pyspark.But SO says it is possible, but with a lot of overhea...

  • 0 kudos
1 More Replies
BorislavBlagoev
by Valued Contributor III
  • 2732 Views
  • 2 replies
  • 4 kudos

Resolved! Converting dataframe to delta.

Is it possible to convert the dataframe to a delta table without saving the dataframe on the storage?

  • 2732 Views
  • 2 replies
  • 4 kudos
Latest Reply
-werners-
Esteemed Contributor III
  • 4 kudos

no, it will only be a delta table when writing it.

  • 4 kudos
1 More Replies
bluetail
by Contributor
  • 14910 Views
  • 6 replies
  • 5 kudos

Resolved! ModuleNotFoundError: No module named 'mlflow' when running a notebook

I am running a notebook on the Coursera platform.my configuration file, Classroom-Setup, looks like this:%python   spark.conf.set("com.databricks.training.module-name", "deep-learning") spark.conf.set("com.databricks.training.expected-dbr", "6.4")   ...

  • 14910 Views
  • 6 replies
  • 5 kudos
Latest Reply
User16753724663
Valued Contributor
  • 5 kudos

Hi @Maria Bruevich​ ,From the error description, it looks like the mlflow library is not present. You can use ML cluster as these type of cluster already have mlflow library. Please check the below document:https://docs.databricks.com/release-notes/r...

  • 5 kudos
5 More Replies
DanVartanian
by New Contributor II
  • 6159 Views
  • 3 replies
  • 0 kudos

Resolved! Help trying to calculate a percentage

The image below shows what my source data is (HAVE) and what I'm trying to get to (WANT).I want to be able to calculate the percentage of bad messages (where formattedMessage = false) by source and date.I'm not sure how to achieve this in DatabricksS...

havewant
  • 6159 Views
  • 3 replies
  • 0 kudos
Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

you could use a windows function over source and date with a sum of messagecount. This gives you the total per source/date repeated on every line.Then apply a filter on formattedmessage == false and divide messagecount by the sum above.

  • 0 kudos
2 More Replies
SettlerOfCatan
by New Contributor
  • 4805 Views
  • 0 replies
  • 0 kudos

Access data within the blob storage without downloading

Our customer is using Azure’s blob storage service to save big files so that we can work with them using an Azure online service, like Databricks.We want to read and work with these files with a computing resource obtained by Azure directly without d...

blob-storage Azure-ML fileytypes blob
  • 4805 Views
  • 0 replies
  • 0 kudos
Azure_Data_Eng1
by New Contributor
  • 566 Views
  • 0 replies
  • 0 kudos

data=[['x', 20220118, 'FALSE', 3],['x', 20220118, 'TRUE', 97],['x', 20220119, 'FALSE', 1],['x'...

data=[['x', 20220118, 'FALSE', 3],['x', 20220118, 'TRUE', 97],['x', 20220119, 'FALSE', 1],['x', 20220119, 'TRUE', 49],['Y', 20220118, 'FALSE', 100],['Y', 20220118, 'TRUE', 900],['Y', 20220119, 'FALSE', 200],['Y', 20220119, 'TRUE', 800]]df=spark.creat...

  • 566 Views
  • 0 replies
  • 0 kudos
Soma
by Valued Contributor
  • 2233 Views
  • 3 replies
  • 2 kudos

Resolved! Query RestAPI end point in Databricks Standard Workspace

Do we have option to query delta table using Standard Workspace as a endpoint instead of JDBC

  • 2233 Views
  • 3 replies
  • 2 kudos
Latest Reply
Anonymous
Not applicable
  • 2 kudos

@somanath Sankaran​ - Would you be happy to mark @Hubert Dudek​'s answer as best if it resolved the problem? That helps other members who are searching for answers find the solution more quickly.

  • 2 kudos
2 More Replies
MattM
by New Contributor III
  • 3622 Views
  • 4 replies
  • 4 kudos

Resolved! Schema Parsing issue when datatype of source field is mapped incorrect

I have complex json file which has massive struct column. We regularly have issues when we try to parse this json file by forming our case class to extract the fields from schema. With this approach the issue we are facing is that if one data type of...

  • 3622 Views
  • 4 replies
  • 4 kudos
Latest Reply
Anonymous
Not applicable
  • 4 kudos

Hey there, @Matt M​ - If @Hubert Dudek​'s response solved the issue, would you be happy to mark his answer as best? It helps other members find the solution more quickly.

  • 4 kudos
3 More Replies
BorislavBlagoev
by Valued Contributor III
  • 4854 Views
  • 9 replies
  • 3 kudos

Resolved! Tring to create incremental pipeline but fails when I try to use outputMode "update"

def upsertToDelta(microBatchOutputDF, batchId): microBatchOutputDF.createOrReplaceTempView("updates")   microBatchOutputDF._jdf.sparkSession().sql(""" MERGE INTO old o USING updates u ON u.id = o.id WHEN MATCHED THEN UPDATE SE...

  • 4854 Views
  • 9 replies
  • 3 kudos
Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 3 kudos

Delta table/file version is too old. Please try to upgrade it as described here https://docs.microsoft.com/en-us/azure/databricks/delta/versioning​

  • 3 kudos
8 More Replies

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group
Labels