Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

bluetail
by Contributor
  • 3412 Views
  • 4 replies
  • 2 kudos

Resolved! Value Labels fail to display in Databricks notebook but they are displayed ok in Jupyter

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np

prob = np.random.rand(7) + 0.1
prob /= prob.sum()
df = pd.DataFrame({'department': np.random.choice(['helium', 'neon', 'argon', 'krypton', 'xenon', 'radon', 'ogane...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

@Maria Bruevich​ - Do either of these answers help? If yes, would you be happy to mark one as best so that other members can find the solution more quickly?

3 More Replies
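
For readers hitting the same symptom, here is a minimal sketch of adding value labels to a seaborn bar plot in a Databricks notebook. It assumes matplotlib 3.4+ (for Axes.bar_label) and that the figure is rendered explicitly; the department names and the share column are made up to mirror the question's example.

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Toy data shaped like the question's example
prob = np.random.rand(7) + 0.1
prob /= prob.sum()
df = pd.DataFrame({
    "department": ["helium", "neon", "argon", "krypton", "xenon", "radon", "oganesson"],
    "share": prob,
})

fig, ax = plt.subplots(figsize=(8, 4))
sns.barplot(data=df, x="department", y="share", ax=ax)

# Annotate every bar with its value (requires matplotlib >= 3.4)
for container in ax.containers:
    ax.bar_label(container, fmt="%.2f")

# Render the figure explicitly in the Databricks notebook
plt.show()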
guruv
by New Contributor III
  • 6550 Views
  • 4 replies
  • 1 kudos

Resolved! Spark UI not showing any running tasks

Hi, I am running a notebook job calling JAR code (application code implemented in C#). In the Spark UI page, for almost 2 hrs it's not showing any tasks, and even the CPU usage is below 20%; memory usage is very small. Before this 2 hr window it shows...

Latest Reply
Atanu
Databricks Employee
  • 1 kudos

If I understood the issue correctly.

3 More Replies
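
One general point that may explain an empty Spark UI here: work that runs only on the driver (for example a JAR that shells out to a C# process) never triggers Spark jobs, stages, or tasks, so the UI shows nothing even though the cluster is busy. A minimal sketch of the difference, with an illustrative job description label:

import time
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
sc = spark.sparkContext

# Driver-only work: no Spark action runs, so the Spark UI shows no tasks
# while this executes.
time.sleep(5)

# Distributed work: an action creates a job with stages and tasks that
# are visible in the Spark UI.
sc.setJobDescription("illustrative distributed count")  # label is hypothetical
spark.range(10_000_000).selectExpr("sum(id)").collect()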
thomasthomas
by New Contributor II
  • 2829 Views
  • 4 replies
  • 0 kudos

Resolved! Customer deployment

Hi, I have a bunch of scripts in Databricks that perform a decent amount of data wrangling. All of these scripts contain sensitive information and I have no intention of making them public. I would like to provide a service to my customers - so they ca...

Latest Reply
Atanu
Databricks Employee
  • 0 kudos

@Tamas D I understand your concern. Cluster creation in a different subscription is by design at this moment, I think. But I would like to request that you add your use case to https://feedback.azure.com/d365community/forum/2efba7dc-ef24-ec11-b6...

3 More Replies
Mateo
by New Contributor II
  • 1581 Views
  • 2 replies
  • 0 kudos

Hi all, I'm having some trouble with my Certification Transcript in the Academy Portal. I've passed "Databricks Certified Associate Devel...

Hi all, I'm having some trouble with my Certification Transcript in the Academy Portal. I passed "Databricks Certified Associate Developer for Apache Spark 3.0" last year and everything seemed fine (apart from the fact that I've been issued two sep...

Latest Reply
Mateo
New Contributor II
  • 0 kudos

Hey @Piper Wilson! Thank you for your response. Unfortunately, I already created a support ticket through the address provided in the post you mentioned, and I got a 'case closed' e-mail after over two weeks with no response and no fix (certificat...

1 More Replies
MattM
by New Contributor III
  • 2907 Views
  • 3 replies
  • 2 kudos

Resolved! Pricing Spot Instance vs New Job Cluster

We are running multiple Databricks jobs via ADF. I was wondering which of the options below is the cheaper route for Databricks notebook processing from ADF. When I create an ADF linked service, which should I use to lower my cost? New Job Cluster opti...

Latest Reply
-werners-
Esteemed Contributor III
  • 2 kudos

The instance pool will be cheaper if you use spot instances, but only if you size your instance pool correctly (number of workers and scale-down time). AFAIK you cannot use spot instances for job clusters in ADF.

2 More Replies
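
For reference, a minimal sketch (not a definitive recipe) of creating an instance pool backed by Azure spot instances through the Instance Pools REST API; an ADF linked service can then reference the pool by its ID. The host, token, node type, and sizing values are placeholders, and the field names follow the Instance Pools 2.0 API:

import requests

DATABRICKS_HOST = "https://adb-1234567890123456.7.azuredatabricks.net"  # placeholder
TOKEN = "<personal-access-token>"  # placeholder

payload = {
    "instance_pool_name": "adf-spot-pool",       # hypothetical name
    "node_type_id": "Standard_DS3_v2",           # placeholder node type
    "min_idle_instances": 0,
    "max_capacity": 10,
    # Scale-down behaviour matters for cost, as the reply points out
    "idle_instance_autotermination_minutes": 15,
    "azure_attributes": {
        "availability": "SPOT_AZURE",
        "spot_bid_max_price": -1,                # -1 = pay up to the on-demand price
    },
}

resp = requests.post(
    f"{DATABRICKS_HOST}/api/2.0/instance-pools/create",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=payload,
)
resp.raise_for_status()
print(resp.json()["instance_pool_id"])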
swzzzsw
by New Contributor III
  • 6835 Views
  • 5 replies
  • 2 kudos

Resolved! Pass variable values from one task to another

I created a Databricks job with multiple tasks. Is there a way to pass variable values from one task to another? For example, if I have tasks A and B as Databricks notebooks, can I create a variable (e.g. x) in notebook A and later use that value in ...

Latest Reply
-werners-
Esteemed Contributor III
  • 2 kudos

You could also consider using an orchestration tool like Data Factory (Azure) or Glue (AWS); there you can inject parameters into notebooks and use them. The job scheduling of Databricks also has the possibility to add parameters, but I do not know if yo...

4 More Replies
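
On newer runtimes there is also a built-in option for multi-task jobs: task values. A minimal sketch, assuming tasks named A and B as in the question (the key x is just the example from the post; dbutils is the Databricks notebook utility object):

# In the notebook for task A: publish a value for downstream tasks.
dbutils.jobs.taskValues.set(key="x", value=42)

# In the notebook for task B: read the value published by task A.
# debugValue is returned when the notebook runs outside a job.
x = dbutils.jobs.taskValues.get(taskKey="A", key="x", default=0, debugValue=0)
print(x)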
MiguelKulisic
by New Contributor II
  • 8777 Views
  • 2 replies
  • 4 kudos

Resolved! ProtocolChangedException on concurrent blind appends to delta table

Hello, I am developing an application that runs multiple processes that write their results to a common delta table as blind appends. According to the docs I've read online: https://docs.databricks.com/delta/concurrency-control.html#protocolchangedex...

Latest Reply
-werners-
Esteemed Contributor III
  • 4 kudos

I think you are right: mergeSchema will change the schema of the table, but if you both write to that same table with another schema, which one will it be? Can you check whether both of you actually write the same schema, or remove the mergeSchema option?

1 More Replies
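
For context, a blind append that leaves the table's metadata and protocol untouched looks like the sketch below; ProtocolChangedException is usually tied to commits that change the protocol or metadata (for example enabling mergeSchema, or the very first writes racing to create the table), not to the appends themselves. The schema and path are placeholders:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "value"])  # placeholder data

# A blind append: no reads of the target, no overwrite, no schema option,
# so concurrent INSERT-only writers should not conflict with each other.
(df.write
   .format("delta")
   .mode("append")
   # .option("mergeSchema", "true")  # avoid unless the schema really changes
   .save("/mnt/data/common_results"))  # placeholder path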
study_community
by New Contributor III
  • 3535 Views
  • 2 replies
  • 3 kudos

Resolved! Error creating delta table over an existing delta schema

I created a Delta table through a cluster over a DBFS location. Schema:
create external table tmp_db.delta_data (
  delta_id int,
  delta_name varchar(20),
  delta_variation decimal(10,4),
  delta_incoming_timestamp timestamp,
  delta_date date generated always ...

Latest Reply
-werners-
Esteemed Contributor III
  • 3 kudos

VarcharType is only available from Spark 3.1, I think (https://spark.apache.org/docs/latest/sql-ref-datatypes.html). The link is for Spark 3.2, and 3.1 also has VarcharType. So can you check your Spark version? Also, if the table definition still exists...

1 More Replies
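
If the VARCHAR type turns out to be the blocker on an older Spark version, here is a sketch of the same DDL using STRING. The column names are copied from the question; the generation expression and location are assumptions, since the original statement is truncated:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

spark.sql("""
    CREATE TABLE tmp_db.delta_data (
        delta_id INT,
        delta_name STRING,                  -- VARCHAR(20) needs Spark 3.1+
        delta_variation DECIMAL(10,4),
        delta_incoming_timestamp TIMESTAMP,
        delta_date DATE GENERATED ALWAYS AS (CAST(delta_incoming_timestamp AS DATE))
    )
    USING DELTA
    LOCATION '/mnt/tmp/delta_data'          -- placeholder external location
""")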
Soma
by Valued Contributor
  • 3506 Views
  • 4 replies
  • 0 kudos

Queries regarding workspace Migration to Premium

We are planning to migrate from a standard to a premium workspace. We need to know whether the artifacts below will be maintained, and need to check on streaming job downtime: access tokens, DBFS access, production clusters/jobs, cluster IDs, job IDs, and other properties like URL ...

Latest Reply
Soma
Valued Contributor
  • 0 kudos

Hi @Kaniz Fatma, then I can assume there won't be any impact on the metastore and all the metadata (table definitions, schemas) will be available post-upgrade.

3 More Replies
aksharamaham
by New Contributor
  • 2017 Views
  • 1 replies
  • 0 kudos

Delta Live Table - How to get details of which records were excluded in Quality Checks?

I've been experimenting with DLT and it works well. I'd like to understand where I can see details of which records didn't meet the quality criteria.

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hello, @Paresh J! Welcome and thank you for asking! My name is Piper, and I'm a moderator for Databricks. Let's give the community some time to help before we circle back to you. Thanks in advance for your patience.

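For anyone with the same question: per-expectation metrics (how many records were dropped) land in the pipeline event log, but DLT does not keep the failing rows themselves, so a common workaround is a quarantine table that inverts the expectation. A minimal sketch, assuming a hypothetical upstream dataset raw_events and a made-up rule id IS NOT NULL:

import dlt
from pyspark.sql import functions as F

RULE = "id IS NOT NULL"  # hypothetical quality rule

@dlt.table(comment="Rows that pass the quality rule")
@dlt.expect_or_drop("valid_id", RULE)
def clean_events():
    return dlt.read("raw_events")  # hypothetical upstream dataset

@dlt.table(comment="Rows that fail the quality rule, kept for inspection")
def quarantined_events():
    return dlt.read("raw_events").where(~F.expr(RULE))
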
Situs_UG300_Off
by New Contributor
  • 589 Views
  • 0 replies
  • 0 kudos

res.cloudinary.com

The UG300 link provides an e-wallet deposit type that can be used to make purchases or top up a balance to a destination e-wallet already available on the site. There is good news for those of you who do not have a bank account: if...

Ravi1979
by New Contributor
  • 2625 Views
  • 1 replies
  • 0 kudos
Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hello, @Ravi Param​ - My name is Piper, and I'm one of the moderators here. Thank you for your question! Let's give the community a chance to respond and then we'll circle back if necessary.

frank26364
by New Contributor III
  • 14029 Views
  • 4 replies
  • 0 kudos

Resolved! Command prompt won't let me type the Databricks token

Hi, I am trying to set up the Databricks CLI using the command prompt on my computer. I downloaded the Python 3.9 app and successfully ran the command pip install databricks-cli. When I try to set up the Databricks token, I am able to type my Databricks Ho...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hey there! You're on a roll today! Thanks for letting us know.

3 More Replies
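
If the prompt keeps misbehaving, note that the token input is typically hidden while you type or paste, so it can look like nothing is happening; pasting and pressing Enter usually completes it. An alternative is to skip the interactive databricks configure --token step and write the profile file directly. A minimal sketch (the host and token are placeholders):

import configparser
from pathlib import Path

cfg_path = Path.home() / ".databrickscfg"

config = configparser.ConfigParser()
if cfg_path.exists():
    config.read(cfg_path)

# The legacy Databricks CLI reads its host and token from this file.
config["DEFAULT"] = {
    "host": "https://adb-1234567890123456.7.azuredatabricks.net",  # placeholder
    "token": "<personal-access-token>",                            # placeholder
}

with cfg_path.open("w") as f:
    config.write(f)

print(f"Wrote {cfg_path}")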
frank26364
by New Contributor III
  • 36981 Views
  • 5 replies
  • 4 kudos

Resolved! Export Databricks results to Blob in a csv file

Hello everyone, I want to export my data from Databricks to the blob. My Databricks commands select some PDFs from my blob, run Form Recognizer, and export the output results to my blob. Here is the code: %pip install azure.storage.blob %pip install...

Latest Reply
Anonymous
Not applicable
  • 4 kudos

@Francis Bouliane​ - Thank you for sharing the solution.

4 More Replies
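
For anyone landing here, a minimal sketch of uploading a DataFrame as a CSV blob with the azure-storage-blob package; the connection string, container, blob name, and sample data are placeholders:

import pandas as pd
from azure.storage.blob import BlobServiceClient

CONN_STR = "<storage-account-connection-string>"   # placeholder
CONTAINER = "results"                              # placeholder
BLOB_NAME = "form_recognizer_output.csv"           # placeholder

df = pd.DataFrame({"file": ["a.pdf", "b.pdf"], "status": ["ok", "ok"]})  # placeholder results

csv_bytes = df.to_csv(index=False).encode("utf-8")

blob_service = BlobServiceClient.from_connection_string(CONN_STR)
blob_client = blob_service.get_blob_client(container=CONTAINER, blob=BLOB_NAME)
blob_client.upload_blob(csv_bytes, overwrite=True)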
William_Scardua
by Valued Contributor
  • 5675 Views
  • 2 replies
  • 0 kudos

Resolved! How to intercept a Spark Listener with PySpark?

Hi guys, is it possible to intercept a Spark Listener with PySpark to collect indicators like shuffle, skew ratio, etc.?

Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

Interesting question. I know that you can use the SparkListener to collect info, f.e. here. Mind that the class is written in Scala, so my first thought was that it is not possible in Python/PySpark. But SO says it is possible, but with a lot of overhea...

1 More Replies
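
As a lighter-weight alternative to a full SparkListener, PySpark exposes a StatusTracker that can be polled from the driver for job- and stage-level indicators. It will not give everything a Scala listener can (for example per-task shuffle metrics), but it needs no py4j callback server. A minimal sketch:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
tracker = spark.sparkContext.statusTracker()

# Poll active stages while another query is running; in a real setup this
# loop would live in a background thread alongside the workload.
for stage_id in tracker.getActiveStageIds():
    info = tracker.getStageInfo(stage_id)
    if info is not None:
        print(stage_id, info.name, info.numActiveTasks, info.numCompletedTasks)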
