Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

bluetail
by Contributor
  • 1987 Views
  • 4 replies
  • 2 kudos

Resolved! Value Labels fail to display in Databricks notebook but they are displayed ok in Jupyter

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np

prob = np.random.rand(7) + 0.1
prob /= prob.sum()
df = pd.DataFrame({'department': np.random.choice(['helium', 'neon', 'argon', 'krypton', 'xenon', 'radon', 'ogane...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

@Maria Bruevich​ - Do either of these answers help? If yes, would you be happy to mark one as best so that other members can find the solution more quickly?

3 More Replies
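A minimal sketch of the value-label approach, not the poster's actual code: it assumes matplotlib >= 3.4 and attaches the labels explicitly with `ax.bar_label`, which tends to be more robust across notebook frontends than relying on seaborn's defaults. The data is an illustrative stand-in.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; safe in notebooks and CI
import matplotlib.pyplot as plt

# Illustrative stand-in for the poster's department/probability data
departments = ["helium", "neon", "argon"]
probs = [0.5, 0.3, 0.2]

fig, ax = plt.subplots()
bars = ax.bar(departments, probs)
# Attach value labels explicitly (matplotlib >= 3.4)
labels = ax.bar_label(bars, fmt="%.2f")
ax.set_ylabel("probability")
# In a Databricks notebook you would then call display(fig) or plt.show()
```

If the labels still do not render, comparing the matplotlib version between the Databricks runtime and the local Jupyter environment is a reasonable first check.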
kolangareth
by New Contributor III
  • 3710 Views
  • 9 replies
  • 3 kudos

Resolved! to_date not functioning as expected after introduction of arbitrary replaceWhere in Databricks 9.1 LTS

I am trying to do a dynamic partition overwrite on a delta table using the replaceWhere option. This was working fine until I upgraded the DB runtime to 9.1 LTS from 8.3.x. I am concatenating 'year', 'month' and 'day' columns and then using the to_date functio...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

@Prasanth Kolangareth​ - Does Hubert's answer resolve the problem for you? If yes, would you be happy to mark it as best so that other members can find the solution more quickly?

8 More Replies
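The usual culprit in threads like this is Spark 3's stricter datetime parsing: concatenating unpadded year/month/day columns yields strings like '2021-1-5' that to_date no longer parses without an explicit format. On the Spark side the typical fix is either an explicit pattern (e.g. to_date(col, 'yyyy-M-d')) or zero-padding the parts with lpad first. A plain-Python sketch of the padding approach, with illustrative values:

```python
from datetime import date

def partition_date(year: int, month: int, day: int) -> str:
    """Build a canonical yyyy-MM-dd string, the analogue of
    lpad-ing month/day before calling to_date in Spark."""
    return f"{year:04d}-{month:02d}-{day:02d}"

s = partition_date(2021, 1, 5)
d = date.fromisoformat(s)  # parses cleanly because the parts are zero-padded
```

The same canonical string can then be used safely in a replaceWhere predicate, since both sides of the comparison have one unambiguous format.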
Soma
by Valued Contributor
  • 2491 Views
  • 6 replies
  • 3 kudos

Resolved! Dynamically supplying partitions to autoloader

We have a streaming use case and we see a lot of time spent in listing from Azure. Is it possible to supply partitions to Autoloader dynamically, on the fly?

Latest Reply
Anonymous
Not applicable
  • 3 kudos

@somanath Sankaran​ - Thank you for posting your solution. Would you be happy to mark your answer as best so that other members may find it more quickly?

5 More Replies
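Auto Loader discovers files by listing (unless file notification mode is used), so one common workaround is to narrow the input path to the partitions you actually need before starting the stream, rather than pointing it at the container root. The path construction below is plain Python; the storage path, partition layout, and the commented cloudFiles call are illustrative assumptions, since the stream itself only runs inside Databricks.

```python
from datetime import date, timedelta

def partition_paths(base: str, days_back: int) -> list[str]:
    """Build partition-scoped input paths (e.g. .../date=2024-06-01/)
    so the stream lists only recent folders, not the whole container."""
    today = date.today()
    return [
        f"{base}/date={today - timedelta(days=i):%Y-%m-%d}/"
        for i in range(days_back)
    ]

paths = partition_paths("abfss://landing@account.dfs.core.windows.net/events", 3)
# In Databricks you would then start a stream over a narrowed path, e.g.:
# spark.readStream.format("cloudFiles")
#      .option("cloudFiles.format", "json")
#      .load(paths[0])
```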
thomasthomas
by New Contributor II
  • 1664 Views
  • 4 replies
  • 0 kudos

Resolved! Customer deployment

Hi, I have a bunch of scripts in Databricks that perform a decent amount of data-wrangling. All of these scripts contain sensitive information and I have no intention of making them public. I would like to provide a service to my customers - so they ca...

Latest Reply
Atanu
Esteemed Contributor
  • 0 kudos

@Tamas D​ I understood your concern. Cluster creation in a different subscription is by design at this moment, I think. But I would like to request you to add your use case to https://feedback.azure.com/d365community/forum/2efba7dc-ef24-ec11-b6...

3 More Replies
Mateo
by New Contributor II
  • 810 Views
  • 2 replies
  • 0 kudos

Hi all, I'm having some trouble with my Certification Transcript in the Academy Portal. I've passed "Databricks Certified Associate Devel...

Hi all, I'm having some trouble with my Certification Transcript in the Academy Portal. I passed "Databricks Certified Associate Developer for Apache Spark 3.0" last year and everything seemed fine (apart from the fact that I've been issued two sep...

Latest Reply
Mateo
New Contributor II
  • 0 kudos

Hey @Piper Wilson​ ! Thank you for your response. Unfortunately, I already created a support ticket through the address provided in this post you mentioned. And I got a 'case closed' e-mail after over two weeks with no response and no fix (certificat...

1 More Replies
MattM
by New Contributor III
  • 1540 Views
  • 3 replies
  • 2 kudos

Resolved! Pricing Spot Instance vs New Job Cluster

We are running multiple Databricks jobs via ADF. I was wondering which of the options below is the cheaper route for Databricks notebook processing from ADF. When I create an ADF linked service, which should I use to lower my cost? New Job Cluster opti...

Latest Reply
-werners-
Esteemed Contributor III
  • 2 kudos

The instance pool will be cheaper if you use spot instances, but only if you size your instance pool correctly (number of workers and scale-down time). AFAIK you cannot use spot instances for job clusters in ADF.

2 More Replies
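The comparison above comes down to simple arithmetic: a spot discount applies to the VM cost but not to the DBU charge. Every number in this sketch is a made-up placeholder; only the shape of the calculation is the point.

```python
def hourly_cost(workers: int, vm_rate: float, dbu_per_node: float,
                dbu_rate: float, spot_discount: float = 0.0) -> float:
    """Illustrative cluster cost model: VM cost (optionally spot-discounted)
    plus DBU cost, which spot pricing does not reduce."""
    vm = workers * vm_rate * (1 - spot_discount)
    dbu = workers * dbu_per_node * dbu_rate
    return vm + dbu

# Placeholder rates, not real pricing
on_demand = hourly_cost(4, vm_rate=0.50, dbu_per_node=1.0, dbu_rate=0.30)
spot = hourly_cost(4, vm_rate=0.50, dbu_per_node=1.0, dbu_rate=0.30,
                   spot_discount=0.6)
```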
swzzzsw
by New Contributor III
  • 3214 Views
  • 4 replies
  • 0 kudos

Resolved! SQLServerException: deadlock

I'm using databricks to connect to a SQL managed instance via JDBC. SQL operations I need to perform include DELETE, UPDATE, and simple read and write. Since spark syntax only handles simple read and write, I had to open SQL connection using Scala an...

Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

This is not a Spark error but purely a database one. There are tons of articles online on how to prevent deadlocks, but there is no single solution for this.

3 More Replies
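SQL Server deadlocks (error 1205) are transient by design: the engine picks a victim and rolls it back, and the standard client-side response is to retry the transaction after a short backoff. A generic retry wrapper, sketched in plain Python; the JDBC/Scala call it would wrap and the error-matching predicate are assumptions to adapt to your driver's actual error text.

```python
import time

def retry_on_deadlock(fn, retries: int = 3, backoff_s: float = 0.1,
                      is_deadlock=lambda e: "deadlock" in str(e).lower()):
    """Run fn(); if it raises a deadlock-like error, wait and retry.
    Re-raises after the final attempt or on non-deadlock errors."""
    for attempt in range(retries):
        try:
            return fn()
        except Exception as e:
            if attempt == retries - 1 or not is_deadlock(e):
                raise
            time.sleep(backoff_s * (2 ** attempt))  # exponential backoff

# Demo: fails twice with a deadlock-like error, then succeeds
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("Transaction was deadlocked (victim)")
    return "committed"

result = retry_on_deadlock(flaky)
```

Keeping transactions short and touching tables in a consistent order also reduces how often the retry path is hit in the first place.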
swzzzsw
by New Contributor III
  • 3710 Views
  • 5 replies
  • 2 kudos

Resolved! Pass variable values from one task to another

I created a Databricks job with multiple tasks. Is there a way to pass variable values from one task to another? For example, if I have tasks A and B as Databricks notebooks, can I create a variable (e.g. x) in notebook A and later use that value in ...

Latest Reply
-werners-
Esteemed Contributor III
  • 2 kudos

You could also consider using an orchestration tool like Data Factory (Azure) or Glue (AWS); there you can inject and use parameters from notebooks. The job scheduling of Databricks also has the possibility to add parameters, but I do not know if yo...

4 More Replies
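On recent runtimes the supported mechanism for this is `dbutils.jobs.taskValues.set`/`get`; an older pattern is returning a JSON payload via `dbutils.notebook.exit` and parsing it downstream. The dbutils calls only exist inside Databricks, so they appear as comments here; the runnable part of this sketch is just the JSON round trip.

```python
import json

# Task A (in Databricks): publish a value for downstream tasks, e.g.
#   dbutils.jobs.taskValues.set(key="x", value=42)
# or, with the older notebook-workflow pattern:
#   dbutils.notebook.exit(json.dumps({"x": 42}))
payload = json.dumps({"x": 42})

# Task B: read it back, e.g.
#   x = dbutils.jobs.taskValues.get(taskKey="A", key="x")
# or parse the string returned by dbutils.notebook.run:
x = json.loads(payload)["x"]
```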
MiguelKulisic
by New Contributor II
  • 6063 Views
  • 2 replies
  • 4 kudos

Resolved! ProtocolChangedException on concurrent blind appends to delta table

Hello, I am developing an application that runs multiple processes that write their results to a common delta table as blind appends. According to the docs I've read online: https://docs.databricks.com/delta/concurrency-control.html#protocolchangedex...

Latest Reply
-werners-
Esteemed Contributor III
  • 4 kudos

I think you are right: mergeSchema will change the schema of the table, but if you both write to that same table with different schemas, which one will it be? Can you check whether both of you actually write the same schema, or remove the mergeSchema option?

1 More Replies
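Per the reply above, the safest path is to make every writer append with an identical schema so mergeSchema is not needed at all, which keeps the appends truly "blind" for Delta's conflict detection. A tiny pre-flight check, sketched in plain Python over (column, type) pairs; the field names are illustrative, not from the thread.

```python
def schemas_match(a: list[tuple[str, str]], b: list[tuple[str, str]]) -> bool:
    """Order-insensitive comparison of (column, type) pairs: a cheap
    guard to run before two jobs blind-append to the same Delta table."""
    return sorted(a) == sorted(b)

writer_a = [("id", "bigint"), ("ts", "timestamp"), ("value", "double")]
writer_b = [("ts", "timestamp"), ("id", "bigint"), ("value", "double")]
```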
Maverick1
by Valued Contributor II
  • 14333 Views
  • 14 replies
  • 7 kudos

Resolved! How to deploy a databricks managed workspace model to sagemaker from databricks notebook

I want to deploy a registered model present in Databricks-managed MLflow to SageMaker via a Databricks notebook. As of now, it is not able to run the mlflow sagemaker build-and-push-container command directly. What all configurations or steps needed to ...

Latest Reply
Kaniz_Fatma
Community Manager
  • 7 kudos

Hi @Saurabh Verma​ , Yes, it's the right process. Thanks.

13 More Replies
study_community
by New Contributor III
  • 1922 Views
  • 2 replies
  • 3 kudos

Resolved! Error creating delta table over an existing delta schema

I created a delta table through a cluster over a DBFS location. Schema:

create external table tmp_db.delta_data (
  delta_id int,
  delta_name varchar(20),
  delta_variation decimal(10,4),
  delta_incoming_timestamp timestamp,
  delta_date date generated always ...

Latest Reply
-werners-
Esteemed Contributor III
  • 3 kudos

VarcharType is only available as of Spark 3.1, I think: https://spark.apache.org/docs/latest/sql-ref-datatypes.html. The link is for Spark 3.2, and 3.1 also has VarcharType. So can you check your Spark version? Also, if the table definition still exists...

1 More Replies
sh_abrishami_ie
by New Contributor II
  • 3942 Views
  • 3 replies
  • 3 kudos

Resolved! Driver is up but is not responsive, likely due to GC.

Hi, I have a problem with writing an Excel file to a mounted location. After 10 minutes I see "Driver is up but is not responsive, likely due to GC" in the log events. I'm using the following script:

df.repartition(1).write
  .format("com.crealytics.spark....

Latest Reply
Kaniz_Fatma
Community Manager
  • 3 kudos

Hi @Shokoufeh Abrishami​ , Can you show the error stack or the logs?

2 More Replies
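A `.repartition(1)` funnels the whole dataset through a single task, and with the Excel writer a lot of it through the driver, which is what makes long GC pauses likely. One mitigation is to move bounded chunks of data at a time. The chunking itself is plain Python below; the pandas/openpyxl write step it would feed is only sketched in a comment, as an assumption to adapt.

```python
def chunked(rows, size):
    """Yield successive fixed-size chunks, so each write step keeps a
    bounded amount of data in driver memory."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch

chunks = list(chunked(range(10), 4))
# Each chunk could then be appended to a workbook, e.g. via pandas
# ExcelWriter with mode="a" (assumption: openpyxl engine available).
```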
Soma
by Valued Contributor
  • 2069 Views
  • 7 replies
  • 0 kudos

Resolved! Queries regarding workspace Migration to Premium

We are planning to migrate from a standard to a premium workspace. We need to know whether the artifacts below will be maintained, and to check on streaming job downtime:
  • Access tokens
  • DBFS access
  • Production clusters/jobs
  • Cluster ID, Job ID, and other properties like URL ...

Latest Reply
Soma
Valued Contributor
  • 0 kudos

Hi @Kaniz Fatma​, then I can assume there won't be any impact on the metastore and all the metadata (table definitions, schemas) will be available post-upgrade.

6 More Replies
Ketna
by New Contributor
  • 1287 Views
  • 2 replies
  • 1 kudos

Resolved! I have included SparkJDBC42.jar in my WAR file, but when I start my application using Tomcat, I get EOFExceptions from log4j classes. What is causing this and how can I resolve it?

Below is part of the exceptions I am getting:

org.apache.catalina.startup.ContextConfig processAnnotationsJar
SEVERE: Unable to process Jar entry [com/simba/spark/jdbc42/internal/apache/logging/log4j/core/pattern/ThreadIdPatternConverter.class] from Ja...

Latest Reply
Kaniz_Fatma
Community Manager
  • 1 kudos

Hi @Ketna Khalasi​ , For Log4j related queries, please go through this post.

1 More Replies
anthony_cros
by New Contributor
  • 2955 Views
  • 2 replies
  • 0 kudos

Resolved! How to publish a notebook in order to share its URL, as a Premium Plan user?

Hi, I'm a Premium Plan user and am trying to share a notebook via URL. The link at https://docs.databricks.com/notebooks/notebooks-manage.html#publish-a-notebook states: "If you’re using Community Edition, you can publish a notebook so that you can sha...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hello @Anthony Cros​ - My name is Piper, and I'm a moderator for Databricks. Welcome and thank you for your question. We will give the members some time to answer your question. If needed, we will circle back around later.

1 More Replies