Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

databicky
by Contributor II
  • 2632 Views
  • 2 replies
  • 0 kudos

How to check a particular column value in a Spark DataFrame?

I want to check whether a particular column in the DataFrame contains a zero; if it does not contain a zero, the check should fail.

Latest Reply
MateuszLomanski
New Contributor II
  • 0 kudos

Use the agg method to check whether the count of rows where columnName equals 0 matches the total number of rows in the DataFrame, using the following code: df.agg(count("*").alias("total_count"), count(when(col("columnName")===0,1)).alias("zero_cou...

1 More Replies
Jennifer
by New Contributor III
  • 2566 Views
  • 1 reply
  • 0 kudos

How do I update an aggregate table using a Delta live table

I am using Delta Live Tables to stream events. I have a raw table for all the events and a downstream aggregate table, and I need to add the new aggregated number to the downstream table's aggregate column. But I didn't find any recipe talking abou...

Latest Reply
Jennifer
New Contributor III
  • 0 kudos

Maybe my code is already correct, since I use dlt.read("my_raw_table") instead of dlt.read_stream("my_raw_table"), so col_aggr is recalculated completely every time my_raw_table is updated.

Valon98
by New Contributor III
  • 13091 Views
  • 8 replies
  • 4 kudos

Resolved! During execution of a cell "RuntimeException: The python kernel is unresponsive."

Hi all, I am running preprocessing to create my train set and test set. Does anyone know why, during execution, my cell gives the error "RuntimeException: The python kernel is unresponsive."? How can I solve it?

Latest Reply
Anonymous
Not applicable
  • 4 kudos

Hey there @Valerio Goretti​ Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from...

7 More Replies
DanielBarbosa
by New Contributor III
  • 5739 Views
  • 3 replies
  • 3 kudos

Running jobs using notebooks in a remote Azure DevOps Services (Repos) Git repository is generating "Notebook not found" error.

By reading the documentation, we checked the possibility of running jobs in the Azure Databricks Workspace workflow using Azure DevOps Services repository source code. The instructions in the documentation were followed and we configured the git info...

Latest Reply
Ulf
New Contributor II
  • 3 kudos

I have the same challenge when integrating with GitHub repos. However, I did not succeed by including '# Databricks notebook source' at the top of the Python files. Do you have any additional suggestions for solving this problem? @Vaibhav Sethi

2 More Replies
Prabha
by New Contributor II
  • 2318 Views
  • 4 replies
  • 0 kudos

DataBricks_dataengineer_result

Hi Team, I successfully passed the Databricks Data Engineer Associate certification exam on 30th October 2022, but I still have not received the certificate. Please find the reference below. I registered for the exam with the email id vprabhakaran1987@...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Prabhakaran velusamy​ Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from y...

3 More Replies
Sai1996
by New Contributor II
  • 1327 Views
  • 2 replies
  • 1 kudos

Screenshot_20221031-222242_Chrome

Hi Databricks Team, I completed my Databricks Certified Data Engineer Associate exam on Oct 30 but have not received the badge or certificate yet. I raised a case also but got no response. Could someone help? User id: rupinisarguru128@gmail.com

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Rupini Ravichandran​ Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from yo...

1 More Replies
Own
by Contributor
  • 998 Views
  • 2 replies
  • 0 kudos
Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Shubham Sharma​ Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you.Tha...

1 More Replies
Renga87
by New Contributor II
  • 2095 Views
  • 4 replies
  • 2 kudos

Clarification on Voucher code

Is a voucher code mandatory to register for the exam "Databricks Associate Developer for Apache Spark 3.0"? If so, how do I get one? Note: I created an account in the Databricks portal recently; my username is ksrenga87@gmail.com

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Rengaraja Sundararaj​ Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from y...

3 More Replies
Sinead
by New Contributor II
  • 2019 Views
  • 2 replies
  • 0 kudos

Resolved! Why do I keep getting locked out of my Community Edition Account?

I use Databricks Community Edition for college work and every time I try to log in, I find that I am locked out of my account, despite using the correct username and password. I get the message "You entered an invalid email or password, or your works...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Sinead Walsh​ Thank you for reaching out, and we’re sorry to hear about this log-in issue! We have this Community Edition login troubleshooting post on Community. Please take a look, and follow the troubleshooting steps. If the steps do not resol...

1 More Replies
Cano
by New Contributor III
  • 3674 Views
  • 4 replies
  • 2 kudos

SQL warehouse failing to start ( Please check network connectivity from the data plane to the control plane )

Hi, My SQL warehouse is failing to start with the following error message:Details for the latest failure: Error: [id: InstanceId(i-01b84b6705ff09104), status: INSTANCE_INITIALIZING, workerEnvId:WorkerEnvId(workerenv-3023557811934763-c8cef827-a038-455...

Latest Reply
Debayan
Databricks Employee
  • 2 kudos

Hi, there is a line in the attached logs as below:
[Bootstrap Event] Can reach ohio.cloud.databricks.com: [FAILED]
[Bootstrap Event] DNS output for databricks-prod-artifacts-us-east-2.s3.us-east-2.amazonaws.com: Server: 10.187.0.2 Address: 10.187.0.2#5...

3 More Replies
dheeraj2444
by New Contributor II
  • 2584 Views
  • 3 replies
  • 0 kudos

I am trying to write a data frame to Kafka topic with Avro schema for key and value using a schema registry URL. The to_avro function is not writing t...

I am trying to write a data frame to a Kafka topic with an Avro schema for the key and value using a schema registry URL. The to_avro function is not writing to the topic and is throwing an exception with error code 40403. Is there an alternate way to do thi...

Latest Reply
Debayan
Databricks Employee
  • 0 kudos

Hi,Could you please refer to https://github.com/confluentinc/kafka-connect-elasticsearch/issues/59 and let us know if this helps.

2 More Replies
ackerman_chris
by New Contributor III
  • 2645 Views
  • 4 replies
  • 0 kudos

Resolved! Databricks Lakehouse Fundamentals Badge Not Found

Hello, I've successfully completed Databricks Lakehouse Fundamentals and am looking for where the badge is. I found this post here, but I haven't received an email about my completion from <service.accredible.email@databricks.com> yet. I successfull...

Latest Reply
ackerman_chris
New Contributor III
  • 0 kudos

Thank you all for the great responses. I eventually received the badge; it took around 30+ minutes, but I finally did get the email notification. I will mark this post as resolved.

3 More Replies
KrishZ
by Contributor
  • 16501 Views
  • 3 replies
  • 3 kudos

[Pyspark.Pandas] PicklingError: Could not serialize object (this error is happening only for large datasets)

Context: I am using pyspark.pandas in a Databricks Jupyter notebook and doing some text manipulation within the dataframe. pyspark.pandas is the Pandas API on Spark and can be used much the same as usual Pandas. Error: PicklingError: Could not seria...

Latest Reply
ryojikn
New Contributor III
  • 3 kudos

@Krishna Zanwar, I'm receiving the same error. For me, the behavior occurs when trying to broadcast a random forest (sklearn 1.2.0) recently loaded from MLflow, and using a Pandas UDF to predict with the model. However, the same code works perfectly on Spark 2...

2 More Replies
anujsen18
by New Contributor
  • 2918 Views
  • 2 replies
  • 0 kudos

How to overwrite partition in DLT pipeline ?

I am trying to replicate my existing Spark pipeline in DLT but am not able to achieve the desired result. Current pipeline: source setup: CSV file ingested into bronze using SCP; frequency: monthly; bronze dir: /cntdlt/bronze/emp/year=2022 /...

Latest Reply
kfoster
Contributor
  • 0 kudos

What I have observed: @dlt.table with a spark.read or dlt.read will create the table in mode=overwrite; @dlt.table with a spark.readStream or dlt.readStream will append new data. To get updates, use CDC: Change data capture with Delta Live Tables ...
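The two behaviors described in this reply can be sketched as a minimal DLT pipeline fragment. The table name (my_raw_table), the columns (year, salary), and the aggregation itself are placeholders, and the dlt module only exists inside a Databricks Delta Live Tables pipeline, so this is illustrative rather than locally runnable:

```python
import dlt  # available only inside a Databricks Delta Live Tables pipeline
from pyspark.sql.functions import sum as sum_


# dlt.read: the table is fully recomputed (overwrite-like) on every update,
# which suits a downstream aggregate table.
@dlt.table
def emp_aggregate():
    return (
        dlt.read("my_raw_table")
        .groupBy("year")
        .agg(sum_("salary").alias("total_salary"))
    )


# dlt.read_stream: new rows are appended incrementally on each update,
# which suits pass-through event tables rather than recomputed aggregates.
@dlt.table
def emp_events_incremental():
    return dlt.read_stream("my_raw_table")
```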

1 More Replies
SIRIGIRI
by Contributor
  • 938 Views
  • 1 reply
  • 1 kudos

sharikrishna26.medium.com

Difference between " and ' in the Spark DataFrame API: you must tell your compiler that you want to represent a string inside a string by using a different symbol for the inner string. Here is an example: " Name = "HARI" ". The above is wrong. Why? Because the in...
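The quoting rule the post describes can be illustrated in plain Python (the same principle applies to string expressions passed to the Spark DataFrame API); the variable names here are made up for the example:

```python
# Using the same quote character inside and outside breaks the string:
#   "Name = "HARI""   <- the second double quote ends the string early.

# Mix the quote styles so the inner string survives intact:
condition = 'Name = "HARI"'

# Or keep double quotes outside and escape the inner ones:
condition_escaped = "Name = \"HARI\""

# Both spellings produce the same string value.
assert condition == condition_escaped
```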

Latest Reply
sher
Valued Contributor II
  • 1 kudos

thanks for sharing

