Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Etyr
by Contributor II
  • 10037 Views
  • 4 replies
  • 4 kudos

Resolved! Generate longer token for Databricks with Azure.

I'm using DefaultAzureCredential from azure-identity to connect to Azure with service principal environment variables (AZURE_CLIENT_SECRET, AZURE_TENANT_ID, AZURE_CLIENT_ID). I can call get_token for a Databricks-specific scope like this: from azure.id...

Latest Reply
Etyr
Contributor II
  • 4 kudos

I came up with an alternative solution. I wrote my own Python class to handle my PAT from Databricks: https://stackoverflow.com/questions/75071869/python-defaultazurecredential-get-token-set-expiration-or-renew-token/ You can be fancier or even register...

  • 4 kudos
3 More Replies
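The reply above describes wrapping token handling in a custom Python class. The linked class itself is not reproduced here. What follows is only a minimal sketch of the general idea, assuming a hypothetical `fetch` callable that returns a fresh token together with its expiry time. In the setup described in the post, such a callable might wrap `DefaultAzureCredential().get_token(...)`, whose result exposes `token` and `expires_on`. The names `Token`, `CachedToken`, and `refresh_margin` are illustrative and do not come from the thread.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Token:
    value: str
    expires_on: float  # Unix timestamp, in seconds


class CachedToken:
    """Return a cached token and fetch a new one shortly before it expires."""

    def __init__(
        self,
        fetch: Callable[[], Token],          # hypothetical: returns a fresh Token
        refresh_margin: float = 300.0,       # refresh this many seconds before expiry
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._fetch = fetch
        self._refresh_margin = refresh_margin
        self._clock = clock
        self._token: Optional[Token] = None

    def get(self) -> str:
        now = self._clock()
        # Fetch on first use, or when the remaining lifetime is inside the margin.
        if self._token is None or self._token.expires_on - now <= self._refresh_margin:
            self._token = self._fetch()
        return self._token.value
```

Callers would then ask for `cache.get()` before each request instead of holding on to a token string, so a long-running process picks up a renewed token without extra logic at each call site.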
Etyr
by Contributor II
  • 12970 Views
  • 3 replies
  • 2 kudos

Resolved! slow Fetching results by client in databricks SQL calling from Azure Compute Instance (AML)

I'm using `databricks-sql-connector` in Python 3.8 to connect to an Azure SQL Warehouse from inside an Azure Machine Learning Compute Instance. I have this large result query; looking at the `query history`, I check the time spent on doing the query, and se...

first_time_query
Latest Reply
Etyr
Contributor II
  • 2 kudos

So I ran a few tests. Since you said that the Databricks SQL driver wasn't made to retrieve that amount of data, I went with Spark. I fired up a small Spark cluster, the query was as fast as on SQL Warehouse, then I did a df.write.parquet("/my_path/...

  • 2 kudos
2 More Replies
Tacuma
by New Contributor II
  • 3248 Views
  • 4 replies
  • 1 kudos

Scheduling jobs with Airflow results in each task running multiple jobs.

Hey everyone, I'm experimenting with running containerized PySpark jobs in Databricks and orchestrating them with Airflow. I am, however, encountering an issue. When I trigger an Airflow DAG and look at the logs, I see that Airflow is spinni...

Latest Reply
Tacuma
New Contributor II
  • 1 kudos

Both, I guess? Yes, all jobs share the same config. The question I have is why, in the same Airflow task log, there are 3 job runs. I'm hoping that there's something in the configs that may give me some kind of clue.

  • 1 kudos
3 More Replies
databicky
by Contributor II
  • 3754 Views
  • 2 replies
  • 0 kudos

How to check a particular column value in a Spark DataFrame?

I want to check whether a particular column in a DataFrame contains zero. If it does not contain zero, the check needs to fail.

Latest Reply
MateuszLomanski
New Contributor II
  • 0 kudos

Use the agg method to check whether the count of rows where columnName equals 0 is equal to the total number of rows in the DataFrame, using the following code: df.agg(count("*").alias("total_count"),count(when(col("columnName")===0,1)).alias("zero_cou...

  • 0 kudos
1 More Replies
Jennifer
by New Contributor III
  • 3565 Views
  • 1 replies
  • 0 kudos

How do I update an aggregate table using a Delta live table

I am using Delta Live Tables to stream events, and I have a raw table for all the events and a downstream aggregate table. I need to add the new aggregated number to the downstream table aggregate column. But I didn't find any recipe talking abou...

Latest Reply
Jennifer
New Contributor III
  • 0 kudos

Maybe my code is correct already since I use dlt.read("my_raw_table") instead of delta.read_stream("my_raw_table"). So the col_aggr is recalculated completely every time my_raw_table is updated.

  • 0 kudos
Valon98
by New Contributor III
  • 15570 Views
  • 8 replies
  • 4 kudos

Resolved! During execution of a cell "RuntimeException: The python kernel is unresponsive."

Hi all, I am running a preprocessing step to create my training set and test set. Does anyone know why, during the execution, my cell gives the error "RuntimeException: The python kernel is unresponsive."? How can I solve it?

Latest Reply
Anonymous
Not applicable
  • 4 kudos

Hey there @Valerio Goretti, hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from...

  • 4 kudos
7 More Replies
DanielBarbosa
by New Contributor III
  • 9339 Views
  • 3 replies
  • 3 kudos

Running jobs using notebooks in a remote Azure DevOps Services (Repos) Git repository is generating "Notebook not found" error.

After reading the documentation, we looked into running jobs in Azure Databricks Workflows using source code from an Azure DevOps Services repository. The instructions in the documentation were followed and we configured the git info...

Latest Reply
Ulf
New Contributor II
  • 3 kudos

I have the same challenge when integrating with GitHub repos. However, I did not succeed by including '# Databricks notebook source' at the top of the Python files. Do you have any additional suggestions for solving this problem? @Vaibhav Sethi

  • 3 kudos
2 More Replies
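The reply above refers to the `# Databricks notebook source` marker, which is the first line of notebooks that Databricks exports as Python source files. As a rough sketch only, the helper below lists the `.py` files in a local checkout whose first line is not that marker. The function name `files_missing_header` and the idea of scanning a checkout are assumptions made for illustration, and the thread does not confirm that adding the marker resolves the "Notebook not found" error.

```python
from pathlib import Path
from typing import List

# First line of a notebook exported by Databricks in source format.
NOTEBOOK_HEADER = "# Databricks notebook source"


def files_missing_header(root: str) -> List[str]:
    """Return the .py files under root whose first line is not the notebook header."""
    missing = []
    for path in sorted(Path(root).rglob("*.py")):
        with path.open(encoding="utf-8") as handle:
            first_line = handle.readline().rstrip("\r\n")
        if first_line != NOTEBOOK_HEADER:
            missing.append(str(path))
    return missing
```

Running `files_missing_header(".")` from the repository root gives a quick list of files to inspect before pointing a job at them.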
Prabha
by New Contributor II
  • 3999 Views
  • 4 replies
  • 0 kudos

DataBricks_dataengineer_result

Hi Team, I successfully passed the Databricks Data Engineer Associate certification exam on 30th October 2022, but I still have not received the certificate. Please find the reference below. I registered for the exam with the email id vprabhakaran1987@...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Prabhakaran velusamy, hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from y...

  • 0 kudos
3 More Replies
rupiniravi
by New Contributor II
  • 2006 Views
  • 2 replies
  • 1 kudos

Screenshot_20221031-222242_Chrome

Hi Databricks Team, I completed my Databricks Certified Data Engineer Associate exam on Oct 30 but have not received the badge or certificate yet. I also raised a case but got no response. Could someone help? Userid: rupinisarguru128@gmail.com

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Rupini Ravichandran, hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from yo...

  • 1 kudos
1 More Replies
Own
by Contributor
  • 1455 Views
  • 2 replies
  • 0 kudos
Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Shubham Sharma, hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Tha...

  • 0 kudos
1 More Replies
Renga87
by New Contributor II
  • 3191 Views
  • 4 replies
  • 2 kudos

Clarification on Voucher code

Is a voucher code mandatory to register for the exam "Databricks Associate developer for Apache spark 3.0"? If so, how do I get it? Note: I created an account in the Databricks portal recently and the username is ksrenga87@gmail.com

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Rengaraja Sundararaj, hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from y...

  • 2 kudos
3 More Replies
Sinead
by New Contributor II
  • 3201 Views
  • 2 replies
  • 0 kudos

Resolved! Why do I keep getting locked out of my Community Edition Account?

I use Databricks Community Edition for college work and every time I try to log in, I find that I am locked out of my account, despite using the correct username and password. I get the message "You entered an invalid email or password, or your works...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Sinead Walsh, thank you for reaching out, and we’re sorry to hear about this log-in issue! We have this Community Edition login troubleshooting post on Community. Please take a look, and follow the troubleshooting steps. If the steps do not resol...

  • 0 kudos
1 More Replies
Cano
by New Contributor III
  • 5581 Views
  • 4 replies
  • 2 kudos

SQL warehouse failing to start ( Please check network connectivity from the data plane to the control plane )

Hi, my SQL warehouse is failing to start with the following error message: Details for the latest failure: Error: [id: InstanceId(i-01b84b6705ff09104), status: INSTANCE_INITIALIZING, workerEnvId:WorkerEnvId(workerenv-3023557811934763-c8cef827-a038-455...

Latest Reply
Debayan
Databricks Employee
  • 2 kudos

Hi, there is a line in the attached logs as below:
[Bootstrap Event] Can reach ohio.cloud.databricks.com: [FAILED]
[Bootstrap Event] DNS output for databricks-prod-artifacts-us-east-2.s3.us-east-2.amazonaws.com: Server: 10.187.0.2 Address: 10.187.0.2#5...

  • 2 kudos
3 More Replies
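The bootstrap log quoted in the reply above reports a failed reachability check for `ohio.cloud.databricks.com` followed by DNS output. A comparable check can be sketched with the Python standard library and run from a machine in the same network as the warehouse nodes. This is only an illustration: the helper names `resolve` and `can_reach` and the choice of port 443 are assumptions, and the sketch tests plain DNS resolution and TCP connectivity, not TLS, proxies, or whatever the Databricks bootstrap script checks internally.

```python
import socket
from typing import List


def resolve(host: str) -> List[str]:
    """Return the sorted, de-duplicated IP addresses for host, or [] if DNS fails."""
    try:
        infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


def can_reach(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

For the case in the post, `resolve("ohio.cloud.databricks.com")` followed by `can_reach("ohio.cloud.databricks.com")` would show whether the failure is at the DNS step or at the connection step, which helps narrow down which network setting to inspect.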
dheeraj2444
by New Contributor II
  • 4031 Views
  • 3 replies
  • 0 kudos

I am trying to write a data frame to Kafka topic with Avro schema for key and value using a schema registry URL. The to_avro function is not writing t...

I am trying to write a data frame to Kafka topic with Avro schema for key and value using a schema registry URL. The to_avro function is not writing to the topic and throwing an exception with code 40403 something. Is there an alternate way to do thi...

Latest Reply
Debayan
Databricks Employee
  • 0 kudos

Hi, could you please refer to https://github.com/confluentinc/kafka-connect-elasticsearch/issues/59 and let us know if this helps?

  • 0 kudos
2 More Replies
ackerman_chris
by New Contributor III
  • 3734 Views
  • 4 replies
  • 0 kudos

Resolved! Databricks Lakehouse Fundamentals Badge Not Found

Hello, I've successfully completed the Databricks Lakehouse Fundamentals and am looking to find where the badge is. I found this post here. But I haven't received an email on my completion from <service.accredible.email@databricks.com> yet. I successfull...

Latest Reply
ackerman_chris
New Contributor III
  • 0 kudos

Thank you all for the great responses. I eventually received the badge; it took 30+ minutes to arrive, but I finally did get the email notification. I will mark this post as resolved.

  • 0 kudos
3 More Replies