Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

Kaijser
by New Contributor II
  • 3471 Views
  • 3 replies
  • 1 kudos

Logging clogged up with error messages (OSError: [Errno 95] Operation not supported, --- Logging error ---)

I have encountered this issue for a while now and it happens on each run that is triggered. I discovered 2 things: 1) If I run my script on a cluster that is not active and the cluster is activated by a scheduled trigger (not manually!) this doesn't happ...

Latest Reply
manasa
Contributor
  • 1 kudos

Hi @Aaron Kaijser Are you able to write your logfile to ADLS? If yes, could you please explain how you did it?

2 More Replies
Retko
by Contributor
  • 5025 Views
  • 2 replies
  • 3 kudos

Resolved! How to quickly check if Delta Table is Empty

Hi, I need some quick way to return True if a Delta table is empty. I tried these, but they are quite slow when checking more tables:
spark.read.table("table_name").count()
spark.read.table("table_name").rdd.isEmpty()
len(spark.read.table("table_name").head(1)) =...
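One possible way to speed this up, sketched under assumptions rather than taken from the thread: ask Delta for table metadata first, and only read data when data files exist. The function name `is_delta_table_empty` is hypothetical, `spark` is assumed to be an active SparkSession (as in a Databricks notebook), and the sketch assumes `DESCRIBE DETAIL` reports a `numFiles` column for the table.

```python
def is_delta_table_empty(spark, table_name: str) -> bool:
    """Return True if the Delta table has no rows.

    Hypothetical helper: `spark` is assumed to be an active SparkSession and
    `table_name` a trusted identifier (it is interpolated into SQL as-is).
    """
    # Metadata-only check: a Delta table with zero data files has no rows.
    detail = spark.sql(f"DESCRIBE DETAIL {table_name}").select("numFiles").first()
    if detail is not None and detail[0] == 0:
        return True
    # Data files exist, so fetch at most one row to confirm there is data.
    return len(spark.read.table(table_name).limit(1).take(1)) == 0
```

Usage would look like `is_delta_table_empty(spark, "table_name")`. The fallback still launches a small Spark job, so the saving comes only from tables that have no data files at all.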

Latest Reply
Vartika
Moderator
  • 3 kudos

Hi @Retko Okter Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best? If not, please tell us so we can help you. Thanks!

1 More Replies
bharathi
by New Contributor
  • 1142 Views
  • 2 replies
  • 1 kudos

Hive database

The Hive database and tables created in my workspace are not visible to other users when they try to access the Databricks workspace created at our workplace.

Latest Reply
Vartika
Moderator
  • 1 kudos

Hi @bharathi vish Thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs. Please help us select the best solution by clicking on "Select As Best" if it does. Your feedbac...

1 More Replies
uzairm
by New Contributor III
  • 4939 Views
  • 2 replies
  • 1 kudos

My whole code is running on the driver node. I want my code to run on worker nodes so that the memory of the driver node is not exhausted. Please suggest improvements to my code. Spark crashes frequently when the data pulled from S3 is huge.

I am running a process which has 4 steps.
1. Querying S3 file paths from DynamoDB based on certain parameters given by the user (the function to do so is provided by the client, I just have to import it). Returns a list of files.
2. Check if those file paths have already been qu...

Latest Reply
Vartika
Moderator
  • 1 kudos

Hi @uzair mustafa Thank you for posting your question in our community! We are happy to assist you. Does @Suteja Kanuri's answer help? If it does, would you be happy to mark it as best? This will help other community members who may have similar ques...

1 More Replies
Joao_DE
by New Contributor III
  • 1865 Views
  • 2 replies
  • 0 kudos

Run pytest inside repos and store the results in dbfs

Hi everyone! I am trying to run pytest inside a notebook on Repos and store the results in DBFS, but I am getting a "permission denied" error. Does anyone know why this happens and what the solution is? Error:

Latest Reply
Vartika
Moderator
  • 0 kudos

Hi @João Peixoto Hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so ...

1 More Replies
pvignesh92
by Honored Contributor
  • 8030 Views
  • 4 replies
  • 2 kudos

Resolved! Pls restrict Spamming

Hi @Vidula Khanna, recently there have been too many spam posts in the community discussions. I'm sure you might have noticed them. Is there any chance to clear all of them and maybe restrict them in some way so that the purpose of this community...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 2 kudos

@Suteja Kanuri Using Azure OpenAI GPT4, we could connect it to the community and use 2 features of it:
1. Ask the question "is it spam?" and verify the user's post this way.
2. Display ready answers by OpenAI so we will avoid asking over and over again duplic...

3 More Replies
pandu
by New Contributor II
  • 2138 Views
  • 2 replies
  • 3 kudos

connect to Oracle database using JDBC and perform merge condition

I would like to connect to an Oracle database using the JDBC driver and write code to perform a merge using Python.

Latest Reply
Vartika
Moderator
  • 3 kudos

Hi @Venkata Krishna Jonnalagadda Hope you are well. Just checking in. If @John Lourdu's answer helped, would you let us know and mark the answer as best? If not, would you be happy to give us more information? Thanks!

1 More Replies
William_Scardua
by Valued Contributor
  • 1118 Views
  • 2 replies
  • 1 kudos

How to get executors info by SDK (Python)

Hi guys, how can I get executor information for my cluster using the SDK (Python)? Any ideas? Thank you

Latest Reply
Vartika
Moderator
  • 1 kudos

Hi @William Scardua We haven't heard from you since the last response from @josephk, and I was checking back to see if it helped you. Otherwise, if you have any solution, please share it with the community, as it can be helpful to others. Also, please d...

1 More Replies
Anonymous
by Not applicable
  • 7301 Views
  • 4 replies
  • 4 kudos

How to create a new group in the Databricks community?

How to create a new group in the Databricks community? Dear esteemed community users, It is with great pleasure that we inform you of an important update regarding the creation of Groups on Community. As part of our continuous efforts to enhance your e...

Latest Reply
Anonymous
Not applicable
  • 4 kudos

Hi @Hubert Dudek and @Ratna Chaitanya Raju Bandaru: Thanks for pointing this out. This is going to be a design decision, which we will make after looking into the ask carefully. Thanks for getting the conversation going. This really helps us.

3 More Replies
JordiDekker
by New Contributor III
  • 2687 Views
  • 5 replies
  • 6 kudos

StreamCorruptedException, databricks-connect 9.1

Last week, around the 21st of March, we started having issues with databricks-connect (DBR 9.1 LTS). "databricks-connect test" works, but the following code snippet:
from pyspark.sql import SparkSession
spark = SparkSession.builder.getOrCreate()
s...

Latest Reply
Anonymous
Not applicable
  • 6 kudos

Hi @Jordi Dekker Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers ...

4 More Replies
Gk
by New Contributor III
  • 3326 Views
  • 2 replies
  • 1 kudos

DataFrame

How can we create an empty DataFrame in Databricks, and in how many ways can we create a DataFrame?

Latest Reply
Vartika
Moderator
  • 1 kudos

Hi @Govardhana Reddy Hope everything is going great. Does @Suteja Kanuri's answer help? If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so we can help you. Cheers!

1 More Replies
tlbarata
by New Contributor II
  • 2230 Views
  • 3 replies
  • 1 kudos

Outdated - Databricks Data Engineer associate v2 lesson DE 4.2

While following the video lesson and executing notebook 4.2, I noticed that the CREATE TABLE "users_jdbc" command generates an EXTERNAL table, while the video, and the notebook too, suggest it is a Managed table. Here are some printscr...

[Screenshots: 1 - Create Table; Describe extended command; Describe command from video lesson]
Latest Reply
Vartika
Moderator
  • 1 kudos

Hi @Tiago Barata Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Thank...

2 More Replies
alejandrofm
by Valued Contributor
  • 2682 Views
  • 4 replies
  • 0 kudos

AppendDataExecV1 Taking a lot of time

Hi, I have a PySpark job that takes about an hour to complete. When looking at the SQL tab in the Spark UI, I see this: Those processes run for more than 1 minute on a 60-minute process. This is Ganglia for that period (the last snapshot, will look into a l...

Latest Reply
Vartika
Moderator
  • 0 kudos

Hi @Alejandro Martinez​ Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you...

3 More Replies
ghofigjong
by New Contributor
  • 5685 Views
  • 2 replies
  • 1 kudos

Resolved! How does partition pruning work on a merge into statement?

I have a Delta table that is partitioned by Year, Date and month. I'm trying to merge data into it on all three partition columns plus an extra column (an ID). My merge statement is below:
MERGE INTO delta.<path of delta table> oldData using df newData ...

Latest Reply
Umesh_S
New Contributor II
  • 1 kudos

Isn't the suggested idea only filtering the input dataframe (resulting in a smaller amount of data to match across the whole Delta table) rather than pruning the Delta table to the relevant partitions to scan?

1 More Replies
Anonymous
by Not applicable
  • 8364 Views
  • 3 replies
  • 14 kudos

Resolved! "No suitable driver" error when configuring the Databricks ODBC and JDBC drivers

Hi all, I've just encountered this issue. I launched a MySQL database in AWS RDS and then used this simple code to create a connection to it, but it fails with this error. Is there any additional step? Or could anyone take a look at i...

Latest Reply
Jag
New Contributor III
  • 14 kudos

Hello, it looks like an issue with the JDBC URL. I was facing the same issue when trying to access an Azure SQL database, so I created the JDBC URL as below and it went well:
jdbc:sqlserver://<serverurl>:1433;database=<databasename>;user=<username>@<serve...
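Since a malformed URL is one plausible cause of a "No suitable driver" error, a small helper that assembles the URL from its parts can make typos easier to spot. This is a minimal sketch, not the reply author's code: `build_jdbc_url` is a hypothetical name, the host and database names in the usage note are placeholders, and it assumes credentials are supplied separately (for example as `user` and `password` options on the JDBC reader) rather than embedded in the URL.

```python
# Default ports for the two dialects covered by this sketch.
_DEFAULT_PORTS = {"sqlserver": 1433, "mysql": 3306}


def build_jdbc_url(dialect: str, host: str, database: str, port=None) -> str:
    """Assemble a JDBC URL for SQL Server or MySQL from its components."""
    if dialect not in _DEFAULT_PORTS:
        raise ValueError(f"unsupported dialect: {dialect!r}")
    port = port or _DEFAULT_PORTS[dialect]
    if dialect == "sqlserver":
        # SQL Server uses semicolon-separated properties after host:port.
        return f"jdbc:sqlserver://{host}:{port};database={database}"
    # MySQL puts the database name in the URL path.
    return f"jdbc:mysql://{host}:{port}/{database}"
```

For instance, `build_jdbc_url("mysql", "db.example.com", "shop")` returns `jdbc:mysql://db.example.com:3306/shop`. A correct URL alone is not enough if the matching driver library is not available on the cluster.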

2 More Replies
