Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

naveenprabhun
by New Contributor III
  • 4343 Views
  • 2 replies
  • 3 kudos

Resolved! Unable to read data from ElasticSearch using Databricks (AWS) Cannot detect ES version - Caused by: org.elasticsearch.hadoop.rest.EsHadoopNoNodesLeftException: Connection error (check network and/or proxy settings)- all nodes failed; tried [IP:PORT]

I am trying to read data from ElasticSearch (ES Version 8.5.2) using PySpark on Databricks (13.0 (includes Apache Spark 3.4.0, Scala 2.12)). The ecosystem is on AWS. I am able to run a curl command on the Databricks notebook to the ES ip:port and fetch...

Latest Reply
Hoviedo
New Contributor III
  • 3 kudos

I have the same problem. Did you find a solution? Thanks.

  • 3 kudos
1 More Replies
Anonymous
by Not applicable
  • 842 Views
  • 0 replies
  • 0 kudos

Dear Community, Get ready to mark your calendars for the upcoming Databricks Community Social event! Happening on June 16th, 2023, this event promises to be the ultimate monthly gathering for everyone in the Databricks Community. Join us for an hour ...

Anonymous
by Not applicable
  • 577 Views
  • 0 replies
  • 0 kudos

Dear Community, Have you enrolled in the new Large Language Model courses with edX yet? As Large Language Model (LLM) applications disrupt countless industries, generative AI is becoming an important foundational technology. The demand for LLM-based...

MohamedThanveer
by New Contributor II
  • 998 Views
  • 1 replies
  • 0 kudos

Databricks Certified Associate Developer for Apache Spark 3.0 - Python Cancellation

I scheduled an examination for 1st June 2023 and, for personal reasons, cancelled it on 26th May 2023 (more than 72 hours in advance), but I am yet to receive the refund. The auto-generated mail mentions that the refund ...

Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

Adding @Suteja Kanuri and @Vidula Khanna for visibility.

  • 0 kudos
Ovi
by New Contributor III
  • 1954 Views
  • 1 replies
  • 0 kudos

Spark Dataframe write to Delta format doesn't create a _delta_log

Hello everyone, I have an intermittent issue when trying to create a Delta table for the first time in Databricks: all the data gets converted into parquet at the specified location but the _delta_log is not created or, if created, it's left empty, t...

Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

Can you list (display) the folder location "deltaLocation"? What files do you see there? Have you tried using a new location for testing? Do you get the same behavior?

  • 0 kudos
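
The check suggested in the reply can be scripted. The following is only a minimal sketch, assuming the table location is reachable as a local path (for example through the `/dbfs` mount). The function name `inspect_delta_location` is hypothetical. It relies on Delta commits being stored as zero-padded `.json` files inside `_delta_log`.

```python
from pathlib import Path


def inspect_delta_location(location: str) -> dict:
    """Summarize what is present at a locally mounted Delta table path."""
    root = Path(location)
    log_dir = root / "_delta_log"
    # Delta commits are JSON files named with a zero-padded version number.
    commits = sorted(p.name for p in log_dir.glob("*.json")) if log_dir.is_dir() else []
    # Data files may sit in partition subdirectories, so search recursively,
    # but skip checkpoint parquet files that live inside _delta_log.
    data_files = [
        p for p in root.rglob("*.parquet")
        if "_delta_log" not in p.relative_to(root).parts
    ]
    return {
        "has_log_dir": log_dir.is_dir(),
        "commit_files": commits,
        "parquet_files": len(data_files),
    }
```

A result with `parquet_files` greater than zero but an empty `commit_files` list would match the symptom described in the post: data written, but no usable transaction log.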
Torlynet
by New Contributor III
  • 1353 Views
  • 1 replies
  • 1 kudos

Can't access databricks

I was updating some scripts when I all of a sudden got a few "internal server errors". I refreshed the webpage a couple of times and now I am unable to log in to Databricks. When I try to sign in, it thinks for a few seconds and then I am rerouted back...

Latest Reply
jose_gonzalez
Databricks Employee
  • 1 kudos

On what date and at what time did this issue happen? Are you still unable to access your workspace?

  • 1 kudos
Nis
by New Contributor II
  • 1499 Views
  • 1 replies
  • 2 kudos

Best sequence for using the VACUUM, OPTIMIZE, FSCK REPAIR and REFRESH commands

I have a Delta table whose size increases gradually; it now has around 1.5 crore (15 million) rows. While running the VACUUM command on that table, I get the error below. ERROR: Job aborted due to stage failure: Task 7 in stage 491.0 failed 4 times, most...

Latest Reply
jose_gonzalez
Databricks Employee
  • 2 kudos

Do you have access to the Executor 7 logs? Is there high GC activity or some other event that is causing the heartbeat timeout? Would you be able to check the failed stages?

  • 2 kudos
Kayla
by Valued Contributor
  • 2663 Views
  • 1 replies
  • 1 kudos

Resolved! Pydoc / Documentation Module

Does anyone have a recommendation for something along the lines of Pydoc that can be used to aggregate docstrings and the like into documentation pages? I tried Pydoc and it failed because of the magic commands in my repo.

Latest Reply
User16539034020
Databricks Employee
  • 1 kudos

Sphinx could be an option here. It can parse and render Databricks notebooks in the documentation. You might want to look into it to see if it fits your needs. However, it may not handle magic commands very well, and it assumes your notebooks are export...

  • 1 kudos
kfoster
by Contributor
  • 1901 Views
  • 2 replies
  • 3 kudos

Terraform Global Init Script base64encoding

I am working on converting manual global init scripts into a Terraform IaC process for multiple environments. Within Terraform, we are using the resource "databricks_global_init_script" and set the content_base64 with the following: base64encoded(<<-...

Latest Reply
Atanu
Databricks Employee
  • 3 kudos

I am looking into it, @Kristian Foster. Are you able to get it working?

  • 3 kudos
1 More Replies
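
When debugging this kind of setup, it can help to confirm independently what the base64 form of a script should look like. Terraform's built-in function for this is named `base64encode`. The sketch below is not Terraform; it is a small Python round-trip, with hypothetical helper names, for comparing against the value that ends up in `content_base64`.

```python
import base64


def encode_init_script(script: str) -> str:
    """Return the base64 text form of an init script (UTF-8 assumed)."""
    return base64.b64encode(script.encode("utf-8")).decode("ascii")


def decode_init_script(content_base64: str) -> str:
    """Decode a base64 value back into the script text for inspection."""
    return base64.b64decode(content_base64).decode("utf-8")


# Hypothetical example script, not taken from the original post.
example = "#!/bin/bash\necho hello\n"
encoded = encode_init_script(example)
print(encoded)
print(decode_init_script(encoded) == example)
```

Decoding the value stored in the workspace and diffing it against the source script shows whether whitespace or heredoc indentation changed along the way.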
HanaSega_97455
by New Contributor II
  • 7865 Views
  • 2 replies
  • 3 kudos

Resolved! drop specific partition from a Delta Table

I have a Delta table partitioned by a Date column. I'm trying to use the ALTER TABLE DROP PARTITION command but get the error "`ALTER TABLE DROP PARTITION` is not supported for Delta tables". Is there a way to do it?

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Hanan Segal, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers y...

  • 3 kudos
1 More Replies
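
Since `ALTER TABLE ... DROP PARTITION` is rejected for Delta tables, a commonly used alternative is a `DELETE` restricted to the partition column. The sketch below only builds the statement; the table name `events` is hypothetical, and the column name `Date` follows the post. It is not necessarily the answer accepted in the thread.

```python
from datetime import date


def delete_partition_sql(table: str, partition_col: str, day: date) -> str:
    """Build a DELETE that removes every row of one date partition."""
    # isoformat() yields YYYY-MM-DD, the form Spark SQL expects in a DATE literal.
    return f"DELETE FROM {table} WHERE `{partition_col}` = DATE '{day.isoformat()}'"


statement = delete_partition_sql("events", "Date", date(2023, 6, 1))
print(statement)
# In a Databricks notebook this could then be run with: spark.sql(statement)
```

The delete is recorded in the transaction log; the underlying data files are only physically removed by a later `VACUUM`.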
KiranKondamadug
by New Contributor II
  • 4405 Views
  • 1 replies
  • 2 kudos

Running into delta.exceptions.ConcurrentAppendException even after setting up S3 Multi-Cluster Writes environment via S3 Dynamo DB LogStore

My use case is to process a dataset with hundreds of partitions concurrently. The data is partitioned, and the partitions are disjoint. I was facing ConcurrentAppendException due to S3 not supporting the “put-if-absent” consistency guarantee. From Delta Lake ...

Latest Reply
Debayan
Databricks Employee
  • 2 kudos

Hi, you can refer to https://docs.databricks.com/optimizations/isolation-level.html#conflict-exceptions and recheck whether everything is set up correctly. Please let us know if this helps. Also, please tag @Debayan in your next response, which will notify me. Th...

  • 2 kudos
etsyal1e2r3
by Honored Contributor
  • 9791 Views
  • 1 replies
  • 2 kudos

Resolved! Compiling Flattened Dataframe back to Struct Columns

I have a dataframe with this format of columns: [`first.second.third` , `alpha.bravo.test1` , `alpha.bravo.test2`] I'd like to get an output dataframe of this: [ `first` | `alpha` ] ---------------...

Latest Reply
etsyal1e2r3
Honored Contributor
  • 2 kudos

I have figured out the solution.

  • 2 kudos
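
The poster's own solution is not shown in the thread. One possible approach, sketched here under assumptions, is to group the dotted column names into a tree and render one Spark SQL `named_struct` expression per top-level name. All function names are hypothetical, and the sketch assumes no name is both a leaf and a parent (for example `a` together with `a.b`).

```python
def build_tree(columns):
    """Group dotted column names into a nested dict; leaves hold the full flat name."""
    tree = {}
    for name in columns:
        node = tree
        *parents, leaf = name.split(".")
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = name
    return tree


def to_expr(node):
    """Render a subtree as a Spark SQL expression string."""
    if isinstance(node, str):
        return f"`{node}`"  # leaf: reference the original flat column
    fields = ", ".join(f"'{key}', {to_expr(child)}" for key, child in node.items())
    return f"named_struct({fields})"


def struct_select_exprs(columns):
    """Return one select expression per top-level name."""
    return [f"{to_expr(child)} AS `{top}`" for top, child in build_tree(columns).items()]


flat_columns = ["first.second.third", "alpha.bravo.test1", "alpha.bravo.test2"]
for expr in struct_select_exprs(flat_columns):
    print(expr)
```

With a dataframe `df` holding those flat columns, the resulting list could be passed as `df.selectExpr(*struct_select_exprs(df.columns))`.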
fijoy
by Contributor
  • 2053 Views
  • 3 replies
  • 0 kudos

Is there a utility to convert between "/dbfs" and "dbfs:" path strings?

Is there a built-in utility function, e.g., in dbutils, that can convert between path strings that start with "dbfs:" and "/dbfs"? Some operations, e.g., copying from one location in DBFS to another using dbutils.fs.cp() expect the path starting with "/db...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Fijoy Vadakkumpadan, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best a...

  • 0 kudos
2 More Replies
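
Whether a built-in helper exists is not settled in the excerpt above, but the conversion itself is plain string handling. A minimal sketch with hypothetical function names, assuming `dbfs:/...` is the URI form and `/dbfs/...` is the local mount form of the same location:

```python
def to_fuse_path(path: str) -> str:
    """Convert 'dbfs:/tmp/x' to '/dbfs/tmp/x'; pass '/dbfs/...' through unchanged."""
    if path.startswith("dbfs:/"):
        # lstrip also tolerates the 'dbfs:///tmp/x' spelling.
        return "/dbfs/" + path[len("dbfs:/"):].lstrip("/")
    if path.startswith("/dbfs/"):
        return path
    raise ValueError(f"not a DBFS path: {path!r}")


def to_dbfs_uri(path: str) -> str:
    """Convert '/dbfs/tmp/x' to 'dbfs:/tmp/x'; pass 'dbfs:/...' through unchanged."""
    if path.startswith("/dbfs/"):
        return "dbfs:/" + path[len("/dbfs/"):]
    if path.startswith("dbfs:/"):
        return path
    raise ValueError(f"not a DBFS path: {path!r}")
```

Raising on anything else, rather than guessing, keeps ordinary local paths such as `/tmp/x` from being silently rewritten.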
Jujiro
by New Contributor III
  • 8562 Views
  • 11 replies
  • 7 kudos

Random error: At least one column must be specified for the table?

I have the following code in a notebook. It is randomly giving me the error, "At least one column must be specified for the table." The error occurs (if it occurs at all) only on the first run after attaching to a cluster. Cluster details: Summary 5-1...

Latest Reply
Harold
New Contributor II
  • 7 kudos

Please check whether this setting helps: spark.databricks.delta.catalog.update.enabled false

  • 7 kudos
10 More Replies
LidorAbo
by New Contributor II
  • 6425 Views
  • 1 replies
  • 1 kudos

Bucket ownership of S3 bucket in Databricks

We had a Databricks job with strange behavior: when we passed the string 'output_path' to the saveAsTextFile function instead of the output_path variable, the data was saved to the following path: s3://dev-databricks-hy1-rootbucket/nvirginiaprod/3219117805926709/output_pa...

Latest Reply
User16752239289
Databricks Employee
  • 1 kudos

I suspect you provided a DBFS path to save the data, hence the data was saved under your workspace root bucket. For the workspace root bucket, the Databricks workspace will interact with the Databricks credential to make sure Databricks has access to it and able t...

  • 1 kudos
