Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

JordanYaker
by Contributor
  • 7040 Views
  • 7 replies
  • 8 kudos

Resolved! Is anyone else experiencing intermittent "Failure starting REPL" errors with PySpark Jobs?

I have a multi-task job that runs a bunch of PySpark notebooks, and about 30-60% of the time my jobs fail with the following error (screenshot in the original post). I haven't seen any consistency with this error. I've had as many as all of the tasks in the job giving this error...

Latest Reply
James_Cole
New Contributor III
  • 8 kudos

Hi. Did you ever get a resolution to this problem other than rolling back to 10.4? I have recently moved some workloads over to runtime 11.3 and am experiencing intermittent "repl did not start in 30 seconds." errors. I have increased the repl timeout...

6 More Replies
andreas9898
by New Contributor II
  • 3650 Views
  • 3 replies
  • 5 kudos

Getting error with spark-sftp, no such file

In a Databricks cluster with Scala 2.1.1 I am trying to read a file into a Spark data frame using the following code: val df = spark.read.format("com.springml.spark.sftp").option("host", "*").option("username", "*").option("password", "*")...

Latest Reply
Anonymous
Not applicable
  • 5 kudos

Hi @Andreas P, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thanks!

2 More Replies
KVNARK
by Honored Contributor II
  • 6584 Views
  • 8 replies
  • 28 kudos

Resolved! Can we use Databricks, or write code in Databricks, without learning PySpark in depth for ETL and data engineering purposes?

Can we use Databricks, or write code in Databricks, without learning PySpark in depth for ETL and data engineering purposes? Can someone shed some light on this? Currently learning PySpark (basics of Python in handling the data) a...

Latest Reply
KVNARK
Honored Contributor II
  • 28 kudos

Thanks All for your valuable suggestions!

7 More Replies
alxsbn
by Contributor
  • 1112 Views
  • 0 replies
  • 2 kudos

Terraform x Databricks error INVALID_STATE subscription disabled

Hello, I just bootstrapped a new Databricks EC2 on an AWS account with Terraform. The prior dependencies seem OK on my side (network, root storage, credentials configuration). I'm referring mainly to this guide and of course pages related to each Databrick...

VN11111
by New Contributor III
  • 11204 Views
  • 5 replies
  • 6 kudos

Resolved! ERROR: Some streams terminated before this command could finish!

I have a Databricks notebook that reads a stream from Azure Event Hub. My code does the following:
1. Configure path for Eventhubs
2. Read Stream
df_read_stream = (spark.readStream.format("eventhubs").options(**conf)...

Latest Reply
guru1
New Contributor II
  • 6 kudos

I am also facing the same issue, using cluster 11.3 LTS (includes Apache Spark 3.3.0, Scala 2.12) with library com.microsoft.azure:azure-eventhubs-spark_2.12:2.3.21. Please help me with the same.
conf = {}
conf["eventhubs.connectionString"] = "Endpoint=sb://xxxx.ser...

4 More Replies
huyd
by New Contributor III
  • 1626 Views
  • 0 replies
  • 4 kudos

Optimizing a batch load process, reading with the JDBC driver

I am doing a batch load from a database table using the JDBC driver. I am noticing in the Spark UI that there is both memory and disk spill, but only on one executor. I am also noticing that when trying to use the JDBC parallel read, it seems to run sl...
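
For reference, Spark's JDBC source only splits a read into parallel range queries when `partitionColumn`, `lowerBound`, `upperBound`, and `numPartitions` are set together; without them the table is read through a single connection, which could be consistent with spill on just one executor. Below is a minimal sketch of assembling those options. The URL, table, column name, bounds, and the helper name `jdbc_partitioned_options` are hypothetical placeholders, not values from the post.

```python
def jdbc_partitioned_options(url, table, column, lower, upper,
                             num_partitions, fetchsize=10_000):
    """Build the option map for a range-partitioned Spark JDBC read.

    All arguments are illustrative; pick a numeric, date, or timestamp
    column whose values are spread evenly between `lower` and `upper`.
    """
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    if lower >= upper:
        raise ValueError("lower must be smaller than upper")
    return {
        "url": url,
        "dbtable": table,
        "partitionColumn": column,
        "lowerBound": str(lower),
        "upperBound": str(upper),
        "numPartitions": str(num_partitions),
        "fetchsize": str(fetchsize),  # rows fetched per round trip
    }


# Hypothetical values for illustration only.
opts = jdbc_partitioned_options(
    url="jdbc:postgresql://db.example.com:5432/sales",
    table="public.orders",
    column="order_id",
    lower=1,
    upper=8_000_000,
    num_partitions=8,
)
print(opts)

# In a notebook where `spark` is already defined:
# df = spark.read.format("jdbc").options(**opts).load()
```

The bounds only decide how the range is sliced, not which rows are read, so a skewed partition column can still leave most rows in one partition.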

magnus778
by New Contributor III
  • 2774 Views
  • 2 replies
  • 4 kudos

Resolved! Error writing parquet to specific container in Azure Data Lake

I'm retrieving two files from container1, transforming and merging them, and then writing to container2 within the same Storage Account in Azure. I'm mounting container1, then unmounting it and mounting container2 before writing. My code for writing the parqu...

Latest Reply
Pat
Honored Contributor III
  • 4 kudos

Hi @Magnus Asperud,
1. mounting container1
2. you should persist the data somewhere; creating a df doesn't mean that you are reading data from the container and have it accessible after unmounting. Make sure to store this merged data somewhere. Not sure if th...

1 More Replies
Constantino
by New Contributor III
  • 2773 Views
  • 3 replies
  • 4 kudos

Resolved! cannot list all tokens with account admin

I'm trying to list all tokens (both user and service principal) for a given workspace. Using an account-level admin, I've tried both the CLI and the API endpoint to list tokens; however, each time only the admin's tokens are returned. I've confi...
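
For context, the Databricks REST API has two token-listing endpoints: `/api/2.0/token/list` returns the calling user's own tokens, while `/api/2.0/token-management/tokens` is the admin-oriented one. Whether switching endpoints is what resolved this thread is an assumption, since the preview is cut off. A minimal sketch of building (not sending) such a request, where the workspace host, the token value, and the helper name `build_list_tokens_request` are hypothetical:

```python
from urllib.request import Request


def build_list_tokens_request(host, admin_token):
    """Build a GET request for the token-management listing endpoint.

    `host` and `admin_token` are placeholders; nothing is sent here.
    """
    url = f"{host.rstrip('/')}/api/2.0/token-management/tokens"
    return Request(
        url,
        headers={"Authorization": f"Bearer {admin_token}"},
        method="GET",
    )


# Hypothetical workspace URL and token, for illustration only.
req = build_list_tokens_request(
    "https://example.cloud.databricks.com/", "dapi-PLACEHOLDER"
)
print(req.get_method(), req.full_url)

# To actually send it: urllib.request.urlopen(req) and parse the JSON body.
```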

Latest Reply
Pat
Honored Contributor III
  • 4 kudos

Great that I could help

2 More Replies
ckwan48
by New Contributor III
  • 4865 Views
  • 2 replies
  • 4 kudos

Date schema issues with pyspark dataframe creation

I'm having some issues with creating a dataframe with a date column. Could I know what is wrong?
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType
from pyspark.sql.types import DateType, FloatType
spark = SparkSession.bui...

Latest Reply
ckwan48
New Contributor III
  • 4 kudos

Hi @Kaniz Fatma, I actually changed the date format to 'M/d/Y' and it didn't throw any errors. I found that my csv file had dates like '3/1/2022'. Could that be the issue? But some dates were also like '12/1/2022'. So I'm kind of confused.
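
Both '3/1/2022' and '12/1/2022' are month/day/year strings that differ only in zero-padding, so a single pattern that tolerates unpadded fields can cover both. The sketch below shows this in plain Python with `datetime.strptime`. It is an illustration outside Spark; the equivalent Spark pattern (likely `M/d/yyyy`) is an assumption to check against the Spark datetime pattern documentation, and `parse_mdy` is a hypothetical helper name.

```python
from datetime import date, datetime


def parse_mdy(text):
    """Parse a month/day/year string whose month and day may be unpadded."""
    return datetime.strptime(text, "%m/%d/%Y").date()


# The two sample values quoted in the post.
samples = ["3/1/2022", "12/1/2022"]
parsed = [parse_mdy(s) for s in samples]
print(parsed)
```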

1 More Replies
