Data Engineering

Forum Posts

avnish26
by New Contributor III
  • 7231 Views
  • 4 replies
  • 8 kudos

Spark 3.3.0 Kafka connection problem

I am trying to connect to my Kafka from Spark but am getting an error. Kafka version: 2.4.1; Spark version: 3.3.0. I am using a Jupyter notebook to execute the PySpark code below:
```
from pyspark.sql.functions import *
from pyspark.sql.types import *
#import libr...
```

Latest Reply
jose_gonzalez
Moderator

Hi @avnish26, did you add the JAR files to the cluster? Do you still have issues? Please let us know.

  • 8 kudos
3 More Replies
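
A frequent cause of this failure outside Databricks is starting PySpark without the Kafka connector on the classpath. A minimal sketch (not from the thread), assuming a local Jupyter setup; the package coordinates follow the versions in the post (Scala 2.12, Spark 3.3.0), and the broker address and topic name are placeholders:

```
from pyspark.sql import SparkSession

# Pull the Kafka source at session startup; without it,
# readStream.format("kafka") fails with "Failed to find data source: kafka".
spark = (SparkSession.builder
         .appName("kafka-connect-test")
         .config("spark.jars.packages",
                 "org.apache.spark:spark-sql-kafka-0-10_2.12:3.3.0")
         .getOrCreate())

df = (spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092")  # placeholder broker
      .option("subscribe", "my_topic")                      # placeholder topic
      .option("startingOffsets", "earliest")
      .load())
```

On Databricks itself, the equivalent step is installing the connector as a cluster library, which is what the reply's JAR question is getting at.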
Jayanth746
by New Contributor III
  • 9806 Views
  • 10 replies
  • 4 kudos

Kafka unable to read client.keystore.jks.

Below is the error we have received when trying to read the stream:
Caused by: kafkashaded.org.apache.kafka.common.KafkaException: Failed to load SSL keystore /dbfs/FileStore/Certs/client.keystore.jks
Caused by: java.nio.file.NoSuchFileException: /dbfs...

Latest Reply
mwoods
New Contributor III

OK, scrub that: the problem in my case was that I was using the 14.0 Databricks Runtime, which appears to have a bug relating to abfss paths here. Switching back to the 13.3 LTS release resolved it for me. So if you're in the same boat finding abfss...

  • 4 kudos
9 More Replies
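
For readers hitting the same NoSuchFileException: the Kafka client resolves keystore paths as plain local files, so on Databricks the keystore is normally referenced through the /dbfs FUSE mount. A hedged sketch with placeholder broker, secret scope, and key names; note that in this thread the actual culprit turned out to be the 14.0 runtime rather than the path:

```
# Assumes a Databricks notebook where `spark` and `dbutils` are predefined.
df = (spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker.example.com:9093")
      .option("subscribe", "my_topic")
      .option("kafka.security.protocol", "SSL")
      # Local-file path via the FUSE mount, not a dbfs:/ or abfss:// URI:
      .option("kafka.ssl.keystore.location",
              "/dbfs/FileStore/Certs/client.keystore.jks")
      .option("kafka.ssl.keystore.password",
              dbutils.secrets.get(scope="certs", key="keystore-password"))
      .load())
```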
Dhana
by New Contributor
  • 372 Views
  • 0 replies
  • 0 kudos

Databricks and Kafka connectivity

I am trying to read data from Kafka, which is installed on my local system. I am using Databricks Community Edition with a cluster version of 12.2. However, I am unable to read data from Kafka. My use case is to read data from Kafka installed on my l...

AdamRink
by New Contributor III
  • 1088 Views
  • 3 replies
  • 0 kudos

Resolved! Apply Avro defaults when writing to Confluent Kafka

I have an Avro schema for my Kafka topic, and that schema has defaults. I would like to exclude the defaulted columns from Databricks and just let them default to an empty array. Sample Avro below; I am trying to not provide the UserFields because I can't...

Latest Reply
Kaniz
Community Manager

Hi @Adam Rink, please go through the following doc and let me know if it helps: https://docs.databricks.com/spark/latest/structured-streaming/avro-dataframe.html#example-with-schema-registry

  • 0 kudos
2 More Replies
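
The linked page covers from_avro/to_avro. As background for the accepted approach: Avro defaults are applied by the reader, so writing with a schema that omits UserFields lets consumers who decode with the full schema see its default (an empty array). A hedged sketch with an illustrative schema, topic, and data; note that plain to_avro output lacks the Confluent wire-format header that registry-aware consumers expect:

```
# Assumes a session where `spark` is predefined (e.g. a Databricks notebook).
from pyspark.sql.functions import struct
from pyspark.sql.avro.functions import to_avro

# Writer schema that deliberately omits the defaulted UserFields column.
writer_schema = """
{
  "type": "record", "name": "Event",
  "fields": [
    {"name": "id",   "type": "string"},
    {"name": "body", "type": "string"}
  ]
}
"""

df = spark.createDataFrame([("1", "hello")], "id string, body string")

(df.select(to_avro(struct("id", "body"), writer_schema).alias("value"))
 .write
 .format("kafka")
 .option("kafka.bootstrap.servers", "broker.example.com:9092")  # placeholder
 .option("topic", "events")                                     # placeholder
 .save())
```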
kris08
by New Contributor
  • 807 Views
  • 1 reply
  • 0 kudos

Kafka consumer groups in Databricks

I was trying to find information about configuring consumer groups for a Kafka stream in Databricks. By doing so I want to parallelize the stream and load it into Databricks tables. Does Databricks handle this internally? If we can configure th...

Latest Reply
Debayan
Esteemed Contributor III

Hi, we have a few examples of stream processing with Kafka (https://docs.databricks.com/structured-streaming/kafka.html), but there is no public document specifically for Kafka consumer group creation. You can refer to https://kafka.apache.org/documentation...

  • 0 kudos
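
To flesh out the reply: Structured Streaming assigns each query its own consumer group and parallelizes by topic partition, not by consumer-group fan-out, so partition count is the lever for parallel loads. The source does expose two documented group-id knobs, sketched here with placeholder names:

```
# Assumes a notebook where `spark` is predefined; broker/topic are placeholders.
df = (spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker.example.com:9092")
      .option("subscribe", "events")
      # Prefix for the auto-generated consumer group id (one group per query):
      .option("groupIdPrefix", "dbx-ingest")
      # Or pin an explicit group id (use with care; ids must not clash):
      # .option("kafka.group.id", "dbx-ingest-group")
      .load())
```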
SS0201
by New Contributor II
  • 1962 Views
  • 4 replies
  • 0 kudos

Slow updates/upserts in Delta tables

When using Delta tables with DBR jobs or even with DLT pipelines, upserts (especially updates) on key and timestamp are taking much longer than expected to update the files/table data (~2 minutes even for a single-record poll). Inserts are lightni...

Latest Reply
Anonymous
Not applicable

Hi @Surya Agarwal, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so...

  • 0 kudos
3 More Replies
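
Single-row MERGEs are usually dominated by file scanning and rewriting rather than the update itself. Two common mitigations, shown as a hedged sketch with placeholder table and column names (not a confirmed fix from this thread): add a pruning predicate to the merge condition, and Z-order the table by the merge key.

```
# Assumes a session where `spark` is predefined and the `events` Delta table exists.
from delta.tables import DeltaTable

# Placeholder micro-batch of changed rows (normally arriving from the stream).
updates_df = spark.createDataFrame(
    [("k1", "2024-01-01", 42)], "key string, event_date string, amount int")

target = DeltaTable.forName(spark, "events")

(target.alias("t")
 .merge(updates_df.alias("s"),
        # The extra date predicate lets Delta skip files outside the window
        # instead of scanning the whole table on every poll.
        "t.key = s.key AND t.event_date = s.event_date")
 .whenMatchedUpdateAll()
 .whenNotMatchedInsertAll()
 .execute())

# One-off maintenance: cluster the data files by the merge key.
spark.sql("OPTIMIZE events ZORDER BY (key)")
```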
ramravi
by Contributor II
  • 1312 Views
  • 3 replies
  • 4 kudos

Issue while reading data from a Kafka topic into Spark Structured Streaming

py4j.security.Py4JSecurityException: Method public org.apache.spark.sql.streaming.DataStreamReader org.apache.spark.sql.SQLContext.readStream() is not whitelisted on class class org.apache.spark.sql.SQLContext
I already disabled ACLs for the cluster using "...

Latest Reply
jose_gonzalez
Moderator

Hi @Ravi Teja, just a friendly follow-up. Do you still need help? If you do, please share more details, like the DBR version and whether it is a standard or high-concurrency cluster, etc.

  • 4 kudos
2 More Replies
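
For context on the error above: Py4JSecurityException typically appears on clusters with table ACLs enabled, which block JVM methods that are not whitelisted, and SQLContext.readStream() is one of those. A hedged first step (not a confirmed fix from the thread) is to go through the SparkSession entry point instead:

```
# Assumes a Databricks notebook where `spark` (the SparkSession) is predefined.
df = (spark.readStream   # SparkSession.readStream, not sqlContext.readStream()
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker.example.com:9092")  # placeholder
      .option("subscribe", "events")                                 # placeholder
      .load())
```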
dheeraj2444
by New Contributor II
  • 1264 Views
  • 4 replies
  • 0 kudos

I am trying to write a data frame to a Kafka topic with an Avro schema for key and value using a schema registry URL. The to_avro function is not writing t...

I am trying to write a data frame to a Kafka topic with an Avro schema for the key and value, using a schema registry URL. The to_avro function is not writing to the topic and is throwing an exception with code 40403. Is there an alternate way to do thi...

Latest Reply
Debayan
Esteemed Contributor III

Hi, could you please refer to https://github.com/confluentinc/kafka-connect-elasticsearch/issues/59 and let us know if this helps?

  • 0 kudos
3 More Replies
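
For readers decoding the number: in Confluent Schema Registry, error 40403 means the schema was not found for the requested subject/version, so checking what the registry actually holds is a reasonable first step. A hedged sketch against the registry's documented REST API; the URL, subject name, and schema are placeholders:

```
import json
import requests

registry = "https://schema-registry.example.com"  # placeholder URL
subject = "events-value"                          # placeholder subject

# List registered subjects to confirm the name matches what to_avro targets.
print(requests.get(f"{registry}/subjects").json())

# Register a value schema for the subject (a no-op if it is already there).
schema = {"type": "record", "name": "Event",
          "fields": [{"name": "id", "type": "string"}]}
resp = requests.post(
    f"{registry}/subjects/{subject}/versions",
    headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
    data=json.dumps({"schema": json.dumps(schema)}),
)
print(resp.json())  # -> {"id": <global schema id>}
```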
julie
by New Contributor III
  • 2011 Views
  • 5 replies
  • 3 kudos

Resolved! Scope creation in Databricks or Confluent?

Hello, I am a newbie in this field and am trying to access a Confluent Kafka stream in Azure Databricks, based on a beginner's video by Databricks. I have a free trial Databricks cluster right now. When I run the notebook below, it errors out on line 5 o...

Latest Reply
Hubert-Dudek
Esteemed Contributor III

For testing, create it without a secret scope. It will be unsafe, but you can paste secrets as plain strings in the notebook while testing. Here is the code which I used for loading data from Confluent:
inputDF = (spark
  .readStream
  .format("kafka")
  .option("kafka.b...

  • 3 kudos
4 More Replies
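
Once past testing, the safer counterpart to hard-coded strings is a secret scope read via dbutils.secrets.get. A sketch with placeholder scope, key, broker, and topic names; the kafkashaded JAAS class name reflects how Databricks runtimes shade the Kafka client:

```
# Assumes a Databricks notebook where `spark` and `dbutils` are predefined.
api_key = dbutils.secrets.get(scope="confluent", key="api-key")
api_secret = dbutils.secrets.get(scope="confluent", key="api-secret")

jaas = (
    "kafkashaded.org.apache.kafka.common.security.plain.PlainLoginModule "
    f'required username="{api_key}" password="{api_secret}";'
)

inputDF = (spark.readStream
           .format("kafka")
           .option("kafka.bootstrap.servers", "<confluent-bootstrap>:9092")
           .option("kafka.security.protocol", "SASL_SSL")
           .option("kafka.sasl.mechanism", "PLAIN")
           .option("kafka.sasl.jaas.config", jaas)
           .option("subscribe", "my_topic")  # placeholder topic
           .load())
```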
aiwithqasim
by Contributor
  • 1317 Views
  • 2 replies
  • 4 kudos

Resolved! How to connect Kafka configured on your PC with Databricks?

I'm working on a case to configure Kafka that is installed on my machine (laptop), and I want to connect it with my Databricks account hosted on AWS. Secondly, I have CSV files that I want to use for real-time processing from Kafka to Databri...

Latest Reply
Hubert-Dudek
Esteemed Contributor III

For CSV, you just need to readStream in the notebook and append the output to CSV using the foreachBatch method. Your Kafka on the PC needs to have a public address, or you need to set up an AWS VPN and connect from your laptop so that you are in the same VPC as Databricks.

  • 4 kudos
1 More Replies
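
A minimal sketch of the reply's recipe, with placeholder host, topic, and paths: read the Kafka stream and append each micro-batch to CSV through foreachBatch.

```
# Assumes a Databricks notebook where `spark` is predefined.
def write_batch(batch_df, batch_id):
    # Called once per micro-batch; appends the rows as CSV files.
    (batch_df.selectExpr("CAST(value AS STRING) AS value")
     .write.mode("append")
     .csv("dbfs:/FileStore/kafka_csv_out"))

query = (spark.readStream
         .format("kafka")
         .option("kafka.bootstrap.servers", "<public-host>:9092")
         .option("subscribe", "my_topic")
         .load()
         .writeStream
         .foreachBatch(write_batch)
         .option("checkpointLocation", "dbfs:/FileStore/kafka_csv_chk")
         .start())
```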
architect
by New Contributor
  • 975 Views
  • 1 reply
  • 0 kudos

Does Databricks provide a mechanism to have rate limiting for receivers?

from pyspark.sql import SparkSession

scala_version = '2.12'
spark_version = '3.3.0'

packages = [
    f'org.apache.spark:spark-sql-kafka-0-10_{scala_version}:{spark_version}',
    'org.apache.kafka:kafka-clients:3.2.1'
]

spark = SparkSession.bui...

Latest Reply
Rajani
Contributor

Hi @Software Architect, I don't think so.

  • 0 kudos
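
One caveat to the reply: for the Structured Streaming Kafka source used in the question, there is a documented per-trigger cap, maxOffsetsPerTrigger, which bounds how many offsets each micro-batch consumes. A sketch with placeholder broker and topic:

```
df = (spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092")
      .option("subscribe", "my_topic")
      # At most ~10,000 offsets per micro-batch, split across partitions:
      .option("maxOffsetsPerTrigger", 10000)
      .load())
```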
Ajay-Pandey
by Esteemed Contributor III
  • 931 Views
  • 2 replies
  • 9 kudos

Kafka integration with Databricks

Hi all, I want to integrate Kafka with Databricks. If anyone can share any doc or code, it will help me a lot. Thanks in advance.

Latest Reply
Hubert-Dudek
Esteemed Contributor III

This is the code that I am using to read from Kafka:
inputDF = (spark
  .readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", host)
  .option("kafka.ssl.endpoint.identification.algorithm", "https")
  .option("kafka.sasl.mechanism", "PLAIN")
  .option("ka...

  • 9 kudos
1 More Replies
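
The snippet above is cut off; below is a hedged completion showing the remaining options such a SASL/SSL read typically needs. The credentials, topic, and `host` are placeholders, and the original post's options past the truncation are unknown.

```
# Assumes a Databricks notebook; `host` stands in for the bootstrap servers.
jaas = (
    "kafkashaded.org.apache.kafka.common.security.plain.PlainLoginModule "
    'required username="<API_KEY>" password="<API_SECRET>";'
)

inputDF = (spark.readStream
           .format("kafka")
           .option("kafka.bootstrap.servers", host)
           .option("kafka.ssl.endpoint.identification.algorithm", "https")
           .option("kafka.sasl.mechanism", "PLAIN")
           .option("kafka.security.protocol", "SASL_SSL")
           .option("kafka.sasl.jaas.config", jaas)
           .option("subscribe", "<topic>")
           .load())

display(inputDF.selectExpr("CAST(value AS STRING) AS value"))
```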
AdamRink
by New Contributor III
  • 1074 Views
  • 2 replies
  • 6 kudos

How to limit batch size from Confluent Kafka

I have a large stream of data read from Confluent Kafka, 500+ million rows. When I initialize the stream I cannot control the batch sizes that are read. I've tried setting options on the readStream: maxBytesPerTrigger, maxOffsetsPerTrigger, fetc...

Latest Reply
UmaMahesh1
Honored Contributor III

Hi @Adam Rink, just checking for further info on your question. How are you deducing that the batch sizes are larger than what you are providing as maxOffsetsPerTrigger?

  • 6 kudos
1 More Replies
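
One empirical way to answer the follow-up: each trigger's input size is reported as numInputRows in the query progress, which can be compared directly against maxOffsetsPerTrigger. A sketch with placeholder broker, topic, and checkpoint path:

```
# Assumes a notebook where `spark` is predefined.
query = (spark.readStream
         .format("kafka")
         .option("kafka.bootstrap.servers", "broker.example.com:9092")
         .option("subscribe", "big_topic")
         .option("maxOffsetsPerTrigger", 100000)
         .load()
         .writeStream
         .format("noop")  # discards rows; convenient for measurement
         .option("checkpointLocation", "dbfs:/tmp/measure_chk")
         .start())

# After a few triggers have run:
for progress in query.recentProgress:
    print(progress["batchId"], progress["numInputRows"])
```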
Jayanth746
by New Contributor III
  • 3072 Views
  • 2 replies
  • 2 kudos

Databricks <-> Kafka - SSL handshake failed

I am receiving an SSL handshake error even though the trust store I created is based on the server certificate, and the fingerprint in the certificate matches the trust-store fingerprint.
kafkashaded.org.apache.kafka.common.errors.SslAuthenticationExcept...

Latest Reply
Debayan
Esteemed Contributor III

Hi @Jayanth Goulla, worth a try: https://stackoverflow.com/questions/54903381/kafka-failed-authentication-due-to-ssl-handshake-failed. Did you follow https://docs.microsoft.com/en-us/azure/databricks/spark/latest/structured-streaming/kafka?

  • 2 kudos
1 More Replies
JesseLancaster
by New Contributor III
  • 1585 Views
  • 7 replies
  • 5 kudos

Hello, I'm trying to use Databricks on Azure with a Spark structured streaming job and am having a very mysterious issue. I boiled the job down to i...

Hello, I'm trying to use Databricks on Azure with a Spark Structured Streaming job and am having a very mysterious issue. I boiled the job down to its basics for testing: reading from a Kafka topic and writing to the console in a foreachBatch. On local, ever...

Latest Reply
Kaniz
Community Manager

Hi @JESSE LANCASTER, we haven't heard from you since my last response, and I was checking back to see if my suggestions helped you. Otherwise, if you have any solution, please share it with the community, as it can be helpful to others. Als...

  • 5 kudos
6 More Replies