Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Ravikumashi
by Contributor
  • 4914 Views
  • 8 replies
  • 0 kudos

Failed to initialise azure-event-hub with Azure AD (service principal)

We have been trying to authenticate to Azure Event Hubs with Azure AD (service principal) instead of a shared access key (connection string) and then read events, but the Event Hubs client fails to initialise and throws a no such method ex...

Error message full
Latest Reply
Ravikumashi
Contributor
  • 0 kudos

@swathi-dataops I added ServicePrincipalCredentialsAuth and ServicePrincipalAuthBase as normal classes and packaged them as part of my project jar, instead of creating a separate jar for these two classes. I then used the code below for configuring...

7 More Replies
sparkstreaming
by New Contributor III
  • 4272 Views
  • 5 replies
  • 4 kudos

Resolved! Missing rows while processing records using foreachbatch in spark structured streaming from Azure Event Hub

I am new to real-time scenarios and need to create a Spark Structured Streaming job in Databricks. I am trying to apply rule-based validations, driven by backend configurations, to each incoming JSON message. I need to do the following actions on th...

Latest Reply
Rishi045
New Contributor III
  • 4 kudos

Were you able to find a solution? If so, could you please share it?

4 More Replies
Swaroop
by New Contributor
  • 604 Views
  • 0 replies
  • 0 kudos

How to receive data from Azure Event Hub in Parquet?

import asyncio
import os
from azure.eventhub.aio import EventHubConsumerClient

CONNECTION_STR = "Connection_string"
EVENTHUB_NAME = "event_hub"

async def on_event(partition_context, event):
    # Put your code here.
    # If the operation is i/o intensive, ...

Sandesh87
by New Contributor III
  • 3176 Views
  • 4 replies
  • 2 kudos

spark-streaming read from specific event hub partition

The Azure event hub "my_event_hub" has a total of 5 partitions ("0", "1", "2", "3", "4"). The readStream should only read events from partitions "0" and "4". Event hub configuration as streaming source:
val name = "my_event_hub"
val connectionString = "m...

Latest Reply
keshav
New Contributor II
  • 2 kudos

I tried using the snippet below to receive messages only from partition id=0:
ehName = "<<EVENT-HUB-NAME>>"
# Create event position for partition 0
positionKey1 = { "ehName": ehName, "partitionId": 0 }
eventPosition1 = { "offset": "@latest", ...

3 More Replies
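
For readers following this thread, here is a minimal sketch of how a per-partition starting-position map might be assembled in Python for the Event Hubs Spark connector. It assumes the position-key and event-position shapes visible in the reply above, and that the connector accepts the JSON-encoded map under an "eventhubs.startingPositions" option. The helper name build_starting_positions, the seqNo, enqueuedTime and isInclusive values, and the eh_conf dictionary are illustrative assumptions, not code from the thread. This only sets where reading begins for the listed partitions. Whether it restricts the stream to those partitions is not confirmed here.

```python
import json


def build_starting_positions(eh_name, partition_ids, offset="@latest"):
    """Build a JSON string mapping each partition key to a starting event position.

    Hypothetical helper: the field names follow the snippet quoted in the thread;
    the seqNo, enqueuedTime, and isInclusive values are assumed placeholders.
    """
    position_map = {}
    for partition_id in partition_ids:
        # Each key is itself a JSON-encoded {"ehName", "partitionId"} object.
        key = json.dumps({"ehName": eh_name, "partitionId": partition_id})
        position_map[key] = {
            "offset": offset,
            "seqNo": -1,           # assumed: not used when an offset is given
            "enqueuedTime": None,  # assumed: not used when an offset is given
            "isInclusive": True,
        }
    return json.dumps(position_map)


# Example for the thread's scenario: partitions 0 and 4 of "my_event_hub".
eh_conf = {}
eh_conf["eventhubs.startingPositions"] = build_starting_positions("my_event_hub", [0, 4])
```

The resulting eh_conf dictionary would then be passed as options to the stream reader alongside the connection string.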
Phani1
by Valued Contributor
  • 4490 Views
  • 2 replies
  • 0 kudos

Execute tasks in parallel to process multiple files in parallel

Hi all, if we have multiple tasks under a job, how do we invoke a specific task? Is there an API to invoke a specific task within a job rather than the whole job? Use case: when we receive multiple messages from the event hub, each underlying task in ...

Latest Reply
Phani1
Valued Contributor
  • 0 kudos

Thanks for your response. My question is: if we have multiple tasks in a job, how can we invoke a specific task? I can see an API to invoke the job, but not a particular task in it. Kindly find the attachment for your reference.

1 More Replies
RengarLee
by Contributor
  • 3050 Views
  • 5 replies
  • 0 kudos

Resolved! How to improve Spark Streaming writer Input Rate and Processing rate?

Hi! I have many questions about Spark Streaming and Event Hubs. Can you help me? Q1: How to improve Spark Streaming writer Input Rate and Processing rate? I connect to Azure Event Hubs using Spark Streaming (Azure Databricks), but I found that if I use display, this ...

Latest Reply
RengarLee
Contributor
  • 0 kudos

My problem is that the value passed to setMaxEventsPerTrigger does not match numInputRows.

4 More Replies
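
As a rough aid for the question in this thread, the sketch below compares the row count of one micro-batch against a configured per-trigger cap, and the input rate against the processing rate. It assumes a progress dictionary shaped like the one PySpark returns from a streaming query's lastProgress, with the numInputRows, inputRowsPerSecond and processedRowsPerSecond fields. The function name summarize_progress and the sample values are hypothetical, and the "falling behind" check is only a heuristic.

```python
def summarize_progress(progress, max_events_per_trigger):
    """Summarize one micro-batch progress record against a per-trigger cap.

    `progress` is assumed to look like a streaming query's lastProgress dict.
    """
    num_rows = progress["numInputRows"]
    # Rates can be missing or None on the first batch, so fall back to 0.0.
    input_rate = progress.get("inputRowsPerSecond") or 0.0
    processed_rate = progress.get("processedRowsPerSecond") or 0.0
    return {
        "num_input_rows": num_rows,
        # True when the batch read at least as many rows as the configured cap.
        "batch_hit_cap": num_rows >= max_events_per_trigger,
        # Heuristic: rows arriving faster than they are processed.
        "falling_behind": input_rate > processed_rate,
    }


# Hypothetical values for illustration only, not measurements from the thread.
sample_progress = {
    "numInputRows": 800,
    "inputRowsPerSecond": 5000.0,
    "processedRowsPerSecond": 1200.0,
}
summary = summarize_progress(sample_progress, max_events_per_trigger=1000)
```

A batch that stays below the cap while the input rate exceeds the processing rate would suggest the cap is not the limiting factor, though that reading depends on the assumptions above.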
Erik
by Valued Contributor II
  • 2796 Views
  • 6 replies
  • 8 kudos

Expected latency / batch duration for a simple streaming job?

What are "reasonable"/"normal" batch durations for simple streaming jobs (no real processing, just adding a few simple fields) into/from Delta Lake? We have set up a simple test case here where we are streaming from an Azure event hub generating a new mes...

Latest Reply
Kaniz_Fatma
Community Manager
  • 8 kudos

Hi @Erik Parmann, just a friendly follow-up. Do you still need help, or did my response help you find the solution? Please let us know.

5 More Replies
Jreco
by Contributor
  • 6569 Views
  • 14 replies
  • 3 kudos

Event hub streaming improve processing rate

Hi all, I'm working with Event Hubs and Databricks to process and enrich data in real time. Doing a "simple" test, I'm getting some weird values (input rate vs processing rate) and I think I'm losing data. As you can see, there is a peak with 5k record...

Latest Reply
jose_gonzalez
Moderator
  • 3 kudos

Hi @Jhonatan Reyes, how many Event Hubs partitions are you reading from? Your micro-batch takes a few milliseconds to complete, which I think is a good time, but I would like to understand better what you are trying to improve here. Also, in this case ...

13 More Replies