
How to get data from Splunk on a daily basis?

Arch_dbxlearner
New Contributor III

I am looking for ways to get data into Databricks from Splunk (similar to other data sources like S3, Kafka, etc.). I received a suggestion to use the Databricks add-on to move data to/from Splunk. Pulling data from Databricks into Splunk is easy by setting up this add-on on the Splunk side.

But for pushing data from Splunk to Databricks, I can't find any documentation on setting up the add-on. If anyone can help me with the procedure for setting up this add-on on the Databricks side, it will help me proceed. I have also found another procedure to pull data from Splunk into Databricks via a GitHub document - here

The plan is to send data from Splunk to Databricks on a daily basis and build dashboards on top of that data. Because it is a daily feed, the volume could be high, so I would like to know the limitations of the respective tools for sending data at that volume.

I checked the Databricks documentation, but I could not find any information about communicating with Splunk.

Could anyone please help me find the best way to send Splunk data to Databricks?

5 REPLIES

Arch_dbxlearner
New Contributor III

Hi @shan_chandra,

I have already gone through the post you shared above. It mentions that the add-on is bi-directional, so communication between Splunk and Databricks is possible.

My requirement is to send data from Splunk to Databricks. I need only one-directional movement, where the Splunk data is consumed in Databricks and all further processing happens on Databricks.

So my doubt is where the add-on should be installed. I am going to push data from Splunk to Databricks. I am aware that this requires HEC (HTTP Event Collector), but where, ideally, should the Databricks add-on be placed?

The name says it is the "Databricks add-on for Splunk". I would like to know the process to set up this add-on to push data only from Splunk to Databricks.

Could you please help me with this?

shan_chandra
Databricks Employee

@Arch_dbxlearner - we can limit the user's access so that it can only read data from Splunk into Databricks. Please refer to the link below.

https://github.com/databrickslabs/splunk-integration/blob/master/docs/markdown/Splunk%20DB%20Connect...

In my experience with the Splunk add-on, it is typically used to pull Databricks data into Splunk, not to push. If the data sets are small, it could probably push as well, but I think you'd have to write some sort of Splunk map loop to issue INSERT statements against Databricks.
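To make that concrete, here is a minimal Python sketch of what such a push path could look like, assuming the events have already been exported from Splunk (e.g. by a scheduled search) and using the databricks-sql-connector to issue one INSERT per event. The workspace hostname, HTTP path, token, and table name are placeholders, not values from the add-on documentation.

```python
# Hypothetical sketch: row-by-row INSERTs against Databricks for events
# already exported from Splunk. Requires: pip install databricks-sql-connector
from databricks import sql

# Placeholder events, as a scheduled Splunk export might produce them.
events = [
    {"ts": "2024-01-01T00:00:00", "host": "host-a", "raw": "sample event"},
    {"ts": "2024-01-01T00:01:00", "host": "host-b", "raw": "another event"},
]

with sql.connect(
    server_hostname="<workspace-host>.cloud.databricks.com",  # placeholder
    http_path="/sql/1.0/warehouses/<warehouse-id>",           # placeholder
    access_token="<personal-access-token>",                   # placeholder
) as connection:
    with connection.cursor() as cursor:
        # One INSERT per event, mirroring the "map loop" idea above.
        # Workable for small batches; too slow for high daily volumes.
        for event in events:
            cursor.execute(
                "INSERT INTO main.default.splunk_events (ts, host, raw) "
                "VALUES (:ts, :host, :raw)",
                event,
            )
```

As the comment notes, per-row INSERTs are only plausible for small data sets; a daily high-volume feed would want a bulk path instead.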

It would probably be more manageable to use this pull-based approach: https://github.com/databrickslabs/splunk-integration/blob/master/docs/markdown/Databricks%20-%20Pull....
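For illustration, a rough Python sketch of that pull pattern from a Databricks notebook might look like the following: it calls Splunk's /services/search/jobs/export REST endpoint with the requests library and appends the results to a Delta table. The Splunk host, service account, index, and table name are all placeholder assumptions, not values taken from the linked document.

```python
# Hypothetical sketch of the pull approach, run inside a Databricks notebook.
import json
import requests

EXPORT_URL = "https://splunk.example.com:8089/services/search/jobs/export"

resp = requests.post(
    EXPORT_URL,
    auth=("svc_databricks", "<password>"),  # placeholder service account
    data={
        # Yesterday's events only, matching a daily batch schedule.
        "search": "search index=main earliest=-1d@d latest=@d",
        "output_mode": "json",
    },
    stream=True,
)
resp.raise_for_status()

# The export endpoint streams one JSON object per line; event lines carry
# the payload under the "result" key, other lines are metadata.
rows = []
for line in resp.iter_lines():
    if line:
        obj = json.loads(line)
        if "result" in obj:
            rows.append(obj["result"])

# `spark` is predefined in Databricks notebooks; append the daily batch.
df = spark.createDataFrame(rows)
df.write.mode("append").saveAsTable("main.default.splunk_events")
```

Scheduling this notebook as a daily Databricks job would match the once-a-day requirement without installing anything on the Splunk side.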

This may also provide guidance: https://registry.terraform.io/modules/databricks/examples/databricks/latest/examples/adb-splunk

Another idea (if you need to do small lookups, not bulk transfers): what about using Splunk's splunk-sdk to create a notebook function that queries Splunk via the REST API?
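A minimal sketch of that helper, assuming splunk-sdk (the splunklib package) and placeholder connection details:

```python
# Hypothetical lookup helper for a Databricks notebook.
# Requires: pip install splunk-sdk
import splunklib.client as client
import splunklib.results as results

def splunk_lookup(query: str) -> list[dict]:
    """Run a one-shot Splunk search and return the events as dicts."""
    service = client.connect(
        host="splunk.example.com",   # placeholder
        port=8089,
        username="svc_databricks",   # placeholder
        password="<password>",       # placeholder
    )
    stream = service.jobs.oneshot(query, output_mode="json")
    # JSONResultsReader yields dicts for events and Message objects for
    # diagnostics; keep only the events.
    return [row for row in results.JSONResultsReader(stream)
            if isinstance(row, dict)]

# Example: a small enrichment lookup, not a bulk transfer.
recent_errors = splunk_lookup("search index=main ERROR | head 100")
```

This keeps the data movement on demand and avoids standing up any push infrastructure, at the cost of being unsuitable for large volumes.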
