Get Started Discussions
Start your journey with Databricks by joining discussions on getting started guides, tutorials, and introductory topics. Connect with beginners and experts alike to kickstart your Databricks experience.

What are the different ways to pull the log data from Splunk to Databricks?

Arch_dbxlearner
New Contributor III

Hi,

I have recently started working on the Splunk integration with Databricks. Basically, I am trying to ingest data from Splunk into Databricks. I have gone through the documentation on the Splunk integration; it covers some basic information, but I am looking for details that are not available in the document.

I would like to know the possible ways to ingest data from Splunk.

- Can we send log data directly from Splunk to Databricks?
- Are any intermediate tools/APIs required for the communication? If so, what are the possible tools/APIs?
- Splunk has event data and metric data. Is it possible for Databricks to pick up both of these data types?

Could anyone please help me out with these queries?

 

10 REPLIES

Kaniz
Community Manager

Hi @Arch_dbxlearner,

As for your other questions:

  • You can send log data directly from Splunk to Databricks without intermediate tools or APIs. However, this may be inefficient or unreliable if you have large volumes of log data or complex transformations.
  • Yes, there are two types of data you can pull from Splunk: event data and metric data. Event data are records of what happened in your system or application over time; metric data are measurements of the health or performance of your system or application at a specific point in time.
  • Yes, it is possible to pull both event data and metric data into Databricks using the same methods as above: the Databricks Add-on for Splunk, an intermediate tool or API, or a third-party service or tool.
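To make the pull path concrete, here is a minimal sketch of querying Splunk's REST export endpoint for both data types from Python. The host, token, and index names are placeholders you would replace with your own:

```python
import json
import urllib.parse
import urllib.request

SPLUNK_HOST = "https://splunk.example.com:8089"  # placeholder management-port URL
SPLUNK_TOKEN = "<splunk-auth-token>"             # placeholder authentication token

def export_params(query: str) -> dict:
    """Form parameters for /services/search/jobs/export. Plain searches need
    the leading 'search' keyword; generating commands such as mstats already
    start with '|' and are passed through unchanged."""
    q = query if query.lstrip().startswith("|") else f"search {query}"
    return {"search": q, "output_mode": "json"}

def run_export(query: str):
    """Stream search results from Splunk, one JSON object per line."""
    data = urllib.parse.urlencode(export_params(query)).encode()
    req = urllib.request.Request(
        f"{SPLUNK_HOST}/services/search/jobs/export",
        data=data,
        headers={"Authorization": f"Bearer {SPLUNK_TOKEN}"},
    )
    with urllib.request.urlopen(req) as resp:
        for line in resp:
            yield json.loads(line)

# Event data: an ordinary search over an events index.
#   run_export('index=main sourcetype=access_combined earliest=-1h')
# Metric data: an mstats query over a metrics index.
#   run_export('| mstats avg(_value) WHERE index=my_metrics span=1m BY metric_name')
```

The same endpoint serves both kinds of query, which is why both data types can land in Databricks through one code path.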

I hope this helps you understand how to integrate Splunk with Databricks better. If you have any more questions, feel free to ask.

Thank you @Kaniz for the clear explanation.

I have another set of questions. Please provide your suggestion on these as well.

  • I have come across a tool called "OpenTelemetry", which collects logs, metrics, etc. Can this tool be used as an intermediary between Databricks and Splunk?
  • Instead of having Splunk collect the event data from our application, is there any way to send the application/system logs directly to Databricks?
  • Is it possible to send the complete raw data and/or custom-filtered data from the application to Databricks?

Thank you!

Thank you, @Kaniz.

Currently I am planning to explore the possible ways to send sample data from Splunk to Databricks without any third-party tools.

Let me play around with those and get back to you if I need any guidance at any place.

Thanks once again!

Arch_dbxlearner
New Contributor III

@Kaniz 

You have mentioned that the Databricks Add-on for Splunk is bidirectional. Do we need to install this app on Databricks itself to fetch the data from Splunk?

I tried to find this add-on on the Databricks Marketplace but could not. Can you please let me know the process to install the add-on?

I am looking to push data from Splunk to Databricks and run some processing on a daily basis. Could you please advise me on this?

Hi @Kaniz 

Can you please guide me on this?

Hi @Arch_dbxlearner, You can send data from Splunk to Databricks without using any third-party tools by leveraging the Databricks Add-on for Splunk. This add-on is a bidirectional connector that allows you to run queries and execute actions in Databricks from within Splunk. It can also push data to Splunk via its HTTP Event Collector (HEC).

 

Here are the steps you can follow:

 

Install the Databricks Add-on for Splunk: You can find the add-on on the Databricks Labs GitHub page. Follow the installation instructions provided.

Configure the Add-on: After installing the add-on, you’ll need to configure it to connect to your Databricks workspace. This typically involves providing your Databricks workspace URL and access token.

Send Data: Once the add-on is installed and configured, you can use it to send data from Splunk to Databricks. This can be done by running queries, notebooks, or jobs in Databricks from within Splunk.

 

Please note that while this method doesn’t require any third-party tools, it does require the installation and configuration of the Databricks Add-on for Splunk in your Splunk environment.
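For the reverse direction mentioned above (pushing results from Databricks to Splunk via the HTTP Event Collector), here is a minimal sketch; the HEC URL and token are placeholders:

```python
import json
import urllib.request
from typing import Optional

HEC_URL = "https://splunk.example.com:8088/services/collector/event"  # placeholder
HEC_TOKEN = "<hec-token>"                                             # placeholder

def hec_payload(event: dict, sourcetype: str = "_json",
                index: Optional[str] = None) -> dict:
    """Wrap a single event in the envelope that HEC expects."""
    payload = {"event": event, "sourcetype": sourcetype}
    if index is not None:
        payload["index"] = index
    return payload

def send_to_hec(event: dict) -> None:
    """POST one event to Splunk's HTTP Event Collector."""
    req = urllib.request.Request(
        HEC_URL,
        data=json.dumps(hec_payload(event)).encode(),
        headers={"Authorization": f"Splunk {HEC_TOKEN}",
                 "Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)

# Example: report a daily job's status back to Splunk.
# send_to_hec({"job": "daily_summary", "status": "succeeded"})
```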

 

If you encounter any issues or need further assistance, feel free to ask. I’m here to help! 😊

Hi @Kaniz 

I have gone through the GitHub page for the Databricks-Splunk integration. The architecture diagram mentions three sections:

  1. Setting up Databricks add-on for Splunk
  2. Configuring Splunk DB Connect app
  3. Creating a Notebook to push and pull data from Splunk

My requirement is only to fetch data from Splunk and put it in Databricks to do analysis and create dashboards. So I assumed that, for my use case, the 3rd option is the method to follow, and I have followed the GitHub page - here.

I have installed the Databricks CLI and created a secret scope to store the Splunk credentials. Now I am working on the Notebook part, creating Python code to fetch the data.

  • Am I going the right way?
  • Is it sufficient to follow and set up only the 3rd method if I need only one-way communication, from Splunk to Databricks?
  • Also, I am referring to this document here for the Python script of my Notebook. Can I use this?

Could you please guide me on this to proceed further?
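For reference, this is the shape of the Notebook code I am planning. The secret scope, key, and table names below are placeholders, and the commented lines only run inside a Databricks notebook:

```python
import json

# Inside the notebook, the Splunk credentials come from the secret scope
# created with the Databricks CLI (scope/key names are placeholders):
#   splunk_token = dbutils.secrets.get(scope="splunk", key="token")

def extract_results(json_lines):
    """Keep only the rows of a Splunk JSON export stream that carry a search
    result; the stream also interleaves preview/metadata rows that have no
    'result' key."""
    records = []
    for line in json_lines:
        row = json.loads(line) if isinstance(line, (str, bytes)) else line
        if "result" in row:
            records.append(row["result"])
    return records

# records = extract_results(<lines streamed from Splunk's REST export endpoint>)
# df = spark.createDataFrame(records)                   # DataFrame for analysis
# df.write.mode("append").saveAsTable("splunk_events")  # table feeding the dashboard
```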

@Kaniz Can you please guide me on this?

Hi @Kaniz 

I am still awaiting your response on this. Can you please go through my reply above and guide me accordingly?

Thank you!