
How to push Cluster Logs to Elasticsearch?

User15813097110
New Contributor III
 

We can use the steps below to push cluster logs to Elasticsearch:

1. Download the log4j-elasticsearch-java-api repo and build the jar file:

git clone https://github.com/Downfy/log4j-elasticsearch-java-api.git
cd log4j-elasticsearch-java-api/
mvn clean install -Dmaven.test.skip=true

2. Go to the cluster's Libraries tab and upload the jar file (located at target/log4j-elasticsearch-1.0.0-RELEASE.jar). The jar is then saved to a DBFS location similar to:

dbfs:/FileStore/jars/9294d79f_8d33_4270_9a52_cc36c2651220-log4j_elasticsearch_1_0_0_RELEASE-970d7.jar

3. Zip the 30 dependency jar files under target/lib into a single file, dependency.zip, and copy it to DBFS. For example, you can use the Databricks CLI to upload the files:

dbfs mkdirs dbfs:/dilip/elkzip/
dbfs mkdirs dbfs:/dilip/elkjar/
dbfs cp Desktop/log4j-elasticsearch-java-api/target/dependency.zip dbfs:/dilip/elkzip/

4. Unzip the jar files to another DBFS location using the following notebook command:

%sh unzip /dbfs/dilip/elkzip/dependency.zip -d /dbfs/dilip/elkjar/

5. Run the following Python notebook command to create the init script (please change the file name and path as appropriate):

%python
dbutils.fs.put("/dilip/init-scripts/setLog4jProperties.sh","""
#!/bin/bash
set -e
cp /dbfs/FileStore/jars/9294d79f_8d33_4270_9a52_cc36c2651220-log4j_elasticsearch_1_0_0_RELEASE-970d7.jar /databricks/jars/
cp /dbfs/dilip/elkjar/*.jar /databricks/jars/
cat << EOF >> /databricks/spark/dbconf/log4j/driver/log4j.properties
# RootLogger
log4j.rootLogger=INFO,stdout,elastic
# Logging Threshold
log4j.threshold=ALL
#
# stdout
# Add *stdout* to root logger above if you want to use this
#
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} (%F:%M(%L)) - %m%n
# ElasticSearch log4j appender for application
log4j.appender.elastic=com.letfy.log4j.appenders.ElasticSearchClientAppender
log4j.appender.elastic.elasticHost=internal-vip-elasticsearh-int-dev-7645241416321.us-west-2.elb.amazonaws.com
log4j.appender.elastic.hostName=my_laptop
log4j.appender.elastic.applicationName=elkdemo
log4j.appender.elastic.elasticIndex=logging-elk
log4j.appender.elastic.elasticType=logging
EOF
""", True)

6. Go to the cluster's "Advanced Options" -> "Init Scripts" tab, then follow the steps in the section "Configure a cluster-scoped init script using the UI" of the following documentation (in our case, the path to the init script is dbfs:/dilip/init-scripts/setLog4jProperties.sh):

https://docs.databricks.com/clusters/init-scripts.html#cluster-scoped-init-scripts

7. Restart the cluster and check the driver logs. Logs should now be available in your Elasticsearch.
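
Once the cluster is back up, one way to confirm documents are landing is to query the index's `_count` endpoint. This is a sketch, not part of the original answer: it reuses the illustrative ELB hostname and the logging-elk index from the appender config above, and assumes port 9200 is reachable from wherever you run it.

```shell
# Build the _count URL for the index the appender writes to (host and index
# are the illustrative values from the log4j appender config; adjust to yours).
ES_HOST="internal-vip-elasticsearh-int-dev-7645241416321.us-west-2.elb.amazonaws.com"
ES_URL="http://${ES_HOST}:9200/logging-elk/_count?pretty"
echo "GET ${ES_URL}"
# curl -s "${ES_URL}"   # uncomment on a host with network access to the endpoint
```

A steadily growing count after the cluster restarts indicates the appender is shipping driver logs.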
