<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to push Cluster Logs to Elastic Search? in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/how-to-push-cluster-logs-to-elastic-search/m-p/26993#M18910</link>
    <description>&lt;P&gt;You can use the following steps to push cluster logs to Elasticsearch:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;1. Clone the log4j-elasticsearch-java-api repo and build the jar file:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;git clone &lt;A href="https://github.com/Downfy/log4j-elasticsearch-java-api.git" target="_blank"&gt;https://github.com/Downfy/log4j-elasticsearch-java-api.git&lt;/A&gt;
cd log4j-elasticsearch-java-api/
mvn clean install -Dmaven.test.skip=true&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;2. Go to the Libraries tab of the cluster and upload the jar file (located at /target/log4j-elasticsearch-1.0.0-RELEASE.jar). The jar file is then saved to a DBFS location similar to: &lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;dbfs:/FileStore/jars/9294d79f_8d33_4270_9a52_cc36c2651220-log4j_elasticsearch_1_0_0_RELEASE-970d7.jar&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;3. Zip the 30 dependent jar files under /target/lib into a single file, dependency.zip, and copy it to DBFS. For example, you can use the Databricks CLI to upload the files to DBFS:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;dbfs mkdirs dbfs:/dilip/elkzip/
dbfs mkdirs dbfs:/dilip/elkjar/
dbfs cp Desktop/log4j-elasticsearch-java-api/target/dependency.zip dbfs:/dilip/elkzip/&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;4. Unzip the jar files to another DBFS location using the following notebook command:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;%sh unzip /dbfs/dilip/elkzip/dependency.zip -d /dbfs/dilip/elkjar/&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;5. Run the following Python notebook command to create the init script (please change the file name and path as appropriate):&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;%python
dbutils.fs.put("/dilip/init-scripts/setLog4jProperties.sh","""
#!/bin/bash
set -e
cp /dbfs/FileStore/jars/9294d79f_8d33_4270_9a52_cc36c2651220-log4j_elasticsearch_1_0_0_RELEASE-970d7.jar /databricks/jars/
cp /dbfs/dilip/elkjar/*.jar /databricks/jars/
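# Optional sanity check (an added suggestion, not required): the jar file name
# above is specific to one upload and will differ in your workspace, so warn
# early if no appender jar was actually staged into /databricks/jars.
ls /databricks/jars/*log4j_elasticsearch* 1>/dev/null 2>&amp;1 || echo "WARNING: log4j-elasticsearch appender jar not found in /databricks/jars"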
cat &amp;lt;&amp;lt; EOF &amp;gt;&amp;gt; /databricks/spark/dbconf/log4j/driver/log4j.properties
# RootLogger
log4j.rootLogger=INFO,stdout,elastic
# Logging Threshold
log4j.threshold=ALL
#
# stdout
# Add *stdout* to root logger above if you want to use this
#
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} (%F:%M(%L)) - %m%n
# ElasticSearch log4j appender for application
log4j.appender.elastic=com.letfy.log4j.appenders.ElasticSearchClientAppender
log4j.appender.elastic.elasticHost=internal-vip-elasticsearh-int-dev-7645241416321.us-west-2.elb.amazonaws.com
log4j.appender.elastic.hostName=my_laptop
log4j.appender.elastic.applicationName=elkdemo
log4j.appender.elastic.elasticIndex=logging-elk
log4j.appender.elastic.elasticType=logging
EOF
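# Optional (an added suggestion): print the tail of the merged config so the
# appended appender settings are visible in the cluster's init-script logs.
tail -n 5 /databricks/spark/dbconf/log4j/driver/log4j.properties || true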
""", True)&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;6. Go to the cluster -&amp;gt; "Advanced Options"-&amp;gt;"Init Scripts", and then follow the steps outlined in section "Configure a cluster-scoped init script using the UI" in the following documentation (in our case, the path to the init-script is dbfs:/dilip/init-scripts/setLog4jProperties.sh)&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="https://docs.databricks.com/clusters/init-scripts.html#cluster-scoped-init-scripts" alt="https://docs.databricks.com/clusters/init-scripts.html#cluster-scoped-init-scripts" target="_blank"&gt;https://docs.databricks.com/clusters/init-scripts.html#cluster-scoped-init-scripts&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;7. Restart the cluster and check the driver logs. Logs should now be available on your Elastic Search.&lt;/P&gt;</description>
    <pubDate>Fri, 07 May 2021 14:52:09 GMT</pubDate>
    <dc:creator>User15813097110</dc:creator>
    <dc:date>2021-05-07T14:52:09Z</dc:date>
    <item>
      <title>How to push Cluster Logs to Elastic Search?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-push-cluster-logs-to-elastic-search/m-p/26992#M18909</link>
      <description />
      <pubDate>Fri, 07 May 2021 14:48:09 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-push-cluster-logs-to-elastic-search/m-p/26992#M18909</guid>
      <dc:creator>User15813097110</dc:creator>
      <dc:date>2021-05-07T14:48:09Z</dc:date>
    </item>
    <item>
      <title>Re: How to push Cluster Logs to Elastic Search?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-push-cluster-logs-to-elastic-search/m-p/26993#M18910</link>
      <description>&lt;P&gt;You can use the following steps to push cluster logs to Elasticsearch:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;1. Clone the log4j-elasticsearch-java-api repo and build the jar file:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;git clone &lt;A href="https://github.com/Downfy/log4j-elasticsearch-java-api.git" target="_blank"&gt;https://github.com/Downfy/log4j-elasticsearch-java-api.git&lt;/A&gt;
cd log4j-elasticsearch-java-api/
mvn clean install -Dmaven.test.skip=true&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;2. Go to the Libraries tab of the cluster and upload the jar file (located at /target/log4j-elasticsearch-1.0.0-RELEASE.jar). The jar file is then saved to a DBFS location similar to: &lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;dbfs:/FileStore/jars/9294d79f_8d33_4270_9a52_cc36c2651220-log4j_elasticsearch_1_0_0_RELEASE-970d7.jar&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;3. Zip the 30 dependent jar files under /target/lib into a single file, dependency.zip, and copy it to DBFS. For example, you can use the Databricks CLI to upload the files to DBFS:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;dbfs mkdirs dbfs:/dilip/elkzip/
dbfs mkdirs dbfs:/dilip/elkjar/
dbfs cp Desktop/log4j-elasticsearch-java-api/target/dependency.zip dbfs:/dilip/elkzip/&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;4. Unzip the jar files to another DBFS location using the following notebook command:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;%sh unzip /dbfs/dilip/elkzip/dependency.zip -d /dbfs/dilip/elkjar/&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;5. Run the following Python notebook command to create the init script (please change the file name and path as appropriate):&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;%python
dbutils.fs.put("/dilip/init-scripts/setLog4jProperties.sh","""
#!/bin/bash
set -e
cp /dbfs/FileStore/jars/9294d79f_8d33_4270_9a52_cc36c2651220-log4j_elasticsearch_1_0_0_RELEASE-970d7.jar /databricks/jars/
cp /dbfs/dilip/elkjar/*.jar /databricks/jars/
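# Optional sanity check (an added suggestion, not required): the jar file name
# above is specific to one upload and will differ in your workspace, so warn
# early if no appender jar was actually staged into /databricks/jars.
ls /databricks/jars/*log4j_elasticsearch* 1>/dev/null 2>&amp;1 || echo "WARNING: log4j-elasticsearch appender jar not found in /databricks/jars"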
cat &amp;lt;&amp;lt; EOF &amp;gt;&amp;gt; /databricks/spark/dbconf/log4j/driver/log4j.properties
# RootLogger
log4j.rootLogger=INFO,stdout,elastic
# Logging Threshold
log4j.threshold=ALL
#
# stdout
# Add *stdout* to root logger above if you want to use this
#
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} (%F:%M(%L)) - %m%n
# ElasticSearch log4j appender for application
log4j.appender.elastic=com.letfy.log4j.appenders.ElasticSearchClientAppender
log4j.appender.elastic.elasticHost=internal-vip-elasticsearh-int-dev-7645241416321.us-west-2.elb.amazonaws.com
log4j.appender.elastic.hostName=my_laptop
log4j.appender.elastic.applicationName=elkdemo
log4j.appender.elastic.elasticIndex=logging-elk
log4j.appender.elastic.elasticType=logging
EOF
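# Optional (an added suggestion): print the tail of the merged config so the
# appended appender settings are visible in the cluster's init-script logs.
tail -n 5 /databricks/spark/dbconf/log4j/driver/log4j.properties || true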
""", True)&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;6. Go to the cluster -&amp;gt; "Advanced Options"-&amp;gt;"Init Scripts", and then follow the steps outlined in section "Configure a cluster-scoped init script using the UI" in the following documentation (in our case, the path to the init-script is dbfs:/dilip/init-scripts/setLog4jProperties.sh)&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="https://docs.databricks.com/clusters/init-scripts.html#cluster-scoped-init-scripts" alt="https://docs.databricks.com/clusters/init-scripts.html#cluster-scoped-init-scripts" target="_blank"&gt;https://docs.databricks.com/clusters/init-scripts.html#cluster-scoped-init-scripts&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;7. Restart the cluster and check the driver logs. Logs should now be available on your Elastic Search.&lt;/P&gt;</description>
      <pubDate>Fri, 07 May 2021 14:52:09 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-push-cluster-logs-to-elastic-search/m-p/26993#M18910</guid>
      <dc:creator>User15813097110</dc:creator>
      <dc:date>2021-05-07T14:52:09Z</dc:date>
    </item>
  </channel>
</rss>

