Log4J Custom Filter Not Working

laurencewells
New Contributor III

Hi All,

Hoping you can help. I am looking to set up a custom logging process that captures application ETL logs and streaming logs.

I have set up multiple custom logging appenders using the guide here:

https://kb.databricks.com/clusters/overwrite-log4j-logs.html

This is working and the logs are being collected as expected; however, the filter on the custom logging is not working for the customStream appender. It aims to use the same syntax as the Databricks-provided ADLS HttpTransport filter that appears earlier in the properties file.

Does anyone have any thoughts on what's going on, or a way to test this?

log4j.appender.customStream.filter.str=com.databricks.logging.DatabricksLogFilter

log4j.appender.customStream.filter.str.LoggerName=org.apache.spark.sql.execution.streaming.MicroBatchExecution

log4j.appender.customStream.filter.str.StringToMatch=progress:

log4j.appender.customStream.filter.str.AcceptOnMatch=true

log4j.appender.customStream.filter.def=com.databricks.logging.DatabricksLogFilter.DenyAllFilter
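
To check what the filter is actually doing, a minimal sketch from a Scala notebook on the same cluster (assuming the log4j 1.x API that these runtimes ship) is to write test messages through the same logger and see which ones reach the customStream file:

// Minimal sketch: emit test events against the logger that the customStream filter targets.
import org.apache.log4j.{Level, LogManager}

val streamLogger = LogManager.getLogger("org.apache.spark.sql.execution.streaming.MicroBatchExecution")
streamLogger.setLevel(Level.INFO)

// Contains the StringToMatch ("progress:"), so the filter should ACCEPT it and it
// should land in /tmp/custom/logs/log4j-customStream-file-active.log.
streamLogger.info("Streaming query made progress: {\"id\":\"test\"}")

// Does not contain "progress:", so the trailing DenyAllFilter should drop it.
streamLogger.info("this line should not appear in the customStream log")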

Full log4j.properties file:

# The driver logs will be divided into three different logs: stdout, stderr, and log4j. The stdout
# and stderr are rolled using StdoutStderrRoller. The log4j logs are again split into two: public
# and private. Stdout, stderr, and only the public log4j logs are shown to the customers.
log4j.rootCategory=INFO, publicFile
 
# Use the private logger method from the ConsoleLogging trait to log to the private file.
# All other logs will go to the public file.
log4j.logger.privateLog=INFO, privateFile
log4j.additivity.privateLog=false
 
# privateFile
log4j.appender.privateFile=com.databricks.logging.RedactionRollingFileAppender
log4j.appender.privateFile.layout=org.apache.log4j.PatternLayout
log4j.appender.privateFile.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
log4j.appender.privateFile.rollingPolicy=org.apache.log4j.rolling.TimeBasedRollingPolicy
log4j.appender.privateFile.rollingPolicy.FileNamePattern=logs/%d{yyyy-MM-dd-HH}.log.gz
log4j.appender.privateFile.rollingPolicy.ActiveFileName=logs/active.log
 
# publicFile
log4j.appender.publicFile=com.databricks.logging.RedactionRollingFileAppender
log4j.appender.publicFile.layout=org.apache.log4j.PatternLayout
log4j.appender.publicFile.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
log4j.appender.publicFile.rollingPolicy=org.apache.log4j.rolling.TimeBasedRollingPolicy
log4j.appender.publicFile.rollingPolicy.FileNamePattern=logs/log4j-%d{yyyy-MM-dd-HH}.log.gz
log4j.appender.publicFile.rollingPolicy.ActiveFileName=logs/log4j-active.log
 
# Increase log level of NewHadoopRDD so it doesn't print every split.
# (This is really because Parquet prints the whole schema for every part.)
log4j.logger.org.apache.spark.rdd.NewHadoopRDD=WARN
 
# Enable logging for Azure Data Lake (SC-14894)
log4j.logger.com.microsoft.azure.datalake.store=DEBUG
log4j.logger.com.microsoft.azure.datalake.store.HttpTransport=DEBUG
log4j.logger.com.microsoft.azure.datalake.store.HttpTransport.tokens=DEBUG
# We also add custom filter to remove excessive logging of successful http requests
log4j.appender.publicFile.filter.adl=com.databricks.logging.DatabricksLogFilter
log4j.appender.publicFile.filter.adl.LoggerName=com.microsoft.azure.datalake.store.HttpTransport
log4j.appender.publicFile.filter.adl.StringToMatch=HTTPRequest,Succeeded
log4j.appender.publicFile.filter.adl.AcceptOnMatch=false
 
# RecordUsage Category
log4j.logger.com.databricks.UsageLogging=INFO, usage
log4j.additivity.com.databricks.UsageLogging=false
log4j.appender.usage=org.apache.log4j.rolling.DatabricksRollingFileAppender
log4j.appender.usage.layout=org.apache.log4j.PatternLayout
log4j.appender.usage.layout.ConversionPattern=%m%n
log4j.appender.usage.rollingPolicy=org.apache.log4j.rolling.TimeBasedRollingPolicy
log4j.appender.usage.rollingPolicy.FileNamePattern=logs/%d{yyyy-MM-dd-HH}.usage.json.gz
log4j.appender.usage.rollingPolicy.ActiveFileName=logs/usage.json
 
# Product Logs
log4j.logger.com.databricks.ProductLogging=INFO, product
log4j.additivity.com.databricks.ProductLogging=false
log4j.appender.product=org.apache.log4j.rolling.DatabricksRollingFileAppender
log4j.appender.product.layout=org.apache.log4j.PatternLayout
log4j.appender.product.layout.ConversionPattern=%m%n
log4j.appender.product.rollingPolicy=org.apache.log4j.rolling.TimeBasedRollingPolicy
log4j.appender.product.rollingPolicy.FileNamePattern=logs/%d{yyyy-MM-dd-HH}.product.json.gz
log4j.appender.product.rollingPolicy.ActiveFileName=logs/product.json
 
# Lineage Logs
log4j.logger.com.databricks.LineageLogging=INFO, lineage
log4j.additivity.com.databricks.LineageLogging=false
log4j.appender.lineage=org.apache.log4j.rolling.DatabricksRollingFileAppender
log4j.appender.lineage.layout=org.apache.log4j.PatternLayout
log4j.appender.lineage.layout.ConversionPattern=%m%n
log4j.appender.lineage.rollingPolicy=org.apache.log4j.rolling.TimeBasedRollingPolicy
log4j.appender.lineage.rollingPolicy.FileNamePattern=logs/%d{yyyy-MM-dd-HH}.lineage.json.gz
log4j.appender.lineage.rollingPolicy.ActiveFileName=logs/lineage.json
log4j.appender.lineage.encoding=UTF-8
 
# Metrics Logs
log4j.logger.com.databricks.MetricsLogging=INFO, metrics
log4j.additivity.com.databricks.MetricsLogging=false
log4j.appender.metrics=org.apache.log4j.rolling.DatabricksRollingFileAppender
log4j.appender.metrics.layout=org.apache.log4j.PatternLayout
log4j.appender.metrics.layout.ConversionPattern=%m%n
log4j.appender.metrics.rollingPolicy=org.apache.log4j.rolling.TimeBasedRollingPolicy
log4j.appender.metrics.rollingPolicy.FileNamePattern=logs/%d{yyyy-MM-dd-HH}.metrics.json.gz
log4j.appender.metrics.rollingPolicy.ActiveFileName=logs/metrics.json
log4j.appender.metrics.encoding=UTF-8
 
# Ignore messages below warning level from Jetty, because it's a bit verbose
#log4j.logger.org.eclipse.jetty=WARN
 
log4j.appender.custom=com.databricks.logging.RedactionRollingFileAppender
log4j.appender.custom.layout=org.apache.log4j.PatternLayout
log4j.appender.custom.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} || %p || %c{1}: %m%n
log4j.appender.custom.rollingPolicy=org.apache.log4j.rolling.TimeBasedRollingPolicy
log4j.appender.custom.rollingPolicy.FileNamePattern=/tmp/custom/logs/log4j-custom-logs-%d{yyyy-MM-dd-HH}.log.gz
log4j.appender.custom.rollingPolicy.ActiveFileName=/tmp/custom/logs/log4j-custom-file-active.log
log4j.logger.appLogger=INFO, custom
log4j.additivity.appLogger=false
 
log4j.logger.org.apache.spark.sql.execution.streaming.MicroBatchExecution=INFO, customStream
log4j.additivity.org.apache.spark.sql.execution.streaming.MicroBatchExecution=false
log4j.appender.customStream=com.databricks.logging.RedactionRollingFileAppender
log4j.appender.customStream.layout=org.apache.log4j.PatternLayout
log4j.appender.customStream.layout.ConversionPattern=%m%n
log4j.appender.customStream.rollingPolicy=org.apache.log4j.rolling.TimeBasedRollingPolicy
log4j.appender.customStream.rollingPolicy.FileNamePattern=/tmp/custom/logs/log4j-customStream-logs-%d{yyyy-MM-dd-HH}.log.gz
log4j.appender.customStream.rollingPolicy.ActiveFileName=/tmp/custom/logs/log4j-customStream-file-active.log
log4j.appender.customStream.filter.str=com.databricks.logging.DatabricksLogFilter
log4j.appender.customStream.filter.str.LoggerName=org.apache.spark.sql.execution.streaming.MicroBatchExecution
log4j.appender.customStream.filter.str.StringToMatch=progress:
log4j.appender.customStream.filter.str.AcceptOnMatch=true
log4j.appender.customStream.filter.def=com.databricks.logging.DatabricksLogFilter.DenyAllFilter
 
log4j.appender.customJson=com.databricks.logging.RedactionRollingFileAppender
log4j.appender.customJson.layout=org.apache.log4j.PatternLayout
log4j.appender.customJson.layout.ConversionPattern=%m%n
log4j.appender.customJson.rollingPolicy=org.apache.log4j.rolling.TimeBasedRollingPolicy
log4j.appender.customJson.rollingPolicy.FileNamePattern=/tmp/custom/logs/log4j-customJSON-logs-%d{yyyy-MM-dd-HH}.json.gz
log4j.appender.customJson.rollingPolicy.ActiveFileName=/tmp/custom/logs/log4j-customJSON-file-active.json
log4j.logger.appLoggerJson=INFO, customJson
log4j.additivity.appLoggerJson=false
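
For the application ETL side, the appLogger and appLoggerJson loggers above only receive events when code looks them up by name; a rough Scala sketch of that (logger names taken from the config, message contents purely illustrative):

// Sketch: write to the custom appenders defined above from application code.
import org.apache.log4j.LogManager

val appLogger = LogManager.getLogger("appLogger")           // routed to the "custom" appender
appLogger.info("ETL step finished")                         // -> /tmp/custom/logs/log4j-custom-file-active.log

val appLoggerJson = LogManager.getLogger("appLoggerJson")   // routed to the "customJson" appender
appLoggerJson.info("""{"step":"example","status":"ok"}""")  // -> /tmp/custom/logs/log4j-customJSON-file-active.json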



Anonymous
Not applicable

Hello @Laurence Wells. It's great to meet you and thank you for your question! We'll give the members a chance to answer before we come back to this. Thanks for your patience!

Kaniz
Community Manager

Hi @Laurence Wells, please go through the blog.

Anonymous
Not applicable

Hey there @Laurence Wells,

Hope you are doing great.

Does @Kaniz Fatma's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly?

Thanks!

Hey, unfortunately not. That is a blog about the Log4j vulnerability. Fortunately, Databricks is upgrading to Log4j 2 in Runtime 11, so it's a moot point now. Version 2 has much better filters, etc.
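
For reference, the rough Log4j 2 equivalent of the customStream filter (properties syntax; an untested sketch with illustrative names, so check it against the Log4j 2 docs) would look something like:

# Untested sketch of the same idea in Log4j 2 properties syntax.
# StringMatchFilter accepts events containing "progress:"; everything else is denied.
appender.customStream.type = RollingFile
appender.customStream.name = customStream
appender.customStream.fileName = /tmp/custom/logs/log4j-customStream-file-active.log
appender.customStream.filePattern = /tmp/custom/logs/log4j-customStream-logs-%d{yyyy-MM-dd-HH}.log.gz
appender.customStream.layout.type = PatternLayout
appender.customStream.layout.pattern = %m%n
appender.customStream.policies.type = Policies
appender.customStream.policies.time.type = TimeBasedTriggeringPolicy
appender.customStream.filter.progress.type = StringMatchFilter
appender.customStream.filter.progress.text = progress:
appender.customStream.filter.progress.onMatch = ACCEPT
appender.customStream.filter.progress.onMismatch = DENY

logger.microbatch.name = org.apache.spark.sql.execution.streaming.MicroBatchExecution
logger.microbatch.level = info
logger.microbatch.additivity = false
logger.microbatch.appenderRef.customStream.ref = customStream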
