Merging multi-line data:
# with an input plugin:
# you can also use this codec with an output.
input {
  file {
    codec => multiline {
charset => ... # string, one of ["ASCII-8BIT", "Big5", "Big5-HKSCS", "Big5-UAO", "CP949", "Emacs-Mule", "EUC-JP", "EUC-KR", "EUC-TW", "GB18030", "GBK", "ISO-8859-1", "ISO-8859-2", "ISO-8859-3", "ISO-8859-4", "ISO-8859-5", "ISO-8859-6", "ISO-8859-7", "ISO-8859-8", "ISO-8859-9", "ISO-8859-10", "ISO-8859-11", "ISO-8859-13", "ISO-8859-14", "ISO-8859-15", "ISO-8859-16", "KOI8-R", "KOI8-U", "Shift_JIS", "US-ASCII", "UTF-8", "UTF-16BE", "UTF-16LE", "UTF-32BE", "UTF-32LE", "Windows-1251", "GB2312", "IBM437", "IBM737", "IBM775", "CP850", "IBM852", "CP852", "IBM855", "CP855", "IBM857", "IBM860", "IBM861", "IBM862", "IBM863", "IBM864", "IBM865", "IBM866", "IBM869", "Windows-1258", "GB1988", "macCentEuro", "macCroatian", "macCyrillic", "macGreek", "macIceland", "macRoman", "macRomania", "macThai", "macTurkish", "macUkraine", "CP950", "CP951", "stateless-ISO-2022-JP", "eucJP-ms", "CP51932", "GB12345", "ISO-2022-JP", "ISO-2022-JP-2", "CP50220", "CP50221", "Windows-1252", "Windows-1250", "Windows-1256", "Windows-1253", "Windows-1255", "Windows-1254", "TIS-620", "Windows-874", "Windows-1257", "Windows-31J", "MacJapanese", "UTF-7", "UTF8-MAC", "UTF-16", "UTF-32", "UTF8-DoCoMo", "SJIS-DoCoMo", "UTF8-KDDI", "SJIS-KDDI", "ISO-2022-JP-KDDI", "stateless-ISO-2022-JP-KDDI", "UTF8-SoftBank", "SJIS-SoftBank", "BINARY", "CP437", "CP737", "CP775", "IBM850", "CP857", "CP860", "CP861", "CP862", "CP863", "CP864", "CP865", "CP866", "CP869", "CP1258", "Big5-HKSCS:2008", "eucJP", "euc-jp-ms", "eucKR", "eucTW", "EUC-CN", "eucCN", "CP936", "ISO2022-JP", "ISO2022-JP2", "ISO8859-1", "CP1252", "ISO8859-2", "CP1250", "ISO8859-3", "ISO8859-4", "ISO8859-5", "ISO8859-6", "CP1256", "ISO8859-7", "CP1253", "ISO8859-8", "CP1255", "ISO8859-9", "CP1254", "ISO8859-10", "ISO8859-11", "CP874", "ISO8859-13", "CP1257", "ISO8859-14", "ISO8859-15", "ISO8859-16", "CP878", "CP932", "csWindows31J", "SJIS", "PCK", "MacJapan", "ASCII", "ANSI_X3.4-1968", "646", "CP65000", "CP65001", "UTF-8-MAC", "UTF-8-HFS", "UCS-2BE", 
"UCS-4BE", "UCS-4LE", "CP1251", "external", "locale"] (optional), default: "UTF-8"
      multiline_tag => ... # string (optional), default: "multiline"
      negate => ... # boolean (optional), default: false
      pattern => ... # string (required)
      patterns_dir => ... # array (optional), default: []
      what => ... # string, one of ["previous", "next"] (required)
    }
  }
}
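The negate field is a switch that chooses between normal and inverted matching: with negate => false (the default), lines that match pattern are merged; with negate => true, lines that do not match pattern are merged. The what field decides whether such a line is attached to the previous or the next line.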
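To make the interaction of pattern, negate and what concrete, here is a minimal Ruby sketch. The helper `merge_lines` and the sample log lines are made up for illustration; it only imitates the grouping behaviour described above and is not the codec's actual implementation.

```ruby
# Hypothetical helper (not part of Logstash): groups raw lines into events
# the way the pattern / negate / what settings are described to work.
def merge_lines(lines, pattern:, negate: false, what: "previous")
  events = []
  buffer = []
  lines.each do |line|
    matched = !!(line =~ pattern)
    matched = !matched if negate # negate flips the result of the match
    case what
    when "previous"
      # A "matched" line is appended to the event that came before it.
      if matched && !buffer.empty?
        buffer << line
      else
        events << buffer.join("\n") unless buffer.empty?
        buffer = [line]
      end
    when "next"
      # A "matched" line is held back and joined with the line after it.
      buffer << line
      unless matched
        events << buffer.join("\n")
        buffer = []
      end
    else
      raise ArgumentError, 'what must be "previous" or "next"'
    end
  end
  events << buffer.join("\n") unless buffer.empty?
  events
end

# Made-up sample input: a log entry followed by a two-line stack trace.
LOG_LINES = [
  "2015-03-01 10:00:00 ERROR something failed",
  "java.lang.RuntimeException: boom",
  "    at com.example.Main.run(Main.java:10)",
  "2015-03-01 10:00:01 INFO recovered"
].freeze

# Lines that do NOT start with "2015" are glued to the previous line.
EVENTS = merge_lines(LOG_LINES, pattern: /^2015/, negate: true, what: "previous")
EVENTS.each_with_index { |event, i| puts "event #{i}:\n#{event}\n\n" }
```

With these settings the stack trace ends up inside the first event, and the second timestamped line starts a new event.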
Copying the @timestamp field:
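filter {
  ruby {
    # Logstash 5.0 and later require the event API instead:
    # event.set('read_time', event.get('@timestamp'))
    code => "event['read_time'] = event['@timestamp']"
  }
  mutate {
    # Stores the timestamp as a plain string in a second field.
    add_field => ["read_time_string", "%{@timestamp}"]
  }
}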
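Multi-line matching:
When grok is used together with codec/multiline, there is one thing to watch out for: grok regular expressions behave like ordinary regular expressions, so by default "." does not match carriage returns or newlines. Just as you would write =~ //m in Ruby, you have to enable this explicitly, by putting the (?m) flag at the very beginning of the expression. For example: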
match => {
  "message" => "(?m)\s+(?<request_time>\d+(?:\.\d+)?)\s+"
}
The passage above is quoted from another source.
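The effect of (?m) can be tried out in plain Ruby, whose regular expressions follow the same Oniguruma syntax that grok uses. This is only a sketch with a made-up message; note that the flag changes the behaviour of "." (and therefore of dot-based grok patterns such as GREEDYDATA), while \s matches newlines either way.

```ruby
# Made-up multi-line event, as the multiline codec would hand it to grok.
MESSAGE = "2015-03-01 10:00:00 ERROR boom\n    at Main.run(Main.java:10)"

# Without (?m), "." stops at the newline, so the capture ends on the first line.
SINGLE = MESSAGE[/ERROR (?<rest>.*)/, "rest"]

# With (?m), "." also matches "\n", so the capture runs to the end of the event.
MULTI = MESSAGE[/(?m)ERROR (?<rest>.*)/, "rest"]

puts "without (?m): #{SINGLE.inspect}"
puts "with (?m):    #{MULTI.inspect}"
```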
The final configuration file:
input {
  file {
    type => "type"
    path => ["info.log"]
    exclude => ["*.gz", "access.log"]
    codec => multiline {
      pattern => "^2015"
      negate => true
      what => "previous"
    }
  }
}

filter {
  grok {
    match => {
      "message" => "(?m)%{TIMESTAMP_ISO8601:logtime}"
    }
  }
  ruby {
    code => "event['readtime'] = event['@timestamp']"
  }
  date {
    #locale => "en"
    match => ["logtime", "YYYY-MM-dd HH:mm:ss"]
    #timezone => "UTC"
    #target => "logtimestamp"
    remove_field => [ "logtime"]
  }
}

output {
  stdout {}
  redis {
    host => "127.0.0.1"
    port => 6379
    data_type => "list"
    key => "key_count"
  }
}
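The order of the three filters matters: readtime has to be copied before the date filter replaces @timestamp with the time parsed from the log line. The following Ruby sketch walks one event through the same steps on a plain Hash. `apply_filters` and `LOGTIME_RE` are hypothetical names, the regular expression is a simplified stand-in for TIMESTAMP_ISO8601, and time zone handling is left out.

```ruby
require "time"

# Simplified stand-in for %{TIMESTAMP_ISO8601:logtime}; the real grok pattern
# accepts more variants (fractional seconds, time zone suffixes, ...).
LOGTIME_RE = /(?m)(?<logtime>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})/

# Hypothetical function: imitates the grok, ruby and date steps on a Hash.
def apply_filters(event)
  # grok: extract the log time from the (possibly multi-line) message
  match = LOGTIME_RE.match(event["message"])
  event["logtime"] = match[:logtime] if match

  # ruby: remember when the line was read, before @timestamp is overwritten
  event["readtime"] = event["@timestamp"]

  # date: on a successful parse, replace @timestamp and drop logtime
  if event["logtime"]
    event["@timestamp"] = Time.strptime(event["logtime"], "%Y-%m-%d %H:%M:%S")
    event.delete("logtime")
  end
  event
end

READ_AT = Time.now
RESULT = apply_filters({
  "message" => "2015-03-01 10:00:00 ERROR boom\n    at Main.run(Main.java:10)",
  "@timestamp" => READ_AT
})

puts "readtime:   #{RESULT['readtime']}"
puts "@timestamp: #{RESULT['@timestamp']}"
```

After the run, readtime holds the moment the line was read, while @timestamp holds the time written in the log entry itself.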
Built-in grok patterns: