解析 query.log 的一种方法
当通过 Neo4j Enterprise 参数 dbms.logs.query.enabled 启用了 query.log 时,随附的 bash 脚本可用于快速解析日志,并根据总执行时间识别出前 10 条最耗时的查询;如果同时启用了 dbms.logs.query.time_logging_enabled,则还会报告基于规划时间、CPU 时间和等待时间的前 10 条最耗时查询。
bash 脚本如下
#!/bin/bash
# Copyright (c) 2002-2018 "Neo Technology,"
# Network Engine for Objects in Lund AB [http://neotechnology.com]
# This file is a commercial add-on to Neo4j Enterprise Edition.
# only parse those lines where the 3rd field is 'INFO'
# count total number of lines in query.log which report INFO in the 3rd field
parse() {
logfile=$1
if [ -e $logfile ]; then
# log file exits
awk '$3== "INFO" {print "First Query Reported at: " $1 " " $2}' $logfile | head -1
awk '$3== "INFO" {print " Last Query Reported at: " $1 " " $2}' $logfile | tail -1
awk '$3== "INFO" {count++; total=total+$4 } END{print "\n*******EXECUTION:******* \nTotal # of Completed Queries: " count "\n Total Duration (msec): " total "\n Avg of all Queries (msec): " total/count "\n" }' $logfile
# print longest query
printf "Top 10 Longest Queries longest on top and leading line number: \n"
printf "Note: Queries which are multi-line will only report first line of query !!!\n\n\n"
awk '$3== "INFO" {print "line:" NR "\t" $0}' $logfile | sort -n -r -k 5 | head -10
pcw=`grep "INFO" $logfile | grep "(planning:" | head -1`
if [ ! -z "$pcw" ]; then
# only enter this block if conf/neo4j.conf has
# dbms.logs.query.time_logging_enabled=true
# which thus causes field 6 to appear as `(planning:...........`
#
# produce metrics on planning, CPU and waiting
#PLANING
awk '$3== "INFO" && $6== "(planning:" {count++; total=total+$7 } $7== "0," {planzero++;} END{print "\n\t\t*******PLANNING:******* \n Total # of Completed Queries: " count "\n Total Duration (msec): " total "\n Avg of all Queries (msec): " total/count "\nNumber of Queries NOT Planned: " planzero " " (planzero/count)*100 "%\n" }' $logfile
# print longest query
awk '$3== "INFO" {print "line:" NR "\t" $0}' $logfile | sort -n -r -k 8 | head -10
#CPU
awk '$3== "INFO" && $8== "cpu:" {count++; total=total+$9 } END{print "\n\t\t*******CPU:******* \nTotal # of Completed Queries: " count "\n Total Duration (msec): " total "\n Avg of all Queries (msec): " total/count "\n" }' $logfile
awk '$3== "INFO" {print "line:" NR "\t" $0}' $logfile | sort -n -r -k 10 | head -10
#WAITING
awk '$3== "INFO" &&$10== "waiting:" {count++; total=total+$11 } $11== "0)" {nowait++;} END{print "\n\t\t\t*******WAITING:******* \n Total # of Completed Queries: " count "\n Total Duration (msec): " total "\n Avg of all Queries (msec): " total/count "\nNumber of Queries NOT Waiting: " nowait " " (nowait/count)*100 "%\n" }' $logfile
awk '$3== "INFO" {print "line:" NR "\t" $0}' $logfile | sort -n -r -k 12 | head -10
fi
else
# logfile does not exist
printf "\n$logfile does not exist\n"
printf "Usage: $0 <filename>\n"
fi
}
if [ -z $1 ]; then
parse "query.log"
else
parse $1
fi
运行脚本并传入要分析的 query.log 文件,例如 ./parse_querylog.sh query.log,输出将类似于
-
当
dbms.logs.query.time_logging_enabled=false时First Query Reported at: 2018-04-10 11:48:36.425+0000 Last Query Reported at: 2018-04-10 11:49:01.213+0000 *******EXECUTION:******* Total # of Completed Queries: 6 Total Duration (msec): 2521 Avg of all Queries (msec): 420.167 Top 10 Longest Queries longest on top and leading line number: Note: Queries which are multi-line will only report first line of query !!! line:1 2018-04-10 11:48:36.425+0000 INFO 2398 ms: embedded-session - MATCH (a:` Arbitrary label name that really doesn't matter `) RETURN a LIMIT 0 - {} - {} line:2 2018-04-10 11:48:37.106+0000 INFO 48 ms: bolt-session bolt neo4j-java/dev client/127.0.0.1:35796 server/127.0.0.1:7687> - RETURN 1 - {} - {} line:3 2018-04-10 11:48:43.839+0000 INFO 33 ms: bolt-session bolt neo4j-java/dev client/127.0.0.1:35796 server/127.0.0.1:7687> - match (n) return count(n); - {} - {} line:9 2018-04-10 11:49:01.213+0000 INFO 21 ms: bolt-session bolt neo4j-java/dev client/127.0.0.1:35796 server/127.0.0.1:7687> - line:4 2018-04-10 11:48:47.494+0000 INFO 12 ms: bolt-session bolt neo4j-java/dev client/127.0.0.1:35796 server/127.0.0.1:7687> - line:6 2018-04-10 11:48:51.613+0000 INFO 9 ms: bolt-session bolt neo4j-java/dev client/127.0.0.1:35796 server/127.0.0.1:7687> - -
当
dbms.logs.query.time_logging_enabled=true时(注意:为简洁起见,每个章节中 10 条查询的列表已缩减为 2 行。此外,查询的 Cypher 语句也已简化)
First Query Reported at: 2018-01-24 07:58:13.360+0000 Last Query Reported at: 2018-01-24 15:19:40.897+0000 *******EXECUTION:******* Total # of Completed Queries: 13655 Total Duration (msec): 1207191 Avg of all Queries (msec): 88.4065 Top 10 Longest Queries longest on top and leading line number: Note: Queries which are multi-line will only report first line of query !!! line:12100 2018-01-24 08:49:14.144+0000 INFO 10082 ms: (planning: 0, cpu: 81, waiting: 0) - bolt-session bolt neo4j neo4j-java/1.4.2-45c2930bc28fac23dda088b300977b804fedf8bb client/10.21.1.3:56870 server/10.0.1.2:7687> neo4j - MATCH (n:Person............) line:58883 2018-01-24 12:28:17.472+0000 INFO 2530 ms: (planning: 313, cpu: 2520, waiting: 0) - bolt-session bolt neo4j neo4j-java/1.4.2-45c2930bc28fac23dda088b300977b804fedf8bb client/10.21.21.27:56870 server/10.0.1.2:7687> neo4j - MATCH (n:Person) *******PLANNING:******* Total # of Completed Queries: 13655 Total Duration (msec): 8798 Avg of all Queries (msec): 0.644306 Number of Queries NOT Planned: 11448 83.8374% line:37438 2018-01-24 10:43:28.532+0000 INFO 2255 ms: (planning: 991, cpu: 2239, waiting: 0) - bolt-session bolt neo4j neo4j-java/1.4.2-45c2930bc28fac23dda088b300977b804fedf8bb client/10.21.21.45:56870 server/10.0.1.2:7687> neo4j - MATCH (n:Person...............) line:89641 2018-01-24 15:13:42.191+0000 INFO 513 ms: (planning: 409, cpu: 510, waiting: 0) - bolt-session bolt neo4j neo4j-java/1.4.2-45c2930bc28fac23dda088b300977b804fedf8bb client/10.21.21.45:56870 server/10.0.1.2:7687> neo4j - MATCH (n:Person...........) {} *******CPU:******* Total # of Completed Queries: 13655 Total Duration (msec): 1194341 Avg of all Queries (msec): 87.4655 line:58883 2018-01-24 12:28:17.472+0000 INFO 2530 ms: (planning: 313, cpu: 2520, waiting: 0) - bolt-session bolt neo4j neo4j-java/1.4.2-45c2930bc28fac23dda088b300977b804fedf8bb client/10.21.21.27:56870 server/10.0.1.2:7687> neo4j - MATCH (n:Person .......) line:386 2018-01-24 07:59:54.851+0000 INFO 2359 ms: (planning: 256, cpu: 2350, waiting: 0) - bolt-session bolt neo4j neo4j-java/1.4.2-45c2930bc28fac23dda088b300977b804fedf8bb client/10.21.21.27:49536 server/10.0.1.2:7687> neo4j - MATCH (n:Person {.......}) *******WAITING:******* Total # of Completed Queries: 13655 Total Duration (msec): 27 Avg of all Queries (msec): 0.0019773 Number of Queries NOT Waiting: 13654 99.9927% line:81 2018-01-24 07:58:33.168+0000 INFO 67 ms: (planning: 0, cpu: 67, waiting: 27) - bolt-session bolt neo4j neo4j-java/1.4.2-45c2930bc28fac23dda088b300977b804fedf8bb client/10.21.21.27:49536 server/10.0.1.2:7687> neo4j - MATCH (n:Person) return n:Deleted line:9991 2018-01-24 08:40:16.476+0000 INFO 74 ms: (planning: 0, cpu: 74, waiting: 0) - bolt-session bolt neo4j neo4j-java/1.4.2-45c2930bc28fac23dda088b300977b804fedf8bb client/10.21.21.27:49536 server/10.0.1.2:7687> neo4j - MATCH (n:Person..............)
从第二次运行的输出可以看到日志跨度为 First Query Reported at: 2018-01-24 07:58:13.360+0000 到 Last Query Reported at: 2018-01-24 15:19:40.897+0000。脚本随后报告我们找到了 Total # of Completed Queries: 13655,并提供了基于总/平均执行时间、总/平均规划时间、总/平均 CPU 时间以及总/平均等待时间的前 10 条最耗时查询列表。此外,从输出可以看到“查询规划”并不是一个重要问题,因为 Number of Queries NOT Planned: 11448 83.8374% 表明 83.8374% 的查询是从查询计划缓存中获取的。最后,“锁定”也不是问题,因为 Number of Queries NOT Waiting: 13654 99.9927% 表示 99.9927% 的查询没有花费时间在“等待”状态。
| 脚本有一些‘注意事项’。 |
-
上述脚本会报告日志文件中满足任意‘前 10’结果的查询所在的行号。然而,如果查询是多行的(包括换行符),则仅显示查询的第一行。需要在相应的行号处读取 query.log 才能完整了解该查询。
-
如果在 Neo4j 刚刚冷启动后立即对 query.log 进行分析,报告的时间可能会被夸大。这是因为查询未命中查询计划缓存或页面缓存所致。
-
可以通过参数
dbms.logs.query.threshold配置 query.log 只记录运行时间超过 X 的查询。如果将其设为 0,则会记录所有查询;如果设为 2s,则仅记录运行时间超过 2 秒的查询,这会影响上述结果。
此页面有帮助吗?