知识库

debug.log 中关于“最后检查点之后发现的提交 (Commits found after last checkpoint)”消息的解释

例如在运行备份时

$ bin/neo4j-admin backup --backup-dir=/home/neo4j/backups/01172020/ --name=graph.db

你可能会在该命令的输出中看到类似以下细节

2020-01-17 12:54:06.767+0000 INFO [o.n.b.i.BackupOutputMonitor] Finish receiving store file /home/neo4j/backups/backups/01172020/graph.db/temp-copy/schema/index/lucene_native-2.0/5/native-1.0/profiles/index-5.708254.cacheprof
2020-01-17 12:54:06.767+0000 INFO [o.n.b.i.BackupOutputMonitor] Start receiving store file /home/neo4j/backups/backups/01172020/graph.db/temp-copy/schema/index/lucene_native-2.0/5/string-1.0/profiles/index-5.708254.cacheprof
2020-01-17 12:54:06.767+0000 INFO [o.n.b.i.BackupOutputMonitor] Finish receiving store file /home/neo4j/backups/backups/01172020/graph.db/temp-copy/schema/index/lucene_native-2.0/5/string-1.0/profiles/index-5.708254.cacheprof
2020-01-17 12:54:06.767+0000 INFO [o.n.b.i.BackupOutputMonitor] Start receiving store file /home/neo4j/backups/backups/01172020/graph.db/temp-copy/neostore
2020-01-17 12:54:06.767+0000 INFO [o.n.b.i.BackupOutputMonitor] Finish receiving store file /home/neo4j/backups/backups/01172020/graph.db/temp-copy/neostore
2020-01-17 12:54:06.769+0000 INFO [o.n.b.i.BackupOutputMonitor] Finish receiving store files, took 9s 552ms
2020-01-17 12:54:06.869+0000 INFO [o.n.b.i.BackupOutputMonitor] Start recovering store
2020-01-17 12:59:34.411+0000 INFO [o.n.b.i.BackupOutputMonitor] Finish recovering store, took 5m 27s 542ms
2020-01-17 12:59:37.370+0000 INFO [o.n.b.i.BackupOutputMonitor] Finished, took 5m 40s 157m

并且会发现从 Start Recovering storeFinished recovering store 之间出现长时间的暂停(例如 5 分钟以上)。

需要说明的是,在屏幕输出的同时,我们还会将日志写入备份目标路径下的 debug.log.<epochtime>(本例中为 /home/neo4j/backups/01172020/),你会看到类似以下的消息

2020-01-17 12:54:06.869+0000 INFO [o.n.k.NeoStoreDataSource] Commits found after last check point (which is at LogPosition{logVersion=26059, byteOffset=219081898}). First txId after last checkpoint: 643034051
2020-01-17 12:54:06.929+0000 INFO [o.n.k.NeoStoreDataSource] Recovery required from position LogPosition{logVersion=26059, byteOffset=219081898}
2020-01-17 12:54:34.411+0000 INFO [o.n.k.r.Recovery]   10% completed
2020-01-17 12:54:35.232+0000 INFO [o.n.k.r.Recovery]   20% completed
2020-01-17 12:54:35.338+0000 INFO [o.n.k.r.Recovery]   30% completed
2020-01-17 12:56:02.151+0000 INFO [o.n.k.r.Recovery]   40% completed
2020-01-17 12:56:21.642+0000 INFO [o.n.k.r.Recovery]   50% completed
2020-01-17 12:56:49.419+0000 INFO [o.n.k.r.Recovery]   60% completed
2020-01-17 12:58:06.401+0000 INFO [o.n.k.r.Recovery]   70% completed
2020-01-17 12:58:14.209+0000 INFO [o.n.k.r.Recovery]   80% completed
2020-01-17 12:58:48.040+0000 INFO [o.n.k.r.Recovery]   90% completed
2020-01-17 12:59:34.411+0000 INFO [o.n.k.r.Recovery]   100% completed
2020-01-17 12:59:34.411+0000 INFO [o.n.k.NeoStoreDataSource] Recovery completed. 215589 transactions, first:643034051, last:643249639 recovered

产生这些消息以及延迟的原因在于备份的执行方式以及其检查点的工作方式。备份启动后首先会执行一次数据库检查点,该检查点会将所有 pagecache 数据刷新到存储文件(即 graph.db/neostore.*.db)中。随后备份会把运行中的 data/databases/graph.db 文件复制到备份目标位置。当所有文件复制完成后,备份会回溯查看自最初复制文件后是否有新的事务产生;如果有事务,则会执行上文所述的“恢复”。例如,备份在 09:00 开始,09:00‑09:04 期间将 data/databases/graph.db 的所有文件复制到备份目标,在这 4 分钟内有 10,000 条新事务提交到数据库,那么备份会进行一次“恢复”,以捕获这额外的 10,000 条事务,确保它们也写入备份目标。

© . This site is unofficial and not affiliated with Neo4j, Inc.