Using neo4j-admin copy to compact a database in Neo4j 4.0
This article demonstrates how to use the neo4j-admin copy tool to reclaim unused space in Neo4j store files.
1). Create 100,000 nodes: FOREACH (x IN range(1,100000) | CREATE (n:testnode1 {id:x})).
2). Check the allocated ID range: MATCH (n:testnode1) RETURN ID(n) AS ID ORDER BY ID LIMIT 5.
Ordered ascending, the IDs are 0, 1, 2, 3, 4; ordered descending (ORDER BY ID DESC), they are 99999, 99998, 99997, 99996, 99995.
3). Run :sysinfo. It reports Total Store Size=18.6 MiB and ID Allocation: Node ID 100000, Property ID 100000.
4). Delete the nodes created above with MATCH (n) DETACH DELETE n.
5). Run :sysinfo again. It still reports Total Store Size=18.6 MiB and ID Allocation: Node ID 100000, Property ID 100000.
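Besides :sysinfo, the on-disk footprint can also be checked from the shell. The following is a minimal sketch and not part of the original procedure: `store_size_kib` is a hypothetical helper name, and the database path in the usage comment assumes the default layout under $neo4j_home/data/databases. Because `du` reports file system usage, its figure may not match the Total Store Size shown by :sysinfo exactly.

```shell
#!/bin/sh
# Hypothetical helper: print the disk usage of a directory in KiB.
# `du -sk` summarises the directory (-s) in 1024-byte units (-k).
store_size_kib() {
    du -sk "$1" | awk '{ print $1 }'
}

# Example usage (the path is an assumption; adjust it to your installation):
#   store_size_kib "$NEO4J_HOME/data/databases/neo4j"
# Run it before and after the DETACH DELETE in step 4 and compare the values.
```

Running the helper before and after the delete gives a second, tool-independent view of whether the store files shrank.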
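6). Take a full backup with neo4j-admin backup (/docs/operations-manual/current/backup-restore/online-backup/). By default, the online backup triggers a checkpoint, which flushes cached updates from the page cache to the store files.
7). Step 5 shows that deleting the nodes neither changed the allocated IDs nor shrank the store. In this situation, or in a production database where frequent bulk writes and deletes leave a lot of unused space in the store files, you can use the neo4j-admin copy tool introduced in 4.0 (essentially an integrated version of store-utils): /docs/operations-manual/current/tools/neo4j-admin/#neo4j-admin-syntax-and-commands. The backup produced in step 6 can be used as the input to neo4j-admin copy. Note that neo4j-admin copy can only be run against an **offline database or a backup**.
8). For example, run neo4j-admin copy as follows:
$ ./bin/neo4j-admin copy --from-database=neo4j --to-database=1/backups/copy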
Starting to copy store, output will be saved to: /$neo4j_home/logs/neo4j-admin-copy-2020-01-16.12.06.38.log
2020-01-16 12:06:38.777+0000 INFO [StoreCopy] ### Copy Data ###
2020-01-16 12:06:38.778+0000 INFO [StoreCopy] Source: /Users/um/neo4j/4.0/cc/1/data/databases/neo4j
2020-01-16 12:06:38.778+0000 INFO [StoreCopy] Target: /Users/um/neo4j/4.0/cc/1/data/databases/1/backups/copy
2020-01-16 12:06:38.779+0000 INFO [StoreCopy] Empty database created, will start importing readable data from the source.
2020-01-16 12:06:40.159+0000 INFO [o.n.i.b.ImportLogic] Import starting
Import starting 2020-01-16 12:06:40.227+0000
Estimated number of nodes: 0.00
Estimated number of node properties: 0.00
Estimated number of relationships: 0.00
Estimated number of relationship properties: 0.00
Estimated disk space usage: 3.922MiB
Estimated required memory usage: 7.969MiB
(1/4) Node import 2020-01-16 12:06:40.604+0000
Estimated number of nodes: 0.00
Estimated disk space usage: 1.961MiB
Estimated required memory usage: 7.969MiB
(2/4) Relationship import 2020-01-16 12:06:42.804+0000
Estimated number of relationships: 0.00
Estimated disk space usage: 1.961MiB
Estimated required memory usage: 7.969MiB
(3/4) Relationship linking 2020-01-16 12:06:43.046+0000
Estimated required memory usage: 7.969MiB
(4/4) Post processing 2020-01-16 12:06:43.461+0000
Estimated required memory usage: 7.969MiB
-......... .......... .......... .......... .......... 5% ∆226ms
.......... .......... .......... .......... .......... 10% ∆1ms
.......... .......... .......... .......... .......... 15% ∆1ms
.......... .......... .......... .......... .......... 20% ∆1ms
.......... .......... .......... .......... .......... 25% ∆0ms
.......... .......... .......... .......... .......... 30% ∆1ms
.......... .......... .......... .......... .......... 35% ∆0ms
.......... .......... .......... .......... .......... 40% ∆1ms
.......... .......... .......... .......... .......... 45% ∆0ms
.......... .......... .......... .......... .......... 50% ∆1ms
.......... .......... .......... .......... .......... 55% ∆0ms
.......... .......... .......... .......... .......... 60% ∆0ms
.......... .......... .......... .......... .......... 65% ∆1ms
.......... .......... .......... .......... .......... 70% ∆0ms
.......... .......... .......... .......... .......... 75% ∆1ms
.......... .......... .......... .......... .......... 80% ∆0ms
.......... .......... .......... .......... .......... 85% ∆0ms
.......... .......... .......... .......... .......... 90% ∆1ms
.......... .......... .......... .......... .......... 95% ∆0ms
.......... .......... .......... .......... .......... 100% ∆1ms
IMPORT DONE in 3s 860ms.
Imported:
0 nodes
0 relationships
0 properties
Peak memory usage: 7.969MiB
2020-01-16 12:06:44.031+0000 INFO [o.n.i.b.ImportLogic] Import completed successfully, took 3s 860ms. Imported:
0 nodes
0 relationships
0 properties
2020-01-16 12:06:44.318+0000 INFO [StoreCopy] Import summary: Copying of 200622 records took 5 seconds (40124 rec/s). Unused Records 200622 (100%) Removed Records 0 (0%)
2020-01-16 12:06:44.318+0000 INFO [StoreCopy] ### Extracting schema ###
2020-01-16 12:06:44.319+0000 INFO [StoreCopy] Trying to extract schema...
2020-01-16 12:06:44.330+0000 INFO [StoreCopy] ... found 0 schema definition. The following can be used to recreate the schema:
2020-01-16 12:06:44.332+0000 INFO [StoreCopy]
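The example above completed in about 6 seconds and produced a compacted, consistent store: inconsistent nodes, properties, and relationships are not copied to the new store. Note also that the "copy" directory was created at $neo4j_home/data/databases/1/backups/copy (see the Target line in the log), not at /current-directory/1/backups/copy, because the copy tool prepends $neo4j_home/data/databases to the target it is given.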
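The `Import summary` line in the log above carries the figures that matter for compaction: how many records were scanned and how many of them were unused. As an illustration only, the sketch below extracts those numbers with `sed`. The function name `summarize_copy_log` is hypothetical, and the pattern is written against the single log line shown in this article, so it may need adjusting for other versions.

```shell
#!/bin/sh
# Hypothetical helper: read neo4j-admin copy log text on stdin and print
# "copied=<n> seconds=<n> unused=<n>" for the StoreCopy import summary line.
summarize_copy_log() {
    sed -n 's/.*Import summary: Copying of \([0-9][0-9]*\) records took \([0-9][0-9]*\) seconds.*Unused Records \([0-9][0-9]*\) .*/copied=\1 seconds=\2 unused=\3/p'
}

# The summary line from the run shown above.
line='2020-01-16 12:06:44.318+0000 INFO [StoreCopy] Import summary: Copying of 200622 records took 5 seconds (40124 rec/s). Unused Records 200622 (100%) Removed Records 0 (0%)'

printf '%s\n' "$line" | summarize_copy_log
```

For this run the helper prints `copied=200622 seconds=5 unused=200622`: every record scanned was unused, which matches the 0 nodes, 0 relationships, and 0 properties reported as imported.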
9). Next, restore the copy on a separate Neo4j 4.0 instance and compare the restored store size with the 18.6 MiB recorded earlier: ./sa/bin/neo4j-admin restore --from=cc/1/data/databases/1/backups/copy --verbose --database=sa/data/databases/neo4j --force
Note that the restored neo4j database is placed under $neo4j_home/data/databases/sa/data/databases, because here too $neo4j_home/data/databases is prepended to the target.
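Since a relative database name ends up under $neo4j_home/data/databases in this walkthrough, it can help to work out the resulting directory before running the tool. The sketch below simply reproduces the behaviour observed in the Target line of the copy log in step 8; it is not taken from the tool's documentation, and `resolve_db_dir` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: mimic the path resolution observed in this article.
# $1 = Neo4j home directory, $2 = value passed as the database name.
resolve_db_dir() {
    # ${1%/} drops one trailing slash so the result has no double slash.
    printf '%s/data/databases/%s\n' "${1%/}" "$2"
}

# Values taken from the copy log in step 8.
resolve_db_dir /Users/um/neo4j/4.0/cc/1 1/backups/copy
```

With these inputs the helper prints /Users/um/neo4j/4.0/cc/1/data/databases/1/backups/copy, the same path as the Target line in the log.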
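10). Finally, compare the total store size before and after compaction.
In this example, :sysinfo on the restored database reports a Total Store Size of 800.00 KiB.
This shows that neo4j-admin copy compacted the store, and that the disk space the ID space had reserved for future IDs has been returned to the operating system.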
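To put the result in numbers, the reduction can be computed from the two :sysinfo readings. The sketch below assumes the before-size is the 18.6 MiB reported in steps 3 and 5 and the after-size is the 800.00 KiB reported for the restored database; `reduction_percent` is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: percentage reduction between two sizes in KiB.
# $1 = size before, $2 = size after.
reduction_percent() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}

# 18.6 MiB expressed in KiB (1 MiB = 1024 KiB), i.e. 19046.4 KiB.
before_kib=$(awk 'BEGIN { printf "%.1f", 18.6 * 1024 }')

reduction_percent "$before_kib" 800
```

Under those assumptions the store shrank by roughly 95.8%.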