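Analyzing a Java heap dump
The goal of this article is to help you analyze a heap dump with Eclipse MAT. It covers how to parse large heap files and what to look for.
When the JVM throws an OutOfMemoryError, a .hprof file is generated if the following parameter is set in neo4j.conf.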
dbms.jvm.additional=-XX:+HeapDumpOnOutOfMemoryError
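You can also adjust the following settings to specify the directory path, but make sure enough disk space is available when such an error occurs.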
dbms.jvm.additional=-XX:HeapDumpPath=/var/tmp/dumps
dbms.jvm.additional=-XX:OnOutOfMemoryError="tar cvzf /var/tmp/dump.tar.gz /var/tmp/dumps;split -b 1G /var/tmp/dump.tar.gz;"
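Since the dump is only useful if it is written completely, it may help to check the free space of the dump directory ahead of time. The following is a minimal sketch; the function name `has_free_space_gb` is hypothetical, and the assumption that a dump can be about as large as the configured maximum heap is a rule of thumb, not something the JVM guarantees.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if DIR has at least NEEDED_GB gigabytes free.
has_free_space_gb() {
  local dir="$1" needed_gb="$2" avail_kb
  # df -Pk prints POSIX-format output in 1024-byte blocks; column 4 is "Available".
  avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
  [ "$avail_kb" -ge $((needed_gb * 1024 * 1024)) ]
}

# Example (assumes a 16G maximum heap and the dump path configured above):
# has_free_space_gb /var/tmp/dumps 16 || echo "not enough space for a heap dump" >&2
```

If the `OnOutOfMemoryError` command also compresses the dump, the archive needs room in addition to the dump itself.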
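This file is an image of the heap section of the Java process running on your system. Its structure depends on the JVM vendor you use to run Neo4j.
Oracle JDK and OpenJDK produce hprof files, which can be analyzed with most common tools. IBM heap dumps need to be parsed with IBM Heap Analyzer or other proprietary tools.
Change the settings in MemoryAnalyzer.ini
On your local environment
You need to allocate roughly as much memory to the process as the size of the heap dump file.
For example: if the heap is about 15GB, allocate 17GB of memory.
For large heap dumps (> 25G), see the next section.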
Edit MemoryAnalyzer.ini (on macOS, it is located in /Applications/mat.app/Contents/Eclipse/MemoryAnalyzer.ini)
Add or change the settings (each option goes on its own line)
-Xms10G
-Xmx25G
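To pick a value for `-Xmx`, it can be handy to read the dump size directly from the file. The sketch below is only illustrative: the function names are hypothetical, and the 2G of headroom is an assumption taken from the 15GB to 17GB example above, not a documented rule.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the size of a file in whole GiB, rounded up.
file_size_gib() {
  local bytes
  bytes=$(wc -c < "$1")
  echo $(( (bytes + 1073741823) / 1073741824 ))
}

# Hypothetical helper: suggest an -Xmx flag as dump size plus 2G of headroom.
# The 2G figure mirrors the 15GB -> 17GB example and is an assumption.
suggest_xmx() {
  echo "-Xmx$(( $(file_size_gib "$1") + 2 ))G"
}

# Example:
# suggest_xmx java_pid19820.hprof
```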
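On a remote machine
It is best to upload the dump to a cloud instance (AWS, GCP, etc.) with plenty of disk and memory. If you choose AWS, use a spot instance.
You then need EBS storage: create a 250GB volume and attach it to the EC2 instance. Format the volume and mount it on the Amazon Linux instance.
Write down the instanceid and storageid to make sure the resources are properly released after use.
If the heap is about 61GB, parsing needs about twice that much additional disk space for the index files, as shown below.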
$ du -ch java_pid19820*
116M java_pid19820.a2s.index
5.6G java_pid19820.domIn.index
17G java_pid19820.domOut.index
61G java_pid19820.hprof #original heap dump
256K java_pid19820.i2sv2.index
11G java_pid19820.idx.index
29G java_pid19820.inbound.index
197M java_pid19820.index
4.5G java_pid19820.o2c.index
12G java_pid19820.o2hprof.index
11G java_pid19820.o2ret.index
29G java_pid19820.outbound.index
988K java_pid19820.threads
68K java_pid19820_Component_Report_sel.zip
180G total
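The listing above works out to roughly three times the dump size in total (61G of dump plus about 119G of index files). As a rough sizing aid, that ratio can be turned into a quick check. This is a sketch only: the factor of 3 is derived from this single example, and both function names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: estimate total disk needed (dump + index files), in GB.
# The factor 3 comes from the single listing above and is an assumption.
required_disk_gb() {
  echo $(( $1 * 3 ))
}

# Hypothetical helper: succeed if a volume of VOLUME_GB can hold the parsed dump.
fits_on_volume() {
  local dump_gb="$1" volume_gb="$2"
  [ "$(required_disk_gb "$dump_gb")" -le "$volume_gb" ]
}

# Example: a 61GB dump on the 250GB volume
# fits_on_volume 61 250 && echo "volume is large enough"
```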
- Prerequisite: install Java and make sure 250GB of disk space is available
- Download the MemoryAnalyzer tool for Linux
- Extract it to a directory
- Edit MemoryAnalyzer.ini to adjust both the -Xms and -Xmx memory settings
-startup
plugins/org.eclipse.equinox.launcher_1.5.0.v20180512-1130.jar
--launcher.library
plugins/org.eclipse.equinox.launcher.gtk.linux.x86_64_1.1.700.v20180518-1200
-vmargs
-Xms30G
-Xmx100G
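On a throwaway cloud instance it may be quicker to script the memory change than to open an editor. The following sketch rewrites the existing `-Xms` and `-Xmx` lines of the file; the function name `set_mat_heap` is hypothetical, and it assumes each option sits on its own line in MemoryAnalyzer.ini.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rewrite the existing -Xms/-Xmx lines of a MemoryAnalyzer.ini.
# Assumes each option is on its own line; fails if either flag is absent.
set_mat_heap() {
  local ini="$1" xms="$2" xmx="$3" tmp rc
  if ! grep -q '^-Xms' "$ini" || ! grep -q '^-Xmx' "$ini"; then
    echo "expected existing -Xms and -Xmx lines in $ini" >&2
    return 1
  fi
  tmp=$(mktemp) || return 1
  # Replace only the two heap flags; every other line passes through unchanged.
  sed -e "s/^-Xms.*/-Xms${xms}/" -e "s/^-Xmx.*/-Xmx${xmx}/" "$ini" > "$tmp" &&
    cat "$tmp" > "$ini"
  rc=$?
  rm -f "$tmp"
  return "$rc"
}

# Example:
# set_mat_heap mat/MemoryAnalyzer.ini 30G 100G
```

Writing through a temporary file instead of `sed -i` keeps the sketch portable between GNU and BSD sed.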
Parse the file on the remote machine
This step is optional if you run Eclipse MAT on your local machine and have enough resources. The index files will be created when opening the heapdump file if they are missing.
Run ./ParseHeapDump.sh heapdump.hprof
The script is located in the mat folder extracted from the Eclipse MAT tar.gz installation file.
Sync the local directory with the remote directory
To speed things up, you can use rsync over ssh. The advantage is that an interrupted transfer can be resumed after a crash, and the -z flag enables compression.
Example
# on the remote machine
$ mkdir ${REMOTE_DIR}/parsed_files
$ mv *.index ${REMOTE_DIR}/parsed_files/
# on your local machine
$ rsync -P -e "ssh -i ${PATH_TO_KEY}" ec2-user@${REMOTE_IP}:${REMOTE_DIR}/heapdump.zip .
$ rsync -Prz -e "ssh -i ${PATH_TO_KEY}" ec2-user@${REMOTE_IP}:${REMOTE_DIR}/parsed_files/ .
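Because `-P` keeps partially transferred files, simply re-running the same rsync command picks up where it stopped. A small retry wrapper can automate that on a flaky connection. This is a sketch under assumptions: the `retry` function and the `RETRY_DELAY` variable are hypothetical names, not part of rsync.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command until it succeeds, at most MAX times.
# RETRY_DELAY (seconds between attempts) is an illustrative variable, default 5.
retry() {
  local max="$1" attempt=1
  shift
  until "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "${RETRY_DELAY:-5}"
  done
}

# Example, using the same placeholders as the commands above:
# retry 5 rsync -Prz -e "ssh -i ${PATH_TO_KEY}" \
#   "ec2-user@${REMOTE_IP}:${REMOTE_DIR}/parsed_files/" .
```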
Open Eclipse MAT
To open the heapdump, go to File > Open Heap Dump (Not Acquire Heap Dump) and browse to your heapdump location.
There is no need to open an existing report; press Cancel if a modal dialog appears.
In the Overview tab, left-click on the largest object(s)
Choose "list objects" > "with outgoing references".
It will open a new tab with the list of all the elements.
Expand the first level then expand everything at the second level.
Cypher query strings
A heap dump contains a lot of objects; there is no need to go through the Object[], byte[], String instances, etc.
You might want to filter for the classes whose names contain PreParsed. Once found, list the outgoing references of the one that has the most instances to cross-check it. A new tab will open and you will be able to see the rawStatement of the Cypher queries.
Check the thread dumps
With thread dumps that were taken before the heap dump
The garbage collector will not be able to collect the thread objects until the threading system also dereferences the object, which won’t happen if the thread is alive.
So if a large amount of heap memory is retained, there is likely a long-running thread associated with your large object.
To find it, look for the thread name in the thread dumps.
$ grep neo4j.BoltWorker-394 *
5913-tdump-201903291746.log:"neo4j.BoltWorker-394 [bolt]" #620 daemon prio=5 os_prio=0 tid=0x00007fb737619800 nid=0x8cec waiting on condition [0x00007fb38d00f000]
5913-tdump-201903291751.log:"neo4j.BoltWorker-394 [bolt] [/www.xxx.yyy.zzz:57570] " #620 daemon prio=5 os_prio=0 tid=0x00007fb737619800 nid=0x8cec runnable [0x00007fb38d00b000]
5913-tdump-201903291756.log:"neo4j.BoltWorker-394 [bolt] [/www.xxx.yyy.zzz:57570] " #620 daemon prio=5 os_prio=0 tid=0x00007fb737619800 nid=0x8cec runnable [0x00007fb38d00b000]
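When there are many thread dump files, it can be easier to reduce the grep output above to one state per file, so that a thread staying runnable across several dumps stands out. The sketch below assumes the HotSpot thread dump header format shown above (`nid=0x... <state> [0x...]`); the function name `thread_states` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "<file>: <state>" for one thread across thread dump files.
# Assumes header lines of the form: "<name> ..." ... nid=0x<hex> <state> [0x<hex>]
thread_states() {
  local name="$1"
  shift
  # -H forces the file name prefix even for a single file; -F matches literally.
  # The pattern is a prefix match, so "Worker-39" would also match "Worker-394".
  grep -H -F "\"$name" "$@" |
    sed -E 's/^([^:]+):.* nid=0x[0-9a-f]+ (.*) \[0x[0-9a-f]+\]$/\1: \2/'
}

# Example:
# thread_states neo4j.BoltWorker-394 *tdump*.log
```

For the three dumps above this would print lines such as `5913-tdump-201903291746.log: waiting on condition` followed by two `runnable` lines.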
Note that the thread dumps are included in the heap dump. They are available in plain text in the .threads file, but Eclipse MAT does not show the STATE information. You can get it with other tools such as VisualVM.
$ head -10 java_pid19820.threads
Thread 0x7fd64b0e1610
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.addConditionWaiter()Ljava/util/concurrent/locks/AbstractQueuedSynchronizer$Node; (AbstractQueuedSynchronizer.java:1855)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(J)J (AbstractQueuedSynchronizer.java:2068)
at java.util.concurrent.LinkedBlockingQueue.poll(JLjava/util/concurrent/TimeUnit;)Ljava/lang/Object; (LinkedBlockingQueue.java:467)
at com.hazelcast.util.executor.CachedExecutorServiceDelegate$Worker.run()V (CachedExecutorServiceDelegate.java:210)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V (ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run()V (ThreadPoolExecutor.java:624)
at java.lang.Thread.run()V (Thread.java:748)
at com.hazelcast.util.executor.HazelcastManagedThread.executeRun()V (HazelcastManagedThread.java:76)
at com.hazelcast.util.executor.HazelcastManagedThread.run()V (HazelcastManagedThread.java:92)