精华
neo4j实现PageRank算法
neo4j实现PageRank算法 该内容源于neo4j算法数据库GDS操作手册1.4版,只是记录下来方便自己理解,反思回顾。
例子:创建一个属性图
CREATE
(home:Page {name:'Home'}),
(about:Page {name:'About'}),
(product:Page {name:'Product'}),
(links:Page {name:'Links'}),
(a:Page {name:'Site A'}),
(b:Page {name:'Site B'}),
(c:Page {name:'Site C'}),
(d:Page {name:'Site D'}),
(home)-[:LINKS {weight: 0.2}]->(about),
(home)-[:LINKS {weight: 0.2}]->(links),
(home)-[:LINKS {weight: 0.6}]->(product),
(about)-[:LINKS {weight: 1.0}]->(home),
(product)-[:LINKS {weight: 1.0}]->(home),
(a)-[:LINKS {weight: 1.0}]->(home),
(b)-[:LINKS {weight: 1.0}]->(home),
(c)-[:LINKS {weight: 1.0}]->(home),
(d)-[:LINKS {weight: 1.0}]->(home),
(links)-[:LINKS {weight: 0.8}]->(home),
(links)-[:LINKS {weight: 0.05}]->(a),
(links)-[:LINKS {weight: 0.05}]->(b),
(links)-[:LINKS {weight: 0.05}]->(c),
(links)-[:LINKS {weight: 0.05}]->(d);
1:属性图如下 在这里插入图片描述 2:给这个属性图起一个名字
CALL gds.graph.create(
'myGraph',
'Page',
'LINKS',
{
relationshipProperties: 'weight'
}
)
3:可以评估一下这个属性图所占内存
CALL gds.pageRank.write.estimate('myGraph', {
writeProperty: 'pageRank',
maxIterations: 20,
dampingFactor: 0.85
})
YIELD nodeCount, relationshipCount, bytesMin, bytesMax, requiredMemory
4:现在可以调用GDS中的PageRank算法
CALL gds.pageRank.stream('myGraph')
YIELD nodeId, score
RETURN gds.util.asNode(nodeId).name AS name, score
ORDER BY score DESC, name ASC
5:还有stats、mutate和write执行模式,此处不在赘述 6:下面执行带权重的算法计算
CALL gds.pageRank.stream('myGraph', {
maxIterations: 20,
dampingFactor: 0.85,
relationshipWeightProperty: 'weight'
})
YIELD nodeId, score
RETURN gds.util.asNode(nodeId).name AS name, score
ORDER BY score DESC, name ASC
7:这里约束条件tolerance设为0.1,低于此值,算法结束,结果返回
CALL gds.pageRank.stream('myGraph', {
maxIterations: 20,
dampingFactor: 0.85,
tolerance: 0.1
})
YIELD nodeId, score
RETURN gds.util.asNode(nodeId).name AS name, score
ORDER BY score DESC, name ASC
8:阻尼系数dampingFactor:也就是概率值改变会引起中心度值不同,这里变为0.05,看起来和默认的0.85差不多,源于图节点比较少,还不足以引起质变
CALL gds.pageRank.stream('myGraph', {
maxIterations: 20,
dampingFactor: 0.05
})
YIELD nodeId, score
RETURN gds.util.asNode(nodeId).name AS name, score
ORDER BY score DESC, name ASC
9:个性化PageRank算法,其实就是制定一个特定的起点,重点关注这个指定点的中心度
MATCH (siteA:Page {name: 'Site A'})
CALL gds.pageRank.stream('myGraph', {
maxIterations: 20,
dampingFactor: 0.85,
sourceNodes: [siteA]
})
YIELD nodeId, score
RETURN gds.util.asNode(nodeId).name AS name, score
ORDER BY score DESC, name ASC
相关链接 这篇文章关于PageRank算法的通俗理解 (https://segmentfault.com/a/1190000015409175)