谁来帮帮我看看这两个语句的性能差别怎么大
发布于 5 年前 作者 sdddd 1902 次浏览 来自 问答

#最慢的版本 match (user:User)-[:WILL_READ]->(book:Book)<-[:HAS_READ]-(user) return book.title, count(user) where 0<=user.id<100

#稍微快一点的版本 match (user:User)-[:WILL_READ]->(book:Book) (book2:Book)<-[:HAS_READ]-(user) return book.title, count(user) where 0<=user.id<100 and book = book2;

百思不得姐呀。

Note:库里的User和Book结点都是百万级别的,所以要分片执行,要是没有id的限制,直接卡死

9 回复

在每句查询前加个EXPLAIN,把结果发上来看看。

语句: profile match (user:User)-[:WILL_READ]->(book:Book), (book2:Book)<-[:HAS_READ]-(user) where user.id>0 and user.id < 20 and book=book2 return book.title, user.id; 用时信息:Cypher version: CYPHER 3.5, planner: COST, runtime: INTERPRETED. 27680 total db hits in 1940 ms. profile: 屏幕快照 2019-02-27 上午9.33.04.png

耗时的版本在browser里压根就没法运行出来。转shell运行: profile match (user:User)-[:WILL_READ]->(book:Book)<-[:HAS_READ]-(user) where user.id>0 and user.id < 20 return book.title, user.id; ±-------------------------------------------------------------------------------------------+ | Plan | Statement | Version | Planner | Runtime | Time | DbHits | Rows | ±-------------------------------------------------------------------------------------------+ | “PROFILE” | “READ_ONLY” | “CYPHER 3.5” | “COST” | “INTERPRETED” | 1624493 | 0 | 18 | ±-------------------------------------------------------------------------------------------+

±----------------------±---------------±-----±--------±----------±----------------------------------------------------±------------±-----------------------------------------------------------+ | Operator | Estimated Rows | Rows | DB Hits | Cache H/M | Identifiers | Ordered by | Other | ±----------------------±---------------±-----±--------±----------±----------------------------------------------------±------------±-----------------------------------------------------------+ | +ProduceResults | 13 | 18 | 0 | 0/0 | book, user.id, book.title, anon[44], anon[18], user | user.id ASC | 0.0 | | | ±---------------±-----±--------±----------±----------------------------------------------------±------------±-----------------------------------------------------------+ | +Projection | 13 | 18 | 18 | 0/0 | book, user.id, book.title, anon[44], anon[18], user | user.id ASC | 0.0; {book.title : book.title, user.id : cached[user.id]} | | | ±---------------±-----±--------±----------±----------------------------------------------------±------------±-----------------------------------------------------------+ | +Expand(Into) | 13 | 18 | 712393 | 0/0 | cached[user.id], book, anon[44], anon[18], user | user.id ASC | 0.0; (book)<-[anon[44]:HAS_READ]-(user) | | | ±---------------±-----±--------±----------±----------------------------------------------------±------------±-----------------------------------------------------------+ | +Filter | 1051883 | 3852 | 3852 | 0/0 | anon[18], book, cached[user.id], user | user.id ASC | 0.0; book:Book | | | ±---------------±-----±--------±----------±----------------------------------------------------±------------±-----------------------------------------------------------+ | +Expand(All) | 1051883 | 3852 | 3871 | 0/0 | anon[18], book, cached[user.id], user | user.id ASC | 0.0; (user)-[anon[18]:WILL_READ]->(book) | | | ±---------------±-----±--------±----------±----------------------------------------------------±------------±-----------------------------------------------------------+ | +NodeIndexSeekByRange | 37210 | 19 | 20 | 0/0 | cached[user.id], user | user.id ASC | 0.0; :User(id) > { AUTOINT0} AND :User(id) < { AUTOINT1} | ±----------------------±---------------±-----±--------±----------±----------------------------------------------------±------------±-----------------------------------------------------------+

±------------------------------+ | book.title | user.id | ±------------------------------+ 略去结果…… ±------------------------------+

18 rows available after 1 ms, consumed after another 1624492 ms

麻烦大佬们帮忙看看。谢谢了!

explain match (user:User)-[:WILL_READ]->(book:Book), (book2:Book)<-[:HAS_READ]-(user) where user.id>0 and user.id < 20 and book=book2 return book.title, user.id; 屏幕快照 2019-02-27 上午10.31.15.png

explain match (user:User)-[:WILL_READ]->(book:Book)<-[:HAS_READ]-(user) where user.id>0 and user.id < 20 return book.title, user.id; 屏幕快照 2019-02-27 上午10.32.01.png

最让人费解的是,貌似长时间的版本更简单啊,但运行结果偏偏是时间多得离谱。

你是要找同一个User,既有WILL_READ也有HAS_READ到同一本书的关系?

match (user:User)-[:HAS_READ]->(book:Book) where (user)-[:WILL_READ]->(book) and 0<=user.id<100 return book.title, count(user)

这样试试呢?感觉之前的cypher没有很好的吧之间的变量继承下来

@graphway 是的,是找一个用户,他之前说过要读什么书,后来也的确读完了那些书的人。 neo4j的性能貌似在数据量大一丢丢的时候就撑不住了。处于放弃它的边缘。

@zhoujieren64 谢谢大佬。试过了,没什么变化,依然非常非常慢。

@sdddd 有时候性能和你怎么写查询关系很大。试试下面的(我没测试过,可能有语法错误):

explain match (user:User)-[:WILL_READ]->(book:Book) where user.id>0 and user.id < 20 WITH DISTINCT user, book WHERE size ((user) -[:HAS_READ]-> (book)) > 0 return book.title, user.id;

@sdddd 8楼老师的cypher效率应该要比我上面那个要好很多

回到顶部