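GenAI Assistant Procedures

Query with natural language

The procedure apoc.ml.query takes a question in natural language and returns the results of the Cypher query generated for it.

It uses the OpenAI chat/completions API.

Query call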

CALL apoc.ml.query("What movies did Tom Hanks play in?") yield value, query
RETURN *

Example response

+------------------------------------------------------------------------------------------------------------------------------+
| value                                 | query                                                                                |
+------------------------------------------------------------------------------------------------------------------------------+
| {m.title -> "You've Got Mail"}        | "cypher
MATCH (m:Movie)<-[:ACTED_IN]-(p:Person {name: 'Tom Hanks'})
RETURN m.title
" |
| {m.title -> "Apollo 13"}              | "cypher
MATCH (m:Movie)<-[:ACTED_IN]-(p:Person {name: 'Tom Hanks'})
RETURN m.title
" |
| {m.title -> "Joe Versus the Volcano"} | "cypher
MATCH (m:Movie)<-[:ACTED_IN]-(p:Person {name: 'Tom Hanks'})
RETURN m.title
" |
| {m.title -> "That Thing You Do"}      | "cypher
MATCH (m:Movie)<-[:ACTED_IN]-(p:Person {name: 'Tom Hanks'})
RETURN m.title
" |
| {m.title -> "Cloud Atlas"}            | "cypher
MATCH (m:Movie)<-[:ACTED_IN]-(p:Person {name: 'Tom Hanks'})
RETURN m.title
" |
| {m.title -> "The Da Vinci Code"}      | "cypher
MATCH (m:Movie)<-[:ACTED_IN]-(p:Person {name: 'Tom Hanks'})
RETURN m.title
" |
| {m.title -> "Sleepless in Seattle"}   | "cypher
MATCH (m:Movie)<-[:ACTED_IN]-(p:Person {name: 'Tom Hanks'})
RETURN m.title
" |
| {m.title -> "A League of Their Own"}  | "cypher
MATCH (m:Movie)<-[:ACTED_IN]-(p:Person {name: 'Tom Hanks'})
RETURN m.title
" |
| {m.title -> "The Green Mile"}         | "cypher
MATCH (m:Movie)<-[:ACTED_IN]-(p:Person {name: 'Tom Hanks'})
RETURN m.title
" |
| {m.title -> "Charlie Wilson's War"}   | "cypher
MATCH (m:Movie)<-[:ACTED_IN]-(p:Person {name: 'Tom Hanks'})
RETURN m.title
" |
| {m.title -> "Cast Away"}              | "cypher
MATCH (m:Movie)<-[:ACTED_IN]-(p:Person {name: 'Tom Hanks'})
RETURN m.title
" |
| {m.title -> "The Polar Express"}      | "cypher
MATCH (m:Movie)<-[:ACTED_IN]-(p:Person {name: 'Tom Hanks'})
RETURN m.title
" |
+------------------------------------------------------------------------------------------------------------------------------+
12 rows

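Table 1. Input parameters
name	description
question	The question in natural language
conf	An optional configuration map, see the next table

Table 2. Configuration map
name	description	mandatory
retries	The number of retries in case of API call failures	no, default `3`
retryWithError	If true, when an error occurs the API call is retried with the following messages added to the request body: {"role":"user", "content": "The previous Cypher Statement throws the following error, consider it to return the correct statement: `<errorMessage>`"}, {"role":"assistant", "content":"Cypher Statement (in backticks):"}	no, default `false`
apiKey	The OpenAI API key	only if `apoc.openai.key` is not defined
model	The OpenAI model	no, default `gpt-4o`
sample	The number of nodes to skip, e.g. a sample of 1000 will read every 1000th node. It is passed as a parameter to the `apoc.meta.data` procedure that computes the schema	no, default is a random number

Table 3. Results
name	description
value	The result of the query
query	The query used to compute the result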
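
As a minimal sketch, the options from the configuration map above can be combined in a single call. It assumes that `$apiKey` is a query parameter holding a valid OpenAI API key; the values chosen here (5 retries, a sample of 1000) are illustrative only, not recommendations.

```cypher
// Assumption: $apiKey is a query parameter holding a valid OpenAI API key.
CALL apoc.ml.query("What movies did Tom Hanks play in?", {
  retries: 5,            // number of retries if the API call fails (default 3)
  retryWithError: true,  // add the previous error message to the request body on retry
  apiKey: $apiKey,       // only needed if apoc.openai.key is not defined
  model: "gpt-4o",
  sample: 1000           // passed to apoc.meta.data when the schema is computed
})
YIELD value, query
RETURN value, query
```

Each returned row pairs one result (`value`) with the generated statement (`query`), so the statement is repeated on every row.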

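We can use the additionalPrompts configuration to improve the request, for example by adding a natural language description of the schema (such as the output of apoc.ml.schema). Because OpenAI models are trained primarily on natural language questions rather than Cypher queries, this configuration can give better results. For example, given the Northwind dataset, we can execute: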

Query call

CALL apoc.ml.schema({apiKey: $apiKey}) YIELD value
WITH value
CALL apoc.ml.query("Which 5 employees had sold the product 'Chocolade' and has the highest selling count of another product?
  Please returns the employee identificator, the other product name and the count orders of another product",
{
    retries: 8,
    retryWithError: true,
    apiKey: $apiKey,
    additionalPrompts: [
        {role: "system", content: "The human description of the schema is the following:\n" + value}
    ]
})
YIELD query, value RETURN query, value

The result is similar to the following.

The result is not deterministic and may change each time the query is re-executed.

Table 4. Results
query	value
"cypher MATCH (p:Product {productName: 'Chocolade'})<-[:CONTAINS]-(:Order)<-[:SOLD]-(e:Employee) MATCH (e)-[:SOLD]->(o:Order)-[:CONTAINS]->(p2:Product) WITH e, p2, COUNT(DISTINCT o) AS orderCount ORDER BY orderCount DESC RETURN e.employeeID AS employeeID, p2.productName AS otherProduct, orderCount LIMIT 5 "	{ "otherProduct": "Gnocchi di nonna Alice", "employeeID": "4", "orderCount": 14 }
"cypher MATCH (p:Product {productName: 'Chocolade'})<-[:CONTAINS]-(:Order)<-[:SOLD]-(e:Employee) MATCH (e)-[:SOLD]->(o:Order)-[:CONTAINS]->(p2:Product) WITH e, p2, COUNT(DISTINCT o) AS orderCount ORDER BY orderCount DESC RETURN e.employeeID AS employeeID, p2.productName AS otherProduct, orderCount LIMIT 5 "	{ "otherProduct": "Pâté chinois", "employeeID": "4", "orderCount": 12 }
"cypher MATCH (p:Product {productName: 'Chocolade'})<-[:CONTAINS]-(:Order)<-[:SOLD]-(e:Employee) MATCH (e)-[:SOLD]->(o:Order)-[:CONTAINS]->(p2:Product) WITH e, p2, COUNT(DISTINCT o) AS orderCount ORDER BY orderCount DESC RETURN e.employeeID AS employeeID, p2.productName AS otherProduct, orderCount LIMIT 5 "	{ "otherProduct": "Gumbär Gummibärchen", "employeeID": "3", "orderCount": 12 }
"cypher MATCH (p:Product {productName: 'Chocolade'})<-[:CONTAINS]-(:Order)<-[:SOLD]-(e:Employee) MATCH (e)-[:SOLD]->(o:Order)-[:CONTAINS]->(p2:Product) WITH e, p2, COUNT(DISTINCT o) AS orderCount ORDER BY orderCount DESC RETURN e.employeeID AS employeeID, p2.productName AS otherProduct, orderCount LIMIT 5 "	{ "otherProduct": "Flotemysost", "employeeID": "1", "orderCount": 12 }
"cypher MATCH (p:Product {productName: 'Chocolade'})<-[:CONTAINS]-(:Order)<-[:SOLD]-(e:Employee) MATCH (e)-[:SOLD]->(o:Order)-[:CONTAINS]->(p2:Product) WITH e, p2, COUNT(DISTINCT o) AS orderCount ORDER BY orderCount DESC RETURN e.employeeID AS employeeID, p2.productName AS otherProduct, orderCount LIMIT 5 "	{ "otherProduct": "Pavlova", "employeeID": "1", "orderCount": 11 }

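Compared with calling the procedure without the natural language schema description, the output contains fewer hallucinations, for example about which properties each label holds and which relationships link to other entities.

Describe the graph model with natural language

The procedure apoc.ml.schema returns a natural language description of the underlying dataset.

It uses the OpenAI chat/completions API.

Query call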

CALL apoc.ml.schema() yield value
RETURN *

Example response

+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| value                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| "The graph database schema represents a system where users can follow other users and review movies. Users (:Person) can either follow other users (:Person) or review movies (:Movie). The relationships allow users to express their preferences and opinions about movies. This schema can be compared to social media platforms where users can follow each other and leave reviews or ratings for movies they have watched. It can also be related to movie recommendation systems where user preferences and reviews play a crucial role in generating personalized recommendations." |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row

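Table 5. Input parameters
name	description
conf	An optional configuration map, see the next table

Table 6. Configuration map
name	description	mandatory
apiKey	The OpenAI API key	only if `apoc.openai.key` is not defined
model	The OpenAI model	no, default `gpt-4o`
sample	The number of nodes to skip, e.g. a sample of 1000 will read every 1000th node. It is passed as a parameter to the `apoc.meta.data` procedure that computes the schema	no, default is a random number

Table 7. Results
name	description
value	The description of the dataset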
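
The configuration map above can be passed as in the following sketch. It assumes that `$apiKey` is a query parameter holding a valid OpenAI API key; the sample value of 1000 and the `schemaDescription` alias are illustrative only.

```cypher
// Assumption: $apiKey is a query parameter holding a valid OpenAI API key.
CALL apoc.ml.schema({
  apiKey: $apiKey,  // only needed if apoc.openai.key is not defined
  model: "gpt-4o",
  sample: 1000      // read every 1000th node when the schema is computed
})
YIELD value
RETURN value AS schemaDescription
```

On a large graph, an explicit `sample` may be preferable to the random default, since it controls how many nodes are read to compute the schema.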

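Create Cypher queries from a natural language query

The procedure apoc.ml.cypher takes a question in natural language and transforms it into the requested number of Cypher queries.

It uses the OpenAI chat/completions API.

Query call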

CALL apoc.ml.cypher("Who are the actors which also directed a movie?", {count: 4}) YIELD value
RETURN *

Example response

+----------------------------------------------------------------------------------------------------------------+
| value                                                                                                          |
+----------------------------------------------------------------------------------------------------------------+
| "
MATCH (a:Person)-[:ACTED_IN]->(m:Movie)<-[:DIRECTED]-(d:Person)
RETURN a.name as actor, d.name as director
" |
| "cypher
MATCH (a:Person)-[:ACTED_IN]->(m:Movie)<-[:DIRECTED]-(a)
RETURN a.name
"                               |
| "
MATCH (a:Person)-[:ACTED_IN]->(m:Movie)<-[:DIRECTED]-(d:Person)
RETURN a.name
"                              |
| "cypher
MATCH (a:Person)-[:ACTED_IN]->(:Movie)<-[:DIRECTED]-(a)
RETURN DISTINCT a.name
"                       |
+----------------------------------------------------------------------------------------------------------------+
4 rows

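Table 8. Input parameters
name	description	mandatory
question	The question in natural language	yes
conf	An optional configuration map, see the next table	no

Table 9. Configuration map
name	description	mandatory
count	The number of queries to retrieve	no, default `1`
apiKey	The OpenAI API key	only if `apoc.openai.key` is not defined
model	The OpenAI model	no, default `gpt-4o`
sample	The number of nodes to skip, e.g. a sample of 1000 will read every 1000th node. It is passed as a parameter to the `apoc.meta.data` procedure that computes the schema	no, default is a random number

Table 10. Results
name	description
value	The generated Cypher query

Query call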

CALL apoc.ml.schema({apiKey: $apiKey}) YIELD value
WITH value
CALL apoc.ml.cypher("Which 5 employees had sold the product 'Chocolade' and has the highest selling count of another product?
  Please returns the employee identificator, the other product name and the count orders of another product",
{
  count: 1,
  apiKey: $apiKey,
  additionalPrompts: [
    {role: "system", content: "The human description of the schema is the following:\n" + value}
  ]
})
YIELD value RETURN value

The result is similar to the following.

The result is not deterministic and may change each time the query is re-executed.

Table 11. Results
value
MATCH (p:Product {productName: 'Chocolade'})<-[:CONTAINS]-(o:Order)<-[:SOLD]-(e:Employee) MATCH (e)-[:SOLD]->(o2:Order)-[:CONTAINS]->(p2:Product) WITH e, p2, COUNT(DISTINCT o2) AS ordersCnt ORDER BY ordersCnt DESC RETURN e.employeeID AS employeeID, p2.productName AS otherProduct, ordersCnt LIMIT 5

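Compared with calling the procedure without the natural language schema description, the output contains fewer hallucinations, for example about which properties each label holds and which relationships link to other entities.

Create a natural language query explanation from a Cypher query

The procedure apoc.ml.fromCypher takes a Cypher query and returns an explanation of it in natural language.

It uses the OpenAI chat/completions API.

Query call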

CALL apoc.ml.fromCypher('MATCH (p:Person {name: "Tom Hanks"})-[:ACTED_IN]->(m:Movie) RETURN m', {}) YIELD value
RETURN *

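Table 12. Example response
value
This database schema represents a simplified version of a common movie database model. The `movie` node represents a movie entity, with properties such as release year, tagline, and movie title. The `person` node represents people involved in the film industry, with birth year and name properties. The `directed` relationship connects a `person` node to a `movie` node and indicates that the person directed the movie. In terms of domain, this schema relates to the entertainment industry, and the film industry in particular. Movies and the people who take part in making them are the fundamental entities of this domain. The `directed` relationship captures the directing relationship between a person and a movie. This model could be extended with other relationships, such as `acted_in`, `produced`, and `wrote`, to capture more complex connections within the film industry. Overall, this graph database schema provides a simple yet powerful representation of the entities and relationships in the movie domain, allowing connections within the industry to be queried and analyzed.

Table 13. Input parameters
name	description	mandatory
cypher	The Cypher query to explain	yes
conf	An optional configuration map, see the next table	no

Table 14. Configuration map
name	description	mandatory
retries	The number of retries in case of API call failures	no, default `3`
apiKey	The OpenAI API key	only if `apoc.openai.key` is not defined
model	The OpenAI model	no, default `gpt-4o`
sample	The number of nodes to skip, e.g. a sample of 1000 will read every 1000th node. It is passed as a parameter to the `apoc.meta.data` procedure that computes the schema	no, default is a random number

Table 15. Results
name	description
value	The natural language explanation of the query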
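
The following sketch passes an explicit configuration map to apoc.ml.fromCypher. It assumes that `$apiKey` is a query parameter holding a valid OpenAI API key; the retry count and the `explanation` alias are illustrative only. The Cypher statement is wrapped in single quotes so that the double-quoted string literal inside it does not end the argument early.

```cypher
// Assumption: $apiKey is a query parameter holding a valid OpenAI API key.
CALL apoc.ml.fromCypher(
  'MATCH (p:Person {name: "Tom Hanks"})-[:ACTED_IN]->(m:Movie) RETURN m',
  {
    retries: 5,       // number of retries if the API call fails (default 3)
    apiKey: $apiKey,  // only needed if apoc.openai.key is not defined
    model: "gpt-4o"
  }
)
YIELD value
RETURN value AS explanation
```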

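Create an explanation of a subgraph from a set of queries

The procedure apoc.ml.fromQueries returns a natural language explanation of the given set of queries.

It uses the OpenAI chat/completions API.

Query call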

CALL apoc.ml.fromQueries(['MATCH (n:Movie) RETURN n', 'MATCH (n:Person) RETURN n'],
    {apiKey: $apiKey})
YIELD value
RETURN *

Example response

+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| value                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| "The database represents movies and people, like in a movie database or social network.
    There are no defined relationships between nodes, allowing flexibility for future connections.
    The Movie node includes properties like title, tagline, and release year." |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row

Query call with path

CALL apoc.ml.fromQueries(['MATCH (n:Movie) RETURN n', 'MATCH p=(n:Movie)--() RETURN p'],
    {apiKey: $apiKey})
YIELD value
RETURN *

Example response

+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| value                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| "models relationships in the movie industry, connecting :Person nodes to :Movie nodes.
    It represents actors, directors, writers, producers, and reviewers connected to movies they are involved with.
    Similar to a social network graph but specialized for the entertainment industry.
    Each relationship type corresponds to common roles in movie production and reviewing.
    Allows for querying and analyzing connections and collaborations within the movie business." |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row

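Table 16. Input parameters
name	description
queries	The list of queries
conf	An optional configuration map, see the next table

Table 17. Configuration map
name	description	mandatory
apiKey	The OpenAI API key	only if `apoc.openai.key` is not defined
model	The OpenAI model	no, default `gpt-4o`
sample	The number of nodes to skip, e.g. a sample of 1000 will read every 1000th node. It is passed as a parameter to the `apoc.meta.data` procedure that computes the schema	no, default is a random number

Table 18. Results
name	description
value	The natural language explanation of the given queries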
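
As a minimal sketch, the remaining options of the configuration map can be set alongside the API key. It assumes that `$apiKey` is a query parameter holding a valid OpenAI API key; the sample value of 1000 and the `explanation` alias are illustrative only.

```cypher
// Assumption: $apiKey is a query parameter holding a valid OpenAI API key.
CALL apoc.ml.fromQueries(
  ['MATCH (n:Movie) RETURN n', 'MATCH p=(n:Movie)--() RETURN p'],
  {
    apiKey: $apiKey,  // only needed if apoc.openai.key is not defined
    model: "gpt-4o",
    sample: 1000      // passed to apoc.meta.data when the schema is computed
  }
)
YIELD value
RETURN value AS explanation
```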