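Generative AI Assistant Procedures
Query with natural language
The apoc.ml.query procedure takes a question in natural language and returns the results of the query generated for it.
It uses the OpenAI chat/completions API.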
CALL apoc.ml.query("What movies did Tom Hanks play in?") yield value, query
RETURN *
+------------------------------------------------------------------------------------------------------------------------------+
| value | query |
+------------------------------------------------------------------------------------------------------------------------------+
| {m.title -> "You've Got Mail"} | "cypher
MATCH (m:Movie)<-[:ACTED_IN]-(p:Person {name: 'Tom Hanks'})
RETURN m.title
" |
| {m.title -> "Apollo 13"} | "cypher
MATCH (m:Movie)<-[:ACTED_IN]-(p:Person {name: 'Tom Hanks'})
RETURN m.title
" |
| {m.title -> "Joe Versus the Volcano"} | "cypher
MATCH (m:Movie)<-[:ACTED_IN]-(p:Person {name: 'Tom Hanks'})
RETURN m.title
" |
| {m.title -> "That Thing You Do"} | "cypher
MATCH (m:Movie)<-[:ACTED_IN]-(p:Person {name: 'Tom Hanks'})
RETURN m.title
" |
| {m.title -> "Cloud Atlas"} | "cypher
MATCH (m:Movie)<-[:ACTED_IN]-(p:Person {name: 'Tom Hanks'})
RETURN m.title
" |
| {m.title -> "The Da Vinci Code"} | "cypher
MATCH (m:Movie)<-[:ACTED_IN]-(p:Person {name: 'Tom Hanks'})
RETURN m.title
" |
| {m.title -> "Sleepless in Seattle"} | "cypher
MATCH (m:Movie)<-[:ACTED_IN]-(p:Person {name: 'Tom Hanks'})
RETURN m.title
" |
| {m.title -> "A League of Their Own"} | "cypher
MATCH (m:Movie)<-[:ACTED_IN]-(p:Person {name: 'Tom Hanks'})
RETURN m.title
" |
| {m.title -> "The Green Mile"} | "cypher
MATCH (m:Movie)<-[:ACTED_IN]-(p:Person {name: 'Tom Hanks'})
RETURN m.title
" |
| {m.title -> "Charlie Wilson's War"} | "cypher
MATCH (m:Movie)<-[:ACTED_IN]-(p:Person {name: 'Tom Hanks'})
RETURN m.title
" |
| {m.title -> "Cast Away"} | "cypher
MATCH (m:Movie)<-[:ACTED_IN]-(p:Person {name: 'Tom Hanks'})
RETURN m.title
" |
| {m.title -> "The Polar Express"} | "cypher
MATCH (m:Movie)<-[:ACTED_IN]-(p:Person {name: 'Tom Hanks'})
RETURN m.title
" |
+------------------------------------------------------------------------------------------------------------------------------+
12 rows
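| name | description |
|---|---|
| question | The question in natural language |
| conf | An optional configuration map; see the next section |
| name | description | mandatory |
|---|---|---|
| retries | Number of retries if the API call fails | no (has a default) |
| retryWithError | If true, when an error occurs the API call is retried with the following message added to the request body: { | no (has a default) |
| apiKey | OpenAI API key | only if not otherwise defined |
| model | The OpenAI model | no (has a default) |
| sample | The number of nodes to skip; for example, a sample of 1000 reads every 1000th node. It is passed as a parameter when computing the schema | no, default is a random number |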
| name | description |
|---|---|
| value | The result of the query |
| query | The query used to compute the result |
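As a minimal sketch of how the configuration options above might be combined, the call below passes `retries`, `retryWithError` and `sample` explicitly and groups the returned rows by the generated query. The `$apiKey` parameter and the numeric values are illustrative assumptions, not recommended settings.

```cypher
// Assumes $apiKey is supplied as a query parameter; the numeric values are arbitrary.
CALL apoc.ml.query("What movies did Tom Hanks play in?", {
  apiKey: $apiKey,
  retries: 2,            // number of retries if the API call fails
  retryWithError: true,  // on error, retry with the error added to the request
  sample: 1000           // read every 1000th node when computing the schema
})
YIELD value, query
// Collect the result rows under the generated query that produced them
RETURN query, collect(value) AS rows
```

Because every row carries the same generated statement, grouping on `query` keeps the output compact and makes it easy to see which Cypher was actually run.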
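We can use the additionalPrompts configuration to improve the request, for example by adding a natural language description of the schema (such as the output of apoc.ml.schema). Since OpenAI is trained mainly on natural language questions rather than Cypher queries, this configuration can produce better results. For example, given the Northwind dataset, we can execute: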
CALL apoc.ml.schema({apiKey: $apiKey}) YIELD value AS schemaDescription
WITH schemaDescription
CALL apoc.ml.query("Which 5 employees had sold the product 'Chocolade' and has the highest selling count of another product?
Please returns the employee identificator, the other product name and the count orders of another product",
{
  retries: 8,
  retryWithError: true,
  apiKey: $apiKey,
  additionalPrompts: [
    {role: "system", content: "The human description of the schema is the following:\n" + schemaDescription}
  ]
})
YIELD query, value RETURN query, value
The result is similar to the following.
Note: the results are not deterministic and may change each time the query is re-executed.
| query | value |
|---|---|
| "cypher MATCH (p:Product {productName: 'Chocolade'})<-[:CONTAINS]-(:Order)<-[:SOLD]-(e:Employee) MATCH (e)-[:SOLD]->(o:Order)-[:CONTAINS]->(p2:Product) WITH e, p2, COUNT(DISTINCT o) AS orderCount ORDER BY orderCount DESC RETURN e.employeeID AS employeeID, p2.productName AS otherProduct, orderCount LIMIT 5 " | { "otherProduct": "Gnocchi di nonna Alice", "employeeID": "4", "orderCount": 14 } |
| "cypher MATCH (p:Product {productName: 'Chocolade'})<-[:CONTAINS]-(:Order)<-[:SOLD]-(e:Employee) MATCH (e)-[:SOLD]->(o:Order)-[:CONTAINS]->(p2:Product) WITH e, p2, COUNT(DISTINCT o) AS orderCount ORDER BY orderCount DESC RETURN e.employeeID AS employeeID, p2.productName AS otherProduct, orderCount LIMIT 5 " | { "otherProduct": "Pâté chinois", "employeeID": "4", "orderCount": 12 } |
| "cypher MATCH (p:Product {productName: 'Chocolade'})<-[:CONTAINS]-(:Order)<-[:SOLD]-(e:Employee) MATCH (e)-[:SOLD]->(o:Order)-[:CONTAINS]->(p2:Product) WITH e, p2, COUNT(DISTINCT o) AS orderCount ORDER BY orderCount DESC RETURN e.employeeID AS employeeID, p2.productName AS otherProduct, orderCount LIMIT 5 " | { "otherProduct": "Gumbär Gummibärchen", "employeeID": "3", "orderCount": 12 } |
| "cypher MATCH (p:Product {productName: 'Chocolade'})<-[:CONTAINS]-(:Order)<-[:SOLD]-(e:Employee) MATCH (e)-[:SOLD]->(o:Order)-[:CONTAINS]->(p2:Product) WITH e, p2, COUNT(DISTINCT o) AS orderCount ORDER BY orderCount DESC RETURN e.employeeID AS employeeID, p2.productName AS otherProduct, orderCount LIMIT 5 " | { "otherProduct": "Flotemysost", "employeeID": "1", "orderCount": 12 } |
| "cypher MATCH (p:Product {productName: 'Chocolade'})<-[:CONTAINS]-(:Order)<-[:SOLD]-(e:Employee) MATCH (e)-[:SOLD]->(o:Order)-[:CONTAINS]->(p2:Product) WITH e, p2, COUNT(DISTINCT o) AS orderCount ORDER BY orderCount DESC RETURN e.employeeID AS employeeID, p2.productName AS otherProduct, orderCount LIMIT 5 " | { "otherProduct": "Pavlova", "employeeID": "1", "orderCount": 11 } |
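Compared with calling the procedure without the natural language description of the schema, the output contains fewer hallucinations, such as properties attributed to the wrong labels or relationships linked to the wrong entities.
Describe the graph model with natural language
The apoc.ml.schema procedure returns a natural language description of the underlying dataset.
It uses the OpenAI chat/completions API.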
CALL apoc.ml.schema() yield value
RETURN *
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| value |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| "The graph database schema represents a system where users can follow other users and review movies. Users (:Person) can either follow other users (:Person) or review movies (:Movie). The relationships allow users to express their preferences and opinions about movies. This schema can be compared to social media platforms where users can follow each other and leave reviews or ratings for movies they have watched. It can also be related to movie recommendation systems where user preferences and reviews play a crucial role in generating personalized recommendations." |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row
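| name | description |
|---|---|
| conf | An optional configuration map; see the next section |
| name | description | mandatory |
|---|---|---|
| apiKey | OpenAI API key | only if not otherwise defined |
| model | The OpenAI model | no (has a default) |
| sample | The number of nodes to skip; for example, a sample of 1000 reads every 1000th node. It is passed as a parameter when computing the schema | no, default is a random number |
| name | description |
|---|---|
| value | The description of the dataset |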
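The following sketch shows one way the `sample` and `model` options might be set when describing a large graph. The `$apiKey` and `$model` query parameters are assumptions introduced for illustration, and the sample size of 1000 simply mirrors the example in the table above.

```cypher
// Assumes $apiKey and $model are supplied as query parameters (hypothetical names).
CALL apoc.ml.schema({
  apiKey: $apiKey,
  model: $model,   // any OpenAI model name accepted by the chat/completions API
  sample: 1000     // read every 1000th node when computing the schema
})
YIELD value
RETURN value AS schemaDescription
```

Fixing `sample` instead of relying on the random default should make the schema computation more repeatable, although the generated description itself can still vary between runs.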
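Create Cypher queries from a natural language query
The apoc.ml.cypher procedure takes a question in natural language and transforms it into the requested number of Cypher queries.
It uses the OpenAI chat/completions API.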
CALL apoc.ml.cypher("Who are the actors which also directed a movie?", {count: 4}) yield value
RETURN *
+----------------------------------------------------------------------------------------------------------------+
| value |
+----------------------------------------------------------------------------------------------------------------+
| "
MATCH (a:Person)-[:ACTED_IN]->(m:Movie)<-[:DIRECTED]-(d:Person)
RETURN a.name as actor, d.name as director
" |
| "cypher
MATCH (a:Person)-[:ACTED_IN]->(m:Movie)<-[:DIRECTED]-(a)
RETURN a.name
" |
| "
MATCH (a:Person)-[:ACTED_IN]->(m:Movie)<-[:DIRECTED]-(d:Person)
RETURN a.name
" |
| "cypher
MATCH (a:Person)-[:ACTED_IN]->(:Movie)<-[:DIRECTED]-(a)
RETURN DISTINCT a.name
" |
+----------------------------------------------------------------------------------------------------------------+
4 rows
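| name | description | mandatory |
|---|---|---|
| question | The question in natural language | yes |
| conf | An optional configuration map; see the next section | no |
| name | description | mandatory |
|---|---|---|
| count | The number of queries to retrieve | no (has a default) |
| apiKey | OpenAI API key | only if not otherwise defined |
| model | The OpenAI model | no (has a default) |
| sample | The number of nodes to skip; for example, a sample of 1000 reads every 1000th node. It is passed as a parameter when computing the schema | no, default is a random number |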
| name | description |
|---|---|
| value | The generated Cypher query |
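Since several candidates can be requested with `count` and the model may propose the same statement more than once, a small sketch like the one below can collapse duplicates. It follows the `YIELD value` form used in the Northwind example in this section; the `$apiKey` parameter is an assumption.

```cypher
// Assumes $apiKey is supplied as a query parameter.
CALL apoc.ml.cypher("Who are the actors which also directed a movie?", {count: 4, apiKey: $apiKey})
YIELD value
// Collapse identical candidates and count how often each one was proposed
RETURN value AS candidate, count(*) AS timesProposed
ORDER BY timesProposed DESC
```

A statement proposed several times is not guaranteed to be correct, so the ranking is only a rough hint about which candidate to review first.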
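We can use the additionalPrompts configuration to improve the request, for example by adding a natural language description of the schema (such as the output of apoc.ml.schema). Since OpenAI is trained mainly on natural language questions rather than Cypher queries, this configuration can produce better results. For example, given the Northwind dataset, we can execute: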
CALL apoc.ml.schema({apiKey: $apiKey}) YIELD value AS schemaDescription
WITH schemaDescription
CALL apoc.ml.cypher("Which 5 employees had sold the product 'Chocolade' and has the highest selling count of another product?
Please returns the employee identificator, the other product name and the count orders of another product",
{
  count: 1,
  apiKey: $apiKey,
  additionalPrompts: [
    {role: "system", content: "The human description of the schema is the following:\n" + schemaDescription}
  ]
})
YIELD value RETURN value
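The result is similar to the following.
Note: the results are not deterministic and may change each time the query is re-executed.
| value |
|---|
| MATCH (p:Product {productName: 'Chocolade'})<-[:CONTAINS]-(o:Order)<-[:SOLD]-(e:Employee) MATCH (e)-[:SOLD]->(o2:Order)-[:CONTAINS]->(p2:Product) WITH e, p2, COUNT(DISTINCT o2) AS ordersCnt ORDER BY ordersCnt DESC RETURN e.employeeID AS employeeID, p2.productName AS otherProduct, ordersCnt LIMIT 5 |
Compared with calling the procedure without the natural language description of the schema, the output contains fewer hallucinations, such as properties attributed to the wrong labels or relationships linked to the wrong entities.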
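Create a natural language query explanation from a Cypher query
The apoc.ml.fromCypher procedure takes a Cypher query and transforms it into a natural language explanation of that query.
It uses the OpenAI chat/completions API.
CALL apoc.ml.fromCypher("MATCH (p:Person {name: 'Tom Hanks'})-[:ACTED_IN]->(m:Movie) RETURN m", {}) yield value
RETURN *
| value |
|---|
| This database schema represents a simplified version of a common movie database model. |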
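| name | description | mandatory |
|---|---|---|
| cypher | The Cypher query to explain | yes |
| conf | An optional configuration map; see the next section | no |
| name | description | mandatory |
|---|---|---|
| retries | Number of retries if the API call fails | no (has a default) |
| apiKey | OpenAI API key | only if not otherwise defined |
| model | The OpenAI model | no (has a default) |
| sample | The number of nodes to skip; for example, a sample of 1000 reads every 1000th node. It is passed as a parameter when computing the schema | no, default is a random number |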
| name | description |
|---|---|
| value | The natural language explanation of the query |
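To illustrate how the two directions can be combined, the sketch below generates one Cypher statement from a question and then asks for an explanation of it. This chaining is an assumption rather than a documented workflow, and `$apiKey` is a hypothetical query parameter. As the example outputs earlier in this page show, the generated string may carry a leading `cypher` tag, which is passed along unchanged here.

```cypher
// Assumes $apiKey is supplied as a query parameter.
CALL apoc.ml.cypher("Who are the actors which also directed a movie?", {count: 1, apiKey: $apiKey})
YIELD value AS generated
WITH generated
// Ask for a natural language explanation of the statement that was just generated
CALL apoc.ml.fromCypher(generated, {apiKey: $apiKey, retries: 2})
YIELD value AS explanation
RETURN generated, explanation
```

Reading the explanation next to the generated statement is a cheap way to check whether the model understood the question before running the query.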
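Create an explanation of the subgraph from a set of queries
The apoc.ml.fromQueries procedure returns a natural language explanation of the given set of queries.
It uses the OpenAI chat/completions API.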
CALL apoc.ml.fromQueries(['MATCH (n:Movie) RETURN n', 'MATCH (n:Person) RETURN n'],
{apiKey: <apiKey>})
YIELD value
RETURN *
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| value |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| "The database represents movies and people, like in a movie database or social network.
There are no defined relationships between nodes, allowing flexibility for future connections.
The Movie node includes properties like title, tagline, and release year." |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row
CALL apoc.ml.fromQueries(['MATCH (n:Movie) RETURN n', 'MATCH p=(n:Movie)--() RETURN p'],
{apiKey: <apiKey>})
YIELD value
RETURN *
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| value |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| "models relationships in the movie industry, connecting :Person nodes to :Movie nodes.
It represents actors, directors, writers, producers, and reviewers connected to movies they are involved with.
Similar to a social network graph but specialized for the entertainment industry.
Each relationship type corresponds to common roles in movie production and reviewing.
Allows for querying and analyzing connections and collaborations within the movie business." |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row
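| name | description |
|---|---|
| queries | The list of queries |
| conf | An optional configuration map; see the next section |
| name | description | mandatory |
|---|---|---|
| apiKey | OpenAI API key | only if not otherwise defined |
| model | The OpenAI model | no (has a default) |
| sample | The number of nodes to skip; for example, a sample of 1000 reads every 1000th node. It is passed as a parameter when computing the schema | no, default is a random number |
| name | description |
|---|---|
| value | The description of the dataset |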
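The additionalPrompts option described earlier accepts any natural language description, so one possible use of apoc.ml.fromQueries is to describe only the relevant subgraph and feed that description into apoc.ml.query. The sketch below is an untested assumption about how the two might be combined; `$apiKey` is a hypothetical query parameter and the wording of the system prompt is invented for illustration.

```cypher
// Assumes $apiKey is supplied as a query parameter.
CALL apoc.ml.fromQueries(['MATCH (n:Movie) RETURN n', 'MATCH p=(n:Movie)--() RETURN p'],
  {apiKey: $apiKey})
YIELD value AS subgraphDescription
WITH subgraphDescription
CALL apoc.ml.query("What movies did Tom Hanks play in?", {
  apiKey: $apiKey,
  additionalPrompts: [
    // Pass the subgraph explanation to the model as extra context
    {role: "system", content: "The human description of the subgraph is the following:\n" + subgraphDescription}
  ]
})
YIELD query, value
RETURN query, value
```

The first result is aliased to `subgraphDescription` so that it does not clash with the `value` column yielded by the second call.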