MCP Toolbox Neo4j Integration

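The Model Context Protocol (MCP) Toolbox is an open-source framework that acts as a semantic layer between AI agents (such as large language models) and external data sources (such as Neo4j). It is designed to enable secure, structured data interaction by translating agent requests into predefined, safe database queries.

Unlike systems that rely on an LLM to generate queries on the fly, MCP Toolbox takes a tool-based approach. Developers define specific queries and operations. The LLM then only has to select the right tool and pass the necessary parameters. Because the query text is fixed, this improves accuracy and security and leaves the LLM no room to invent or hallucinate query logic.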
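
As a rough illustration of the tool-based approach, the following Python sketch models a tool as a fixed Cypher statement plus declared parameters. The names `Tool` and `prepare_call` are hypothetical and are not part of the toolbox; the sketch only shows that the model supplies a tool name and argument values, never query text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    """A developer-defined tool: fixed statement plus declared parameter types."""
    name: str
    statement: str
    parameters: dict  # parameter name -> expected Python type


def prepare_call(tool: Tool, arguments: dict) -> tuple:
    """Validate LLM-supplied arguments; return the unchanged statement and its parameters."""
    unknown = set(arguments) - set(tool.parameters)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    for name, expected in tool.parameters.items():
        if name not in arguments:
            raise ValueError(f"missing parameter: {name}")
        if not isinstance(arguments[name], expected):
            raise TypeError(f"{name} must be {expected.__name__}")
    return tool.statement, dict(arguments)


# Hypothetical tool definition used only for this illustration.
search = Tool(
    name="search-movies-by-actor",
    statement="MATCH (p:Person {name: $actor_name})-[:ACTED_IN]->(m:Movie) RETURN m.title AS title",
    parameters={"actor_name": str},
)
statement, params = prepare_call(search, {"actor_name": "Tom Hanks"})
print(params)
```

Whatever the model passes in, the statement returned is the one the developer wrote; only the parameter values vary.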

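Installation

Before you start, make sure Neo4j is running and MCP Toolbox is installed.

Neo4j: Make sure you have a running Neo4j instance (local or cloud).

MCP Toolbox: Install the toolbox using Homebrew.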

# Example for macOS with Homebrew
brew install mcp-toolbox

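Source configuration

The Neo4j source specifies the connection to a Neo4j database. It is configured in the tools.yaml file and is a prerequisite for defining any Neo4j-specific tool.

Table 1. Reference: `neo4j` source
Field	Type	Description
`kind`	string	Required. Must be `neo4j`.
`uri`	string	Required. The URI of the Neo4j database instance.
`user`	string	Required. The username for database authentication.
`password`	string	Required. The password for database authentication. Using environment variables is strongly recommended in production.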
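
The required fields in Table 1 can be checked before the toolbox is started. The sketch below is a minimal, hypothetical pre-flight check: `check_neo4j_source` is not a toolbox function, the `bolt://localhost:7687` URI is a placeholder, and `NEO4J_PASSWORD` is an assumed environment variable name used to keep the password out of the file.

```python
import os

# Required fields as listed in Table 1.
REQUIRED_FIELDS = ("kind", "uri", "user", "password")


def check_neo4j_source(source: dict) -> list:
    """Return a list of problems found in one `sources` entry (empty if none)."""
    problems = [f"missing required field: {field}"
                for field in REQUIRED_FIELDS if not source.get(field)]
    if source.get("kind") not in (None, "", "neo4j"):
        problems.append("kind must be 'neo4j'")
    return problems


source = {
    "kind": "neo4j",
    "uri": "bolt://localhost:7687",  # placeholder URI (assumption)
    "user": "neo4j",
    # Assumed variable name; an unset variable is reported as a missing password.
    "password": os.environ.get("NEO4J_PASSWORD", ""),
}
print(check_neo4j_source(source))
```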

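Tool definitions and capabilities

The toolbox provides three Neo4j-specific tool kinds, each with a different purpose:

neo4j-cypher: The most commonly used tool. It executes a predefined, parameterized Cypher query. The query is defined by the developer in the YAML file (as in the usage examples in this document). This is the safest approach because the LLM cannot change the query logic.
neo4j-execute-cypher: Designed for more flexible use cases, such as developer-assistant workflows. It takes an arbitrary Cypher string as a parameter and executes it. For safety, it can be configured with readOnly: true to prevent write operations (such as CREATE, MERGE, or DELETE). Using this tool in production agents is not recommended.
neo4j-schema: Extracts the complete Neo4j database schema. It takes no parameters and returns structured JSON output. This is very useful for giving the LLM context about the data model, so that it can build more complex queries that are then handled by the neo4j-execute-cypher tool (or used to author new neo4j-cypher tools).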
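
To make the idea of a read-only guard concrete, here is a deliberately naive Python sketch that flags Cypher text containing write keywords. It is not how the toolbox enforces readOnly (this document does not describe that mechanism), `looks_like_write` is a hypothetical name, and plain keyword matching can misfire on string literals or property names.

```python
import re

# CREATE, MERGE and DELETE are named in the text; SET, REMOVE and DROP are
# added here as an assumption about what else counts as a write.
WRITE_KEYWORDS = ("CREATE", "MERGE", "DELETE", "SET", "REMOVE", "DROP")
_WRITE_PATTERN = re.compile(r"\b(?:" + "|".join(WRITE_KEYWORDS) + r")\b", re.IGNORECASE)


def looks_like_write(cypher: str) -> bool:
    """Naive check: True if the text contains a write keyword as a whole word."""
    return _WRITE_PATTERN.search(cypher) is not None


print(looks_like_write("MATCH (m:Movie) RETURN m.title LIMIT 5"))  # False
print(looks_like_write("MATCH (m:Movie) DETACH DELETE m"))         # True
```

A check like this is only a first filter. Running the agent under a database user with read-only privileges is the more dependable safeguard.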

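Usage examples

MCP Toolbox integrates with Neo4j through its tool kinds.

The core of the integration is the tools.yaml file, where you define the connection to your Neo4j instance and the specific tools made available to the agent.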

sources:
  my-neo4j-source:
    kind: neo4j
    uri: bolt://:7687
    user: neo4j
    password: my-password # Use environment variables in production

tools:
  search-movies-by-actor:
    kind: neo4j-cypher
    source: my-neo4j-source
    description: "Searches for movies an actor has appeared in based on their name. Useful for questions like 'What movies has Tom Hanks been in?'"
    parameters:
      - name: actor_name
        type: string
        description: The full name of the actor to search for.
    statement: |
      MATCH (p:Person {name: $actor_name}) -[:ACTED_IN]-> (m:Movie)
      RETURN m.title AS title, m.year AS year, m.genre AS genre

  get-actor-for-movie:
    kind: neo4j-cypher
    source: my-neo4j-source
    description: "Finds the actors who starred in a specific movie. Useful for questions like 'Who acted in Inception?'"
    parameters:
      - name: movie_title
        type: string
        description: The exact title of the movie.
    statement: |
      MATCH (p:Person) -[:ACTED_IN]-> (m:Movie {title: $movie_title})
      RETURN p.name AS actor

  find-nearest-cinema:
    kind: neo4j-cypher
    source: my-neo4j-source
    description: "Find the nearest cinema to a given city. The city must be an exact match."
    parameters:
      - name: city_name
        type: string
        description: The name of the city to find cinemas near.
    statement: |
      MATCH (city:City {name: $city_name})
      MATCH (cinema:Cinema)
      WITH
          city.latitude AS fromLat,
          city.longitude AS fromLon,
          cinema.latitude AS toLat,
          cinema.longitude AS toLon,
          cinema.name AS cinemaName
      RETURN
          cinemaName,
          point.distance(
              point({latitude: fromLat, longitude: fromLon}),
              point({latitude: toLat, longitude: toLon})
          ) AS distance
      ORDER BY distance
      LIMIT 1

  get-movie-list:
    kind: neo4j-cypher
    source: my-neo4j-source
    description: "Get a list of movies, optionally filtering by year. This is a very useful general tool for getting movies."
    parameters:
      - name: year
        type: integer
        optional: true
        description: The year the movie was released.
    statement: |
      MATCH (movie:Movie)
      WHERE $year IS NULL OR movie.released = $year
      RETURN movie.title, movie.released
      ORDER BY movie.released DESC
      LIMIT 10

  get-movie:
    kind: neo4j-cypher
    source: my-neo4j-source
    description: "Get all information about a specific movie by its title. If a user asks a question about a movie and provides the title, this is the tool to use."
    parameters:
      - name: title
        type: string
        description: The title of the movie.
    statement: |
      MATCH (movie:Movie {title: $title})
      OPTIONAL MATCH (movie)<-[:ACTED_IN]-(actor)
      OPTIONAL MATCH (movie)<-[:DIRECTED]-(director)
      RETURN movie, collect(DISTINCT actor.name) AS actors, collect(DISTINCT director.name) AS directors
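
Each statement above refers to its inputs as `$name` placeholders, and each placeholder should have a matching entry under `parameters`. The following Python sketch is a hypothetical lint for that rule; it works on a tool definition already loaded into a dict (for example by a YAML parser) and is not part of the toolbox.

```python
import re


def undeclared_parameters(tool: dict) -> set:
    """Return `$name` placeholders used in the statement but not declared."""
    declared = {p["name"] for p in tool.get("parameters", [])}
    used = set(re.findall(r"\$([A-Za-z_][A-Za-z0-9_]*)", tool["statement"]))
    return used - declared


# Illustrative tool with one placeholder ($limit) that was never declared.
tool = {
    "parameters": [{"name": "title", "type": "string"}],
    "statement": "MATCH (movie:Movie {title: $title}) RETURN movie LIMIT $limit",
}
print(undeclared_parameters(tool))  # {'limit'}
```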

Running the tools

Once the tools.yaml file is configured, you can start the server.

toolbox --tools-file "tools.yaml"

Interacting through the API: The toolbox exposes a REST API for invoking the defined tools. An AI agent (or curl, for testing) can call these endpoints.

# Example: Invoke 'search-movies-by-actor'
curl -X POST http://127.0.0.1:5000/api/tool/search-movies-by-actor/invoke \
-H "Content-Type: application/json" \
-d '{
  "actor_name": "Tom Hanks"
}'
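
The same invocation can be made from Python with only the standard library. This sketch assumes the host, port, and endpoint path shown in the curl example; `build_invoke_request` is a hypothetical helper name. The request is built but not sent, so the snippet runs without a server.

```python
import json
import urllib.request


def build_invoke_request(tool_name: str, params: dict,
                         base_url: str = "http://127.0.0.1:5000") -> urllib.request.Request:
    """Build (but do not send) the POST request for a tool invocation."""
    return urllib.request.Request(
        url=f"{base_url}/api/tool/{tool_name}/invoke",
        data=json.dumps(params).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_invoke_request("search-movies-by-actor", {"actor_name": "Tom Hanks"})
print(request.get_method(), request.full_url)

# Sending requires a running toolbox server:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```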

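Example: public Companies dataset

This example connects to the public "Companies" demo database and defines a tool that answers the question: "What are the first 5 organizations in alphabetical order?"

1. Configuration (`tools_companies.yaml`)

This configuration defines two tools: get-schema, which extracts the database schema, and get-organizations-alphabetical, which retrieves a list of organizations sorted by name.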

sources:
  companies-demo:
    kind: neo4j
    uri: neo4j+s://demo.neo4jlabs.com:7687
    user: companies
    password: companies
    database: companies

tools:
  get-schema:
    kind: neo4j-schema
    source: companies-demo
    description: "Extracts the database schema."

  get-organizations-alphabetical:
    kind: neo4j-cypher
    source: companies-demo
    description: "Retrieves a list of organizations sorted alphabetically."
    parameters:
      - name: limit
        type: integer
        description: "The number of results to return."
    statement: |
      MATCH (o:Organization)
      RETURN o.name as OrganizationName
      ORDER BY o.name ASC
      LIMIT $limit

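2. Python client (`client.py`)

This script simulates an AI agent. It launches the toolbox over stdio, connects to it, and calls the tool to fetch the first 5 organizations.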

import asyncio
import os
from mcp import StdioServerParameters
from mcp.client.stdio import stdio_client
from mcp import ClientSession

# Configuration: Launch toolbox with the companies yaml
server_params = StdioServerParameters(
    command="toolbox",
    args=["--tools-file", "tools_companies.yaml", "--stdio"],
    env=os.environ.copy()
)

async def main():
    print("Connecting to MCP Toolbox (Companies Demo)...")

    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            print("Connected!")

            print("\n--- Question: What are the first 5 organizations alphabetically? ---")

            result = await session.call_tool(
                "get-organizations-alphabetical",
                arguments={
                    "limit": 5
                }
            )

            # Display the result
            for content in result.content:
                print(content.text)

if __name__ == "__main__":
    asyncio.run(main())
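
Before calling a tool, an agent usually asks the server which tools exist. The sketch below uses the `list_tools()` method of the MCP Python SDK's `ClientSession`; the helper name `describe_tools` is hypothetical, and the function expects an already initialized session such as the one created in `client.py`.

```python
async def describe_tools(session) -> list:
    """Return 'name: description' strings for every tool the server advertises."""
    result = await session.list_tools()
    return [f"{tool.name}: {tool.description}" for tool in result.tools]


# Usage inside main(), after `await session.initialize()`:
#     for line in await describe_tools(session):
#         print(line)
```

With the configuration above, the output should include get-schema and get-organizations-alphabetical along with their descriptions.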

3. How to run the client

To run the example above, follow these steps:

Install dependencies: You need the mcp Python package.

pip install mcp

Save the files:
- Save the YAML configuration as tools_companies.yaml.
- Save the Python code as client.py.

Run the script:

python client.py

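The semantic layer and its value

MCP Toolbox provides a semantic layer on top of the Neo4j database, a robust alternative to letting the LLM generate queries directly. This approach provides:

Security: It prevents "Cypher injection" attacks by not allowing the LLM to create its own (potentially harmful) queries. All interactions go through safe, pre-approved Cypher statements that you have defined.
Accuracy: The LLM runs the query that the developer wrote and tuned for the task, which avoids the common pitfalls of translating natural language into code.
Predictability: Results are consistent because the underlying Cypher queries are fixed and controlled by the developer rather than subject to the randomness of the LLM.

As the Neo4j article emphasizes, the core strength of the toolbox is that it provides a consistent API that a large language model can rely on. The LLM does not have to guess the database structure; it only selects the appropriate tool and leaves the rest to the toolbox. This approach is key to building reliable, scalable enterprise applications.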
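
The security point can be seen in a small, database-free Python sketch. It contrasts building query text from user input with passing the input as a parameter to a fixed statement. The function names and the hostile input are made up for illustration.

```python
FIXED_STATEMENT = "MATCH (p:Person {name: $actor_name})-[:ACTED_IN]->(m:Movie) RETURN m.title"


def concatenated_query(actor_name: str) -> str:
    """Unsafe pattern: user input becomes part of the query text."""
    return 'MATCH (p:Person {name: "' + actor_name + '"})-[:ACTED_IN]->(m:Movie) RETURN m.title'


def parameterized_call(actor_name: str) -> tuple:
    """Tool-style pattern: the statement is constant; input travels as data."""
    return FIXED_STATEMENT, {"actor_name": actor_name}


hostile = 'x"}) DETACH DELETE p //'
print(concatenated_query(hostile))      # the query text now contains DETACH DELETE
statement, params = parameterized_call(hostile)
print(statement == FIXED_STATEMENT)     # True: the statement is unchanged
```

In the parameterized form, the hostile string is only ever compared against the `name` property as a value; it never reaches the query parser as Cypher.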

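Using the agent: example conversation

This section simulates a natural-language conversation with an AI agent powered by MCP Toolbox, showing how the agent selects and uses tools to give accurate answers.

User: "Which movies were released in 1999?"

Agent's reasoning: The agent analyzes the question and recognizes that it needs information about movies from a specific year. It determines that the get-movie-list tool is the best fit and extracts the year parameter.
Tool call: The agent calls the toolbox API: POST /api/tool/get-movie-list/invoke with the parameters {"year": 1999}.
Toolbox response: The toolbox executes the Cypher query and returns the list of movies from 1999.
Agent's final reply: "Here are some movies released in 1999: The Matrix, Fight Club, The Sixth Sense."

User: "Who starred in The Matrix?"

Agent's reasoning: The agent recognizes a request for the actors of a specific movie. It selects the get-movie tool because its description matches the user's intent, and it extracts the title parameter.
Tool call: The agent calls the toolbox API: POST /api/tool/get-movie/invoke with the parameters {"title": "The Matrix"}.
Toolbox response: The toolbox executes the Cypher query and returns the details of The Matrix, including the list of actors.
Agent's final reply: "The Matrix stars Keanu Reeves, Laurence Fishburne, Carrie-Anne Moss, and Hugo Weaving."

This process highlights how MCP Toolbox acts as a reliable, secure function-calling layer. The LLM's job is to interpret the user's intent and extract parameters, while the toolbox handles the complex, secure interaction with the database.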

Table 2. Related links
Category	Link
Documentation	MCP Toolbox documentation
Repository	MCP Toolbox GitHub repository
Blog	Neo4j Developer Blog post