WIP: Building an Agent with Pydantic AI

しむどん： 2025-09-07

On September 4, 2025, Pydantic AI reached version v1.0.0. That makes this a good moment to look into Pydantic AI and build an AI agent with it.

What is Pydantic AI
Container environment
A simple agent

What is Pydantic AI

Pydantic AI is a Python agent framework designed to help you quickly, confidently, and painlessly build production grade applications and workflows with Generative AI.

Description from the Pydantic AI GitHub repository

Pydantic AIは、ジェネレーティブAIを用いた、迅速に自信を持ってストレスなく本番環境で稼働するアプリケーションやワークフローを構築するためのPythonエージェントフレームワークです。

Japanese translation by ChatGPT

In short, Pydantic AI is one of the agent frameworks written in Python.

Container environment

First, set up an environment to run the code in.

FROM python:3.13-slim-bookworm

WORKDIR /
COPY ./requirements.txt ./requirements.txt
COPY ./constraints.txt ./constraints.txt
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r ./requirements.txt -c constraints.txt

CMD ["bash"]

Dockerfile

pydantic-ai-slim[openai]

requirements.txt

annotated-types==0.7.0
anyio==4.10.0
certifi==2025.8.3
colorama==0.4.6
distro==1.9.0
genai-prices==0.0.25
griffe==1.14.0
h11==0.16.0
httpcore==1.0.9
httpx==0.28.1
idna==3.10
importlib_metadata==8.7.0
jiter==0.10.0
logfire-api==4.4.0
openai==1.106.1
opentelemetry-api==1.36.0
pydantic==2.11.7
pydantic-ai-slim==1.0.1
pydantic-graph==1.0.1
pydantic_core==2.33.2
sniffio==1.3.1
tqdm==4.67.1
typing-inspection==0.4.1
typing_extensions==4.15.0
zipp==3.23.0

constraints.txt

Build this image.

docker build -t pydanticai:202509 .

Once the image is built, start a container.

docker run -it --volume "$PWD:$PWD" --workdir "$PWD" pydanticai:202509 bash

From here on, all work is done inside this container.

A simple agent

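Let's start with a simple agent that takes a single prompt and shows the result. The model is Jan-v1-4B, served through the OpenAI-compatible API that LM Studio provides. The model and LM Studio are assumed to run on the host, which the container reaches as host.docker.internal. On a Linux host that name may not resolve by default, in which case adding --add-host=host.docker.internal:host-gateway to docker run should map it.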

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.openai import OpenAIProvider

model = OpenAIChatModel(
    'jan-v1-4b',
    provider=OpenAIProvider(
        base_url='http://host.docker.internal:1234/v1',
        api_key='DUMMY'
    ),
)
agent = Agent(model)

Send a request using this agent.

result = agent.run_sync('おはよう')  # "Good morning"

agent.run_sync returns an AgentRunResult. An example result is shown below.

AgentRunResult(output='\n\nGood morning! 😊 How can I assist you today?')

Accessing the shell and running commands

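For an agent to act on my behalf in a local environment, it has to be given a range of capabilities. But implementing every everyday action one at a time, such as reading and writing files or creating and deleting directories, would take forever.

A person works out which commands are available in an environment and uses them to carry out all sorts of operations. In other words, given an interface to the shell and a way to find out which commands are available, an agent can act much as a person does. It is somewhat dangerous, but I set things up so that the agent can do the same things a person can.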

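import subprocess

# Child processes started by the tool, kept so they can be inspected later.
process_list = []

@agent.tool_plain
def start_process_shell_command(command: str) -> str:
    """
    Execute command via shell program.
    """
    # Run the command through bash; this gives the model full shell access.
    args = ["/bin/bash", "-c", command]
    print(args)
    # Popen does not wait, so the command's output is not returned to the model.
    child = subprocess.Popen(args)
    process_list.append(child)
    return f"Starting process: {command}"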
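
The tool above starts the process and returns right away, so the model never sees what the command printed. Below is a minimal sketch of a blocking variant that returns the exit code and the output. The name run_shell_command, its timeout parameter, and the fallback to /bin/sh are assumptions made for illustration; none of them come from the original setup or from Pydantic AI.

```python
import shutil
import subprocess

# Assumption: prefer bash as in the tool above, fall back to /bin/sh if bash is missing.
SHELL = shutil.which("bash") or "/bin/sh"


def run_shell_command(command: str, timeout: float = 30.0) -> str:
    """Run a command through the shell and return its exit code and combined output."""
    try:
        completed = subprocess.run(
            [SHELL, "-c", command],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr into stdout
            text=True,                 # decode the output to str
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return f"Timed out after {timeout} seconds: {command}"
    return f"exit code: {completed.returncode}\n{completed.stdout}"
```

Decorating it with @agent.tool_plain would register it like the other tools. The same caution about giving a model shell access applies.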

Streaming the output

I would rather watch the results as they stream in.

Extending the agent

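We humans do not need anything like that: we work out which commands are available in an environment and use them to perform all sorts of operations. In other words, given an interface to the shell and a way to find out which commands are available, the agent itself can act much as a person does.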

Letting the agent know which commands are available

Adding tools

In Pydantic AI, tools can be added with a decorator. The documentation includes an example of a dice-rolling tool, so let's try it.

import random

@agent.tool_plain
def roll_dice() -> str:
    """Roll a six-sided die and return the result."""
    return str(random.randint(1, 6))

Pass a prompt to the agent and run it.

result = agent.run_sync('サイコロ振ってみて')  # "Try rolling the dice"

The following result was returned in result.

AgentRunResult(output='\n\nサイコロを振りました。結果は3です。')

Looks good.

Reading a file with a tool

Let's implement cat. With it, the contents of a file can be passed to the LLM.

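import tempfile
import subprocess

@agent.tool_plain
def read_file(filepath: str) -> str:
    """Read file."""
    # Open the temporary file in text mode so that read() returns str,
    # matching the return annotation (binary mode would return bytes).
    with tempfile.TemporaryFile(mode="w+", encoding="utf-8") as fp:
        # cat writes the file contents directly into the temporary file.
        child_process = subprocess.Popen(["cat", filepath], stdout=fp)
        child_process.wait()
        fp.seek(0)
        return fp.read()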

It counts as a success if, when asked to check the contents of a file by name, the agent uses the tool to read the file.

agent.run_sync('memo.txtには何て書いてあるか教えて')  # "Tell me what memo.txt says"

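The correct response came back, as shown below. Note the mention of Base64: this output was recorded when read_file returned bytes rather than str (a temporary file opened in binary mode), so the content presumably reached the model Base64-encoded, and the model decoded it.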

AgentRunResult(output='\n\nThe content of `memo.txt` is a Base64-encoded string. Decoding `"SGVsbG8gd29ybGQhCgoK"` gives the message:\n\n**"Hello world!"**')

Changing part of a file

Now that the agent can read a file, the next step is a tool that changes a file's contents. There are several ways to do that; here I use sed so that changes can be made line by line.

agent.run_sync('使用可能なコマンドを使い、memo.txtの2行目を「Nice to see you.」に変更して。')  # "Using the available commands, change line 2 of memo.txt to 'Nice to see you.'"

It counts as a success if running this makes the agent execute a sed command that modifies the file.
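
For comparison, here is a minimal sketch of the same line-level edit written as a dedicated tool in plain Python instead of sed. It does roughly what a GNU sed invocation such as sed -i '2s/.*/Nice to see you./' memo.txt would do. replace_line is a hypothetical helper introduced only for illustration, and it assumes UTF-8 text files.

```python
from pathlib import Path


def replace_line(filepath: str, line_number: int, new_text: str) -> str:
    """Replace one line (1-indexed) of a text file and report what was done."""
    path = Path(filepath)
    lines = path.read_text(encoding="utf-8").splitlines(keepends=True)
    if not 1 <= line_number <= len(lines):
        return f"{filepath} has {len(lines)} lines; line {line_number} does not exist"
    # Keep the trailing newline if the original line had one.
    old = lines[line_number - 1]
    ending = "\n" if old.endswith("\n") else ""
    lines[line_number - 1] = new_text + ending
    path.write_text("".join(lines), encoding="utf-8")
    return f"Replaced line {line_number} of {filepath}"
```

A dedicated tool like this avoids shell quoting problems in the replacement text, at the cost of being less general than handing the agent sed itself.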

agent.run_sync('使用可能なコマンドを把握せよ。')  # "Work out which commands are available."

Checking how the API behaves

:ORIGIN = http://127.0.0.1:1234

GET :ORIGIN/v1/models

{
  "data": [
    {
      "id": "jan-v1-4b",
      "object": "model",
      "owned_by": "organization_owner"
    },
    {
      "id": "text-embedding-nomic-embed-text-v1.5",
      "object": "model",
      "owned_by": "organization_owner"
    }
  ],
  "object": "list"
}
// GET http://127.0.0.1:1234/v1/models
// HTTP/1.1 200 OK
// X-Powered-By: Express
// Content-Type: application/json; charset=utf-8
// Content-Length: 269
// ETag: W/"10d-rqq/o+bEBh2RR+iCAVue7V4F+WY"
// Date: Thu, 02 Oct 2025 01:50:36 GMT
// Connection: keep-alive
// Keep-Alive: timeout=5
// Request duration: 0.076014s

List the currently loaded models
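
The same check can be scripted. This is a minimal sketch using only the standard library; parse_model_ids and fetch_model_ids are hypothetical helper names, and the default origin is the one used in the request above.

```python
import json
import urllib.request


def parse_model_ids(response_body: str) -> list[str]:
    """Extract model identifiers from a /v1/models response body."""
    payload = json.loads(response_body)
    return [entry["id"] for entry in payload.get("data", [])]


def fetch_model_ids(origin: str = "http://127.0.0.1:1234") -> list[str]:
    """Query GET {origin}/v1/models; needs a running server, so it is not called here."""
    with urllib.request.urlopen(f"{origin}/v1/models", timeout=10) as response:
        return parse_model_ids(response.read().decode("utf-8"))
```

Given the response body shown above, parse_model_ids returns jan-v1-4b and text-embedding-nomic-embed-text-v1.5.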

:ORIGIN = http://127.0.0.1:1234

POST :ORIGIN/v1/chat/completions
Content-Type: application/json

{
  "model": "jan-v1-4b",
  "messages": [
    {"role": "system", "content": "言語指定: 日本語"},
    {"role": "user", "content": "hello"}
  ]
}

{
  "id": "chatcmpl-chi4ue2mhkhfdkgi8pmlv",
  "object": "chat.completion",
  "created": 1759370551,
  "model": "jan-v1-4b",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "<think>\nOkay, the user said \"hello\". I need to respond in Japanese as per their language specification. Let me think... The first thing is to greet them back.\n\nIn Japanese, a common greeting response would be \"こんにちは\" for general hello during the day. Wait, maybe they're using it at any time, so perhaps use こんにちは or おはよう depending on the time? But since I don't know their time zone or when they're messaging now, just go with the standard.\n\nThe user might expect a friendly reply. Let me phrase it in natural Japanese.\n\nI should write something like \"こんにちは！どうぞよろしくお願いします。\" but maybe too formal. Maybe more casual for chat: \"こんにちは！今何をしてるか分からんけど、元気ですか？\"\n\nAlternatively, keep it simple and warm.\n\nWait, the task is just to respond in Japanese to \"hello\". So maybe start with a greeting followed by an open question or statement.\n\nI think I'll do this to be safe. Let's go for: こんにちは！最近どうぞよろしくお願いします。 But perhaps better to write something like こんにちは！お元気ですか？\n\nYes, that's standard Japanese hello response.\n\nSo final thought: \"こんにちは！お元気ですか？\"\n</think>\n\nこんにちは！お元気ですか？",
        "tool_calls": []
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 21,
    "completion_tokens": 274,
    "total_tokens": 295
  },
  "stats": {},
  "system_fingerprint": "jan-v1-4b"
}
// POST http://127.0.0.1:1234/v1/chat/completions
// HTTP/1.1 200 OK
// X-Powered-By: Express
// Content-Type: application/json; charset=utf-8
// Content-Length: 1817
// ETag: W/"719-12nWGzF3+biWESycncLWc2LfOSo"
// Date: Thu, 02 Oct 2025 02:02:57 GMT
// Connection: keep-alive
// Keep-Alive: timeout=5
// Request duration: 25.563936s

Chat completions. Sends the chat history to the model and predicts the next assistant response.
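
In the sample response the content field carries the model's reasoning inside a <think> block, followed by the actual answer. The sketch below builds the same request with the standard library and separates the two parts. It assumes the reasoning always appears as a single leading <think>...</think> block, as in the sample; build_chat_request and split_reasoning are hypothetical helper names.

```python
import json
import urllib.request


def build_chat_request(origin: str, model: str, messages: list[dict]) -> urllib.request.Request:
    """Build (but do not send) a POST request for /v1/chat/completions."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        f"{origin}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def split_reasoning(content: str) -> tuple[str, str]:
    """Split content into (reasoning, answer) around a leading <think> block."""
    start, end = "<think>", "</think>"
    if content.startswith(start) and end in content:
        reasoning, _, answer = content[len(start):].partition(end)
        return reasoning.strip(), answer.strip()
    # No reasoning block found: treat everything as the answer.
    return "", content.strip()
```

Passing the request to urllib.request.urlopen would send it; that step is left out here because it needs a running LM Studio server.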

:ORIGIN = http://127.0.0.1:1234

POST :ORIGIN/v1/embeddings
Content-Type: application/json

{
  "model": "text-embedding-nomic-embed-text-v1.5",
  "input": "こんにちわ"
}

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "embedding": [
        -0.0029755248688161373,
        -0.01683039404451847,
        ... (omitted) ...
        -0.06586568057537079,
        -0.02217249944806099
      ],
      "index": 0
    }
  ],
  "model": "text-embedding-nomic-embed-text-v1.5",
  "usage": {
    "prompt_tokens": 0,
    "total_tokens": 0
  }
}
// POST http://127.0.0.1:1234/v1/embeddings
// HTTP/1.1 200 OK
// X-Powered-By: Express
// Content-Type: application/json; charset=utf-8
// Content-Length: 23377
// ETag: W/"5b51-CYs9mGjiN/pIqrTH85YhMmQ/lt4"
// Date: Thu, 02 Oct 2025 01:54:03 GMT
// Connection: keep-alive
// Keep-Alive: timeout=5
// Request duration: 0.040796s

Text embeddings