You are now implementing the real application in this workspace.

The target implementation language is LANGUAGE.

This workspace should already contain a verified LANGUAGE project skeleton with basic support for CLI, HTTP, SQLite, config parsing, tests, and build/test commands. Begin by inspecting the existing workspace and README before changing anything.

Your first task is to name the application.

The placeholder name is:

  *Claw

Replace the wildcard with a short, distinctive prefix suitable for this implementation.

Do not use an existing project’s name or branding. Once you choose the name, use it consistently for:

  - executable name
  - README title
  - config directory
  - default workspace directory
  - log names
  - test names where appropriate

If the selected name is not suitable for an executable on macOS, create a lowercase/kebab-case executable form and document the relationship. For example:

  Application name: LogicClaw
  Executable: logicclaw

The goal is to build a small, local-first agent runtime. It should run as a single command-line application that can load configuration, talk to a model provider, expose a CLI agent loop, execute a small set of tools through a security gate, persist memory, and write tamper-evident tool receipts.

Do not mention or depend on any external agent runtime project. Treat this as a clean-room implementation from this spec.

Core workflow:

  app init
  app config validate
  app config show
  app provider list
  app provider test NAME
  app tool list
  app tool run NAME --json ARGS
  app agent
  app agent -m "What files are in this project?"
  app memory search "previous topic"
  app receipt verify
  app estop

Use the final executable name you selected instead of `app`.

The agent must:

  - accept input from the CLI channel
  - send the conversation to a configured model provider
  - advertise available tools to the model
  - parse tool calls from the model response
  - validate each tool call through a security policy
  - execute approved tools
  - feed tool results back into the model
  - persist the final exchange, tool calls, tool results, and receipts
  - return a final answer to the user

Required architecture:

The implementation must have visible separation of responsibility for these areas:

  runtime
    agent loop, request lifecycle, orchestration

  config
    config loading, validation, defaults, path expansion

  providers
    model provider abstraction and concrete providers

  channels
    CLI channel and optional HTTP/gateway channel

  tools
    time, file_list, file_read, file_write, shell, http, memory_search

  security
    autonomy levels, command/path policy, tool-risk classification

  memory
    SQLite persistence, or JSONL only if SQLite is impractical in LANGUAGE

  receipts
    tamper-evident tool-call receipts

  sop
    optional deterministic workflow runner

  service
    optional install/start/stop/status wrappers

Do not force object-oriented structure if LANGUAGE is not object-oriented. Use idiomatic LANGUAGE design, but preserve the conceptual boundaries.

Configuration:

The application must use a user-editable config file.

Default location should be based on the final app name, for example:

  ~/.logicclaw/config.toml

TOML is preferred. JSON, INI, S-expression, or another idiomatic config format is acceptable if TOML support is weak in LANGUAGE. If not using TOML, document why.

Minimum config shape:

  workspace_dir = "~/logicclaw-workspace"
  default_provider = "local"
  default_model = "mock"

  [security]
  autonomy = "supervised"
  workspace_only = true
  forbidden_paths = ["/etc", "/sys", "/boot", "~/.ssh"]
  forbidden_commands = ["rm", "shutdown", "reboot", "mkfs", "dd"]
  audit_log = true

  [providers.models.local]
  kind = "mock"
  model = "mock"

  [providers.models.openai_compatible]
  kind = "openai-compatible"
  base_url = "http://localhost:1234/v1"
  model = "local-model"
  api_key_env = "OPENAI_API_KEY"

  [channels.cli]
  enabled = true
  tools_allow = ["file_read", "file_list", "time", "memory_search", "shell"]

  [memory]
  backend = "sqlite"
  path = "~/.logicclaw/memory.sqlite"

  [receipts]
  enabled = true
  path = "~/.logicclaw/tool_receipts.log"

Adjust paths to match the application name you chose.

Config requirements:

  - load defaults when keys are absent
  - expand ~ and environment variables
  - validate enum values
  - validate that workspace exists or create it during init
  - do not require API keys for mock mode
  - support provider credentials by environment variable
  - never print secret values in logs or config dumps
  - config validate must report all detected errors in one pass when practical

Provider abstraction:

Create an idiomatic equivalent of:

  Provider
    name() -> string
    capabilities() -> ProviderCapabilities
    chat(request: ChatRequest) -> ChatResponse

ChatRequest must contain:

  - system_prompt
  - messages
  - tools
  - model
  - optional temperature
  - optional metadata

ChatResponse must contain:

  - final_text
  - tool_calls
  - optional raw_provider_payload
  - optional usage

Required providers:

  mock
    Deterministic provider used for tests. It must be able to return ordinary text and tool calls from scripted fixtures.

  openai-compatible
    Sends requests to an OpenAI-compatible /chat/completions endpoint. Full support for every provider is not required. Implement non-streaming chat completion. Tool/function call support is required if reasonably practical in LANGUAGE; otherwise document the limitation clearly.

Optional providers:

  reliable
    Wrapper provider that tries provider names in order and falls back on network/auth/timeout errors.

  router
    Wrapper provider that chooses a provider from request metadata hints.
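
For orientation, here is one possible shape for the provider abstraction and the scripted mock provider, sketched in Go purely as notation. The field layout is an assumption: tools are reduced to names, and optional fields such as raw_provider_payload and usage are left out for brevity. MockProvider.Script is a hypothetical stand-in for fixtures loaded from disk.

```go
package main

import "errors"

type Message struct {
	Role    string
	Content string
}

type ToolCall struct {
	ID   string
	Name string
	Args map[string]string
}

type ChatRequest struct {
	SystemPrompt string
	Messages     []Message
	Tools        []string // tool names only; a fuller version would carry schemas
	Model        string
	Temperature  *float64          // nil means "not set"
	Metadata     map[string]string // optional
}

type ChatResponse struct {
	FinalText string
	ToolCalls []ToolCall
}

type ProviderCapabilities struct {
	ToolCalls bool
}

// Provider mirrors name() / capabilities() / chat() from the spec.
type Provider interface {
	Name() string
	Capabilities() ProviderCapabilities
	Chat(req ChatRequest) (ChatResponse, error)
}

// MockProvider replays scripted responses in order, ignoring the request,
// so tests are deterministic and need no network access.
type MockProvider struct {
	Script []ChatResponse
	next   int
}

func (m *MockProvider) Name() string { return "mock" }

func (m *MockProvider) Capabilities() ProviderCapabilities {
	return ProviderCapabilities{ToolCalls: true}
}

func (m *MockProvider) Chat(req ChatRequest) (ChatResponse, error) {
	if m.next >= len(m.Script) {
		return ChatResponse{}, errors.New("mock provider: script exhausted")
	}
	resp := m.Script[m.next]
	m.next++
	return resp, nil
}
```

A fixture for Test 4 would then be a two-entry script: one response carrying a file_list tool call, followed by one carrying the final text.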

Channel abstraction:

Create an idiomatic equivalent of:

  Channel
    name() -> string
    start(runtime_handle)
    send(conversation_id, message)
    supports_draft_updates() -> bool

Required channel:

  cli

CLI behavior:

  app agent
    starts a REPL

  app agent -m "message"
    runs one turn and exits

REPL commands:

  /exit
    exits

  /tools
    lists active tools

  /memory <query>
    searches memory

  /policy
    prints current autonomy and workspace boundary

Optional gateway channel:

  localhost HTTP server

Minimum optional gateway endpoints:

  GET  /health
  GET  /status
  GET  /tools
  POST /chat
  GET  /memory/search?q=...
  GET  /receipts
  POST /estop

Tool abstraction:

Create an idiomatic equivalent of:

  Tool
    name() -> string
    description() -> string
    parameters_schema() -> JSON Schema object or equivalent metadata
    risk(args, context) -> low | medium | high
    invoke(args, context) -> ToolResult

ToolResult must contain:

  - success: bool
  - output: string
  - optional error: string
  - optional metadata
  - optional receipt_id

Required built-in tools:

  time
    Returns current local time, UTC time, and timezone if available.

  file_list
    Lists files under a path inside workspace.

  file_read
    Reads a UTF-8 text file inside workspace.

  file_write
    Writes a UTF-8 text file inside workspace.

  shell
    Executes a shell command inside workspace, subject to security policy.

  http
    Performs HTTP GET. POST is optional.

  memory_search
    Searches persisted conversations.

Optional tools:

  web_search
    May be stubbed unless a search API key is configured.

  pdf_extract
    Optional.

  ask_user
    In CLI mode, asks the user a question and returns the answer.

Security model:

Implement three autonomy levels:

  readonly
    Low-risk read-only tools allowed.
    No file_write.
    No shell execution, except optionally for harmless commands such as pwd.

  supervised
    Low-risk tools run automatically.
    Medium-risk tools require operator approval.
    High-risk tools are blocked.

  full
    Low-risk and medium-risk tools run automatically.
    High-risk tools are still blocked when a path or command policy explicitly forbids them.

Default must be:

  supervised

Risk rules:

  time, memory_search, file_list, file_read inside workspace:
    low

  http GET to allowed domains:
    low

  file_write inside workspace:
    medium

  shell command from allowlist:
    medium

  shell command not on allowlist:
    high

  any path outside workspace when workspace_only = true:
    blocked

  any path under forbidden_paths:
    blocked

  any command whose basename appears in forbidden_commands:
    blocked
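
The workspace boundary and the autonomy levels could be combined along the lines of the Go sketch below. It is an illustration, not a complete policy. InsideWorkspace is purely lexical and does not resolve symlinks, which a real implementation should also handle. Decide conservatively denies every high-risk call under "full"; that is an assumption of this sketch, since the spec only requires that explicitly forbidden actions stay blocked. Both function names are hypothetical.

```go
package main

import (
	"path/filepath"
	"strings"
)

type Decision string

const (
	Allow Decision = "allow"
	Ask   Decision = "ask" // requires operator approval
	Deny  Decision = "deny"
)

// InsideWorkspace reports whether target, resolved against workspace,
// stays inside it. Relative targets are interpreted relative to workspace.
// This check is lexical only: symlinks are NOT resolved here.
func InsideWorkspace(workspace, target string) bool {
	ws := filepath.Clean(workspace)
	p := target
	if !filepath.IsAbs(p) {
		p = filepath.Join(ws, p)
	}
	rel, err := filepath.Rel(ws, filepath.Clean(p))
	if err != nil {
		return false
	}
	return rel != ".." && !strings.HasPrefix(rel, ".."+string(filepath.Separator))
}

// Decide maps an autonomy level and a risk level to a decision.
// Unknown autonomy or risk values fall through to Deny.
func Decide(autonomy, risk string) Decision {
	switch autonomy {
	case "readonly":
		if risk == "low" {
			return Allow
		}
	case "supervised":
		switch risk {
		case "low":
			return Allow
		case "medium":
			return Ask
		}
	case "full":
		// Assumption: high risk is denied outright in this sketch.
		if risk == "low" || risk == "medium" {
			return Allow
		}
	}
	return Deny
}
```

Comparing the relative path instead of using a raw string prefix avoids treating a sibling such as /ws-other as part of /ws.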

Any shell command containing obvious destructive patterns must be blocked. Minimum patterns:

  rm -rf /
  rm -rf *
  mkfs
  dd if=
  :(){ :|:& };:
  shutdown
  reboot
  chmod -R 777 /
  chown -R
  curl ... | sh
  wget ... | sh
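
One simple way to screen for these patterns is whitespace-normalized substring matching plus a single regular expression for the download-piped-to-shell case, as in the hypothetical Go sketch below. It is deliberately crude: it will over-block (for example, "rm -rf /" also matches a longer absolute path, and "shutdown" matches inside an echo), and treating bash like sh is an extension beyond the listed patterns. DestructivePattern is an illustrative name.

```go
package main

import (
	"regexp"
	"strings"
)

// Literal patterns from the minimum list, matched after whitespace normalization.
var destructiveSubstrings = []string{
	"rm -rf /",
	"rm -rf *",
	"mkfs",
	"dd if=",
	":(){ :|:& };:",
	"shutdown",
	"reboot",
	"chmod -R 777 /",
	"chown -R",
}

// curl or wget whose output is piped into sh (or bash, as an extension).
var pipeToShell = regexp.MustCompile(`\b(curl|wget)\b[^|]*\|\s*(sh|bash)\b`)

// DestructivePattern returns the matched pattern and true when command
// contains an obviously destructive pattern; otherwise it returns "", false.
func DestructivePattern(command string) (string, bool) {
	// Collapse runs of whitespace so "rm   -rf  /" still matches.
	normalized := strings.Join(strings.Fields(command), " ")
	for _, p := range destructiveSubstrings {
		if strings.Contains(normalized, p) {
			return p, true
		}
	}
	if pipeToShell.MatchString(normalized) {
		return "download piped to shell", true
	}
	return "", false
}
```

Because this check is a last line of defense, erring toward false positives is acceptable here; the allowlist and forbidden_commands rules do the finer-grained work.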

Approval flow in CLI mode:

When a medium-risk action requires approval, print something like:

  Tool request:
    tool: file_write
    risk: medium
    reason: writes to workspace
    args: ...
  Approve? [y/N]

Default is deny.

Tool receipts:

Every attempted tool invocation must produce a receipt, whether it was allowed automatically, allowed after operator approval, denied, or failed.

Receipt fields:

  {
    "id": "receipt-...",
    "timestamp": "2026-05-12T14:00:00Z",
    "conversation_id": "...",
    "tool": "file_read",
    "args_hash": "...",
    "result_hash": "...",
    "status": "allowed|denied|failed",
    "risk": "low|medium|high",
    "previous_hash": "...",
    "receipt_hash": "..."
  }

Receipt hash:

  receipt_hash = SHA256(canonical_json(receipt_without_receipt_hash))

Tamper-evident chain:

  - each receipt includes the previous receipt’s hash
  - receipt verify must replay the log
  - it must report the first broken link

Optional stronger version:

  HMAC-SHA256 with a locally stored secret key
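
The hash chain and its verification can be illustrated with the short Go sketch below. It makes several assumptions that the spec leaves open: receipts are modeled as flat string maps, the first receipt uses an empty previous_hash, and "canonical JSON" is approximated by Go's encoding/json, which writes map keys in sorted order. A stricter canonicalization scheme may be preferable. HashReceipt, Append, and Verify are hypothetical names.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
)

// Receipt is a flat field map, e.g. "id", "tool", "status", "previous_hash".
type Receipt map[string]string

// HashReceipt computes SHA256 over the JSON encoding of every field
// except receipt_hash. json.Marshal sorts map keys, so output is stable.
func HashReceipt(r Receipt) string {
	body := make(map[string]string, len(r))
	for k, v := range r {
		if k != "receipt_hash" {
			body[k] = v
		}
	}
	b, _ := json.Marshal(body) // marshaling a map[string]string does not fail
	sum := sha256.Sum256(b)
	return hex.EncodeToString(sum[:])
}

// Append links r to the end of chain and seals it with its own hash.
func Append(chain []Receipt, r Receipt) []Receipt {
	prev := "" // assumption: the first receipt has an empty previous_hash
	if n := len(chain); n > 0 {
		prev = chain[n-1]["receipt_hash"]
	}
	r["previous_hash"] = prev
	r["receipt_hash"] = HashReceipt(r)
	return append(chain, r)
}

// Verify replays the chain and returns the 1-based position of the first
// broken receipt, or 0 when the whole chain is intact.
func Verify(chain []Receipt) int {
	prev := ""
	for i, r := range chain {
		if r["previous_hash"] != prev || r["receipt_hash"] != HashReceipt(r) {
			return i + 1
		}
		prev = r["receipt_hash"]
	}
	return 0
}
```

Editing any field of the second receipt changes its recomputed hash, so Verify reports position 2, which is the outcome Test 8 expects. A plain hash chain does not stop an attacker who rewrites every later receipt as well; the optional HMAC variant addresses that.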

Memory:

Use SQLite if practical in LANGUAGE. Use JSONL only if SQLite support is impractical or broken.

Persist:

  - conversation_id
  - turn_id
  - timestamp
  - role
  - content
  - tool_calls
  - tool_results
  - provider
  - model
  - metadata

Required commands:

  app memory search QUERY
  app memory show CONVERSATION_ID
  app memory list
  app memory clear --yes

Search may be simple substring search.

Optional scoring:

  - tokenize query and content
  - rank by term frequency
  - boost recent conversations

Agent loop:

Implement this loop:

  1. Receive user message from channel.
  2. Create or resume conversation.
  3. Load recent memory context.
  4. Build system prompt.
  5. Build tool schemas from active tools.
  6. Call provider.
  7. If provider returns text only, persist and reply.
  8. If provider returns tool calls:
       a. For each tool call, classify risk.
       b. Validate policy.
       c. Ask approval when required.
       d. Invoke or deny.
       e. Write receipt.
       f. Persist tool call and result.
  9. Send tool results back to provider.
  10. Repeat until final text or max_tool_rounds is reached.
  11. Persist final assistant response.
  12. Reply to channel.

Guardrails:

  max_tool_rounds default:
    5

  max_response_bytes default:
    1 MB

  tool execution timeout default:
    30 seconds

  shell timeout default:
    15 seconds

  HTTP timeout default:
    20 seconds

The runtime must not recursively invoke tools forever.
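
The bounded loop can be sketched in Go as shown below. To stay short, the provider is reduced to a ChatFunc and the whole gated tool pipeline (risk classification, policy, approval, invocation, receipt) is hidden behind a ToolFunc; both are hypothetical simplifications, and persistence is left out. With max_tool_rounds = N, the sketch executes tools in at most N rounds and then gives the provider one last chance to answer in text.

```go
package main

import "errors"

type ToolCall struct {
	Name string
	Args map[string]string
}

type Reply struct {
	FinalText string
	ToolCalls []ToolCall
}

type Message struct {
	Role    string
	Content string
}

// ChatFunc stands in for Provider.chat.
type ChatFunc func(history []Message) (Reply, error)

// ToolFunc stands in for the gated pipeline: classify, validate,
// approve, invoke or deny, and write a receipt. It returns the result text.
type ToolFunc func(call ToolCall) string

var ErrMaxToolRounds = errors.New("max_tool_rounds reached without a final answer")

// RunTurn drives one user turn to a final answer or to the round limit.
func RunTurn(chat ChatFunc, runTool ToolFunc, user string, maxToolRounds int) (string, []Message, error) {
	history := []Message{{Role: "user", Content: user}}
	for round := 0; round <= maxToolRounds; round++ {
		reply, err := chat(history)
		if err != nil {
			return "", history, err
		}
		if len(reply.ToolCalls) == 0 {
			history = append(history, Message{Role: "assistant", Content: reply.FinalText})
			return reply.FinalText, history, nil
		}
		if round == maxToolRounds {
			break // the provider still wants tools after the last allowed round
		}
		for _, call := range reply.ToolCalls {
			result := runTool(call)
			history = append(history, Message{Role: "tool", Content: call.Name + ": " + result})
		}
	}
	return "", history, ErrMaxToolRounds
}
```

Returning a distinct error when the limit is hit lets the CLI print a clear message instead of silently returning an empty answer.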

Required CLI command surface:

  app init
  app onboard
  app config validate
  app config show
  app provider list
  app provider test NAME
  app tool list
  app tool run NAME --json ARGS
  app agent
  app agent -m MESSAGE
  app memory list
  app memory search QUERY
  app memory show CONVERSATION_ID
  app memory clear --yes
  app receipt list
  app receipt verify
  app estop
  app estop --clear

Optional commands:

  app service install
  app service start
  app service stop
  app service status
  app sop list
  app sop validate
  app sop run NAME
  app plugin list
  app plugin install PATH

SOP engine, optional but valuable:

Implement deterministic workflows loaded from:

  <workspace_dir>/sops/<name>/SOP.toml

Minimum SOP format:

  name = "daily-check"
  description = "Run a daily workspace check"

  [[steps]]
  id = "list"
  kind = "tool"
  tool = "file_list"
  args = { path = "." }

  [[steps]]
  id = "summarize"
  kind = "agent"
  prompt = "Summarize the file list from the previous step."

  [[steps]]
  id = "approval"
  kind = "approval"
  prompt = "Continue to write report?"

  [[steps]]
  id = "write"
  kind = "tool"
  tool = "file_write"
  args = { path = "daily-check.txt", content_from = "summarize" }

Requirements:

  - validate step IDs are unique
  - validate referenced tools exist
  - persist SOP run state
  - stop at approval steps until approved
  - support on_failure = "abort"
  - support on_failure = "continue"

Plugin system, stretch goal:

A plugin is a directory:

  plugin-name/
    manifest.toml
    executable-or-script

Minimum manifest:

  name = "echo-plugin"
  version = "0.1.0"
  capabilities = ["tool"]

  [[tools]]
  name = "echo"
  description = "Echoes input"
  command = "./echo-plugin"
  schema = { type = "object" }

The runtime discovers plugins under:

  ~/.appname/plugins/

Simpler acceptable version:

Support external process tools where the runtime invokes a configured executable with JSON on stdin and reads JSON from stdout.

Observability:

Minimum logging:

  - human-readable logs to stderr
  - structured JSON logs when APPNAME_LOG=json, with the variable name adjusted to match the executable name
  - never log secrets

Log events:

  - startup
  - config path
  - workspace path
  - provider selected
  - channel started
  - conversation started
  - tool requested
  - tool approved
  - tool denied
  - tool completed
  - tool failed
  - receipt written
  - memory persisted
  - estop triggered

Optional metrics endpoint:

  GET /metrics

If the endpoint is implemented, expose these metrics:

  app_conversations_total
  app_tool_calls_total
  app_tool_denials_total
  app_provider_errors_total
  app_receipt_chain_valid

Emergency stop:

  app estop

Creates:

  ~/.appname/ESTOP

When this file exists:

  - no new tool calls may run
  - existing long-running shell/http tasks should be cancelled if possible
  - the agent may still answer text-only messages explaining that tool use is stopped

  app estop --clear

Removes the file.

Acceptance tests:

Test 1: init creates expected files.

  Given no ~/.appname directory
  When app init runs
  Then ~/.appname/config file exists
  And memory database or memory JSONL exists
  And workspace_dir exists

Test 2: config validation catches invalid autonomy.

  Given autonomy = "godmode"
  When app config validate runs
  Then exit code is nonzero
  And output mentions allowed values

Test 3: mock provider text-only response.

  Given mock provider fixture returns "hello"
  When app agent -m "hi" runs
  Then stdout contains "hello"
  And memory contains the user and assistant turn

Test 4: model-triggered file_list tool.

  Given mock provider fixture emits tool_call file_list { path = "." }
  When app agent -m "list files" runs
  Then file_list executes inside workspace
  And a tool receipt is written
  And final answer includes the file list summary

Test 5: workspace escape blocked.

  Given workspace_only = true
  When model requests file_read { path = "/etc/passwd" }
  Then tool is denied
  And a denied receipt is written
  And the provider receives a tool error

Test 6: supervised approval.

  Given autonomy = "supervised"
  When model requests file_write
  Then CLI asks for approval
  And default empty answer denies
  And "y" approves

Test 7: forbidden command blocked.

  When model requests shell { command = "rm -rf /" }
  Then tool is blocked before execution
  And receipt status is denied

Test 8: receipt chain detects tampering.

  Given three receipts exist
  When the second receipt is edited manually
  Then app receipt verify reports invalid chain at receipt 2

Test 9: provider fallback, if reliable provider is implemented.

  Given reliable provider = [bad_provider, mock_provider]
  And bad_provider times out
  When agent runs
  Then runtime logs fallback
  And response comes from mock_provider

Test 10: memory search.

  Given a previous conversation contains "Aardvark adapter"
  When app memory search "aardvark" runs
  Then the previous conversation ID is returned

Implementation priorities:

First produce a working vertical slice:

  1. application naming
  2. init
  3. config loading and validation
  4. mock provider
  5. CLI one-shot agent mode
  6. tools: time, file_list, file_read
  7. security policy for workspace paths
  8. memory persistence
  9. receipt writing and verification
  10. tests

Then add:

  11. interactive REPL
  12. file_write with approval
  13. shell with blocking rules
  14. HTTP GET tool
  15. OpenAI-compatible provider
  16. optional gateway
  17. optional SOP engine
  18. optional external-process plugins

Quality requirements:

  - Keep the implementation idiomatic for LANGUAGE.
  - Do not quietly substitute another implementation language.
  - Do not use Python, JavaScript, Rust, or C as the primary implementation language.
  - Shell scripts are acceptable only for setup convenience.
  - Prefer simple, boring dependencies.
  - Write tests for denied actions, not just successful actions.
  - Keep secrets out of logs.
  - Keep workspace path handling strict and well-tested.
  - Use deterministic mock fixtures so tests do not require network access.
  - Update README.md with architecture, config, security policy, commands, and test instructions.

Do not stop after creating stubs. Implement the core behavior. If a feature is not practical in LANGUAGE, document the limitation and implement the closest useful equivalent.