You are now implementing the real application in this workspace. The target implementation language is LANGUAGE. This workspace should already contain a verified LANGUAGE project skeleton with basic support for CLI, HTTP, SQLite, config parsing, tests, and build/test commands. Begin by inspecting the existing workspace and README before changing anything. Your first task is to name the application. The placeholder name is: *Claw Replace the wildcard with a short, distinctive prefix suitable for this implementation. Do not use an existing project’s name or branding. Once you choose the name, use it consistently for: - executable name - README title - config directory - default workspace directory - log names - test names where appropriate If the selected name is not suitable for an executable on macOS, create a lowercase/kebab-case executable form and document the relationship. For example: Application name: LogicClaw Executable: logicclaw The goal is to build a small, local-first agent runtime. It should run as a single command-line application that can load configuration, talk to a model provider, expose a CLI agent loop, execute a small set of tools through a security gate, persist memory, and write tamper-evident tool receipts. Do not mention or depend on any external agent runtime project. Treat this as a clean-room implementation from this spec. Core workflow: app init app config validate app config show app provider list app provider test NAME app tool list app tool run NAME --json ARGS app agent app agent -m "What files are in this project?" app memory search "previous topic" app receipt verify app estop Use the final executable name you selected instead of `app`. The agent must: - accept input from the CLI channel - send the conversation to a configured model provider - advertise available tools to the model - parse tool calls from the model response - validate each tool call through a security policy - execute approved tools - feed tool results back into the model - persist the final exchange, tool calls, tool results, and receipts - return a final answer to the user Required architecture: The implementation must have visible separation of responsibility for these areas: runtime agent loop, request lifecycle, orchestration config config loading, validation, defaults, path expansion providers model provider abstraction and concrete providers channels CLI channel and optional HTTP/gateway channel tools time, file_list, file_read, file_write, shell, http, memory_search security autonomy levels, command/path policy, tool-risk classification memory SQLite persistence, or JSONL only if SQLite is impractical in LANGUAGE receipts tamper-evident tool-call receipts sop optional deterministic workflow runner service optional install/start/stop/status wrappers Do not force object-oriented structure if LANGUAGE is not object-oriented. Use idiomatic LANGUAGE design, but preserve the conceptual boundaries. Configuration: The application must use a user-editable config file. Default location should be based on the final app name, for example: ~/.logicclaw/config.toml TOML is preferred. JSON, INI, S-expression, or another idiomatic config format is acceptable if TOML support is weak in LANGUAGE. If not using TOML, document why. Minimum config shape: workspace_dir = "~/logicclaw-workspace" default_provider = "local" default_model = "mock" [security] autonomy = "supervised" workspace_only = true forbidden_paths = ["/etc", "/sys", "/boot", "~/.ssh"] forbidden_commands = ["rm", "shutdown", "reboot", "mkfs", "dd"] audit_log = true [providers.models.local] kind = "mock" model = "mock" [providers.models.openai_compatible] kind = "openai-compatible" base_url = "http://localhost:1234/v1" model = "local-model" api_key_env = "OPENAI_API_KEY" [channels.cli] enabled = true tools_allow = ["file_read", "file_list", "time", "memory_search", "shell"] [memory] backend = "sqlite" path = "~/.logicclaw/memory.sqlite" [receipts] enabled = true path = "~/.logicclaw/tool_receipts.log" Adjust paths to match the application name you chose. Config requirements: - load defaults when keys are absent - expand ~ and environment variables - validate enum values - validate that workspace exists or create it during init - do not require API keys for mock mode - support provider credentials by environment variable - never print secret values in logs or config dumps - config validate must report all detected errors in one pass when practical Provider abstraction: Create an idiomatic equivalent of: Provider name() -> string capabilities() -> ProviderCapabilities chat(request: ChatRequest) -> ChatResponse ChatRequest must contain: - system_prompt - messages - tools - model - optional temperature - optional metadata ChatResponse must contain: - final_text - tool_calls - optional raw_provider_payload - optional usage Required providers: mock Deterministic provider used for tests. It must be able to return ordinary text and tool calls from scripted fixtures. openai-compatible Sends requests to an OpenAI-compatible /chat/completions endpoint. Full support for every provider is not required. Implement non-streaming chat completion. Tool/function call support is required if reasonably practical in LANGUAGE; otherwise document the limitation clearly. Optional providers: reliable Wrapper provider that tries provider names in order and falls back on network/auth/timeout errors. router Wrapper provider that chooses a provider from request metadata hints. Channel abstraction: Create an idiomatic equivalent of: Channel name() -> string start(runtime_handle) send(conversation_id, message) supports_draft_updates() -> bool Required channel: cli CLI behavior: app agent starts a REPL app agent -m "message" runs one turn and exits REPL commands: /exit exits /tools lists active tools /memory searches memory /policy prints current autonomy and workspace boundary Optional gateway channel: localhost HTTP server Minimum optional gateway endpoints: GET /health GET /status GET /tools POST /chat GET /memory/search?q=... GET /receipts POST /estop Tool abstraction: Create an idiomatic equivalent of: Tool name() -> string description() -> string parameters_schema() -> JSON Schema object or equivalent metadata risk(args, context) -> low | medium | high invoke(args, context) -> ToolResult ToolResult must contain: - success: bool - output: string - optional error: string - optional metadata - optional receipt_id Required built-in tools: time Returns current local time, UTC time, and timezone if available. file_list Lists files under a path inside workspace. file_read Reads a UTF-8 text file inside workspace. file_write Writes a UTF-8 text file inside workspace. shell Executes a shell command inside workspace, subject to security policy. http Performs HTTP GET. POST is optional. memory_search Searches persisted conversations. Optional tools: web_search May be stubbed unless a search API key is configured. pdf_extract Optional. ask_user In CLI mode, asks the user a question and returns the answer. Security model: Implement three autonomy levels: readonly Low-risk read-only tools allowed. No file_write. No shell execution except optionally harmless commands such as pwd. supervised Low-risk tools run automatically. Medium-risk tools require operator approval. High-risk tools are blocked. full Low and medium run automatically. High-risk is still blocked if explicitly forbidden by path or command policy. Default must be: supervised Risk rules: time, memory_search, file_list, file_read inside workspace: low http GET to allowed domains: low file_write inside workspace: medium shell command from allowlist: medium shell command not on allowlist: high any path outside workspace when workspace_only = true: blocked any path under forbidden_paths: blocked any command whose basename appears in forbidden_commands: blocked Any shell command containing obvious destructive patterns must be blocked. Minimum patterns: rm -rf / rm -rf * mkfs dd if= :(){ :|:& };: shutdown reboot chmod -R 777 / chown -R curl ... | sh wget ... | sh Approval flow in CLI mode: When a medium-risk action requires approval, print something like: Tool request: tool: file_write risk: medium reason: writes to workspace args: ... Approve? [y/N] Default is deny. Tool receipts: Every attempted tool invocation must produce a receipt whether it is allowed, denied, failed, or approved. Receipt fields: { "id": "receipt-...", "timestamp": "2026-05-12T14:00:00Z", "conversation_id": "...", "tool": "file_read", "args_hash": "...", "result_hash": "...", "status": "allowed|denied|failed", "risk": "low|medium|high", "previous_hash": "...", "receipt_hash": "..." } Receipt hash: receipt_hash = SHA256(canonical_json(receipt_without_receipt_hash)) Tamper-evident chain: - each receipt includes the previous receipt’s hash - receipt verify must replay the log - it must report the first broken link Optional stronger version: HMAC-SHA256 with a locally stored secret key Memory: Use SQLite if practical in LANGUAGE. Use JSONL only if SQLite support is impractical or broken. Persist: - conversation_id - turn_id - timestamp - role - content - tool_calls - tool_results - provider - model - metadata Required commands: app memory search QUERY app memory show CONVERSATION_ID app memory list app memory clear --yes Search may be simple substring search. Optional scoring: - tokenize query and content - rank by term frequency - boost recent conversations Agent loop: Implement this loop: 1. Receive user message from channel. 2. Create or resume conversation. 3. Load recent memory context. 4. Build system prompt. 5. Build tool schemas from active tools. 6. Call provider. 7. If provider returns text only, persist and reply. 8. If provider returns tool calls: a. For each tool call, classify risk. b. Validate policy. c. Ask approval when required. d. Invoke or deny. e. Write receipt. f. Persist tool call and result. 9. Send tool results back to provider. 10. Repeat until final text or max_tool_rounds is reached. 11. Persist final assistant response. 12. Reply to channel. Guardrails: max_tool_rounds default: 5 max_response_bytes default: 1 MB tool execution timeout default: 30 seconds shell timeout default: 15 seconds HTTP timeout default: 20 seconds The runtime must not recursively invoke tools forever. Required CLI command surface: app init app onboard app config validate app config show app provider list app provider test NAME app tool list app tool run NAME --json ARGS app agent app agent -m MESSAGE app memory list app memory search QUERY app memory show CONVERSATION_ID app receipt list app receipt verify app estop Optional commands: app service install app service start app service stop app service status app sop list app sop validate app sop run NAME app plugin list app plugin install PATH SOP engine, optional but valuable: Implement deterministic workflows loaded from: ~/.appname/workspace/sops//SOP.toml Minimum SOP format: name = "daily-check" description = "Run a daily workspace check" [[steps]] id = "list" kind = "tool" tool = "file_list" args = { path = "." } [[steps]] id = "summarize" kind = "agent" prompt = "Summarize the file list from the previous step." [[steps]] id = "approval" kind = "approval" prompt = "Continue to write report?" [[steps]] id = "write" kind = "tool" tool = "file_write" args = { path = "daily-check.txt", content_from = "summarize" } Requirements: - validate step IDs are unique - validate referenced tools exist - persist SOP run state - stop at approval steps until approved - support on_failure = "abort" - support on_failure = "continue" Plugin system, stretch goal: A plugin is a directory: plugin-name/ manifest.toml executable-or-script Minimum manifest: name = "echo-plugin" version = "0.1.0" capabilities = ["tool"] [[tools]] name = "echo" description = "Echoes input" command = "./echo-plugin" schema = { type = "object" } The runtime discovers plugins under: ~/.appname/plugins/ Simpler acceptable version: Support external process tools where the runtime invokes a configured executable with JSON on stdin and reads JSON from stdout. Observability: Minimum logging: - human-readable logs to stderr - structured JSON logs when APPNAME_LOG=json, adjusted to the executable name - never log secrets Log events: - startup - config path - workspace path - provider selected - channel started - conversation started - tool requested - tool approved - tool denied - tool completed - tool failed - receipt written - memory persisted - estop triggered Optional metrics endpoint: GET /metrics Expose counters if the endpoint is implemented: app_conversations_total app_tool_calls_total app_tool_denials_total app_provider_errors_total app_receipt_chain_valid Emergency stop: app estop Creates: ~/.appname/ESTOP When this file exists: - no new tool calls may run - existing long-running shell/http tasks should be cancelled if possible - the agent may still answer text-only messages explaining that tool use is stopped app estop --clear Removes the file. Acceptance tests: Test 1: init creates expected files. Given no ~/.appname directory When app init runs Then ~/.appname/config file exists And memory database or memory JSONL exists And workspace_dir exists Test 2: config validation catches invalid autonomy. Given autonomy = "godmode" When app config validate runs Then exit code is nonzero And output mentions allowed values Test 3: mock provider text-only response. Given mock provider fixture returns "hello" When app agent -m "hi" runs Then stdout contains "hello" And memory contains the user and assistant turn Test 4: model-triggered file_list tool. Given mock provider fixture emits tool_call file_list { path = "." } When app agent -m "list files" runs Then file_list executes inside workspace And a tool receipt is written And final answer includes the file list summary Test 5: workspace escape blocked. Given workspace_only = true When model requests file_read { path = "/etc/passwd" } Then tool is denied And a denied receipt is written And the provider receives a tool error Test 6: supervised approval. Given autonomy = "supervised" When model requests file_write Then CLI asks for approval And default empty answer denies And "y" approves Test 7: forbidden command blocked. When model requests shell { command = "rm -rf /" } Then tool is blocked before execution And receipt status is denied Test 8: receipt chain detects tampering. Given three receipts exist When the second receipt is edited manually Then app receipt verify reports invalid chain at receipt 2 Test 9: provider fallback, if reliable provider is implemented. Given reliable provider = [bad_provider, mock_provider] And bad_provider times out When agent runs Then runtime logs fallback And response comes from mock_provider Test 10: memory search. Given a previous conversation contains "Aardvark adapter" When app memory search "aardvark" runs Then the previous conversation ID is returned Implementation priorities: First produce a working vertical slice: 1. application naming 2. init 3. config loading and validation 4. mock provider 5. CLI one-shot agent mode 6. tools: time, file_list, file_read 7. security policy for workspace paths 8. memory persistence 9. receipt writing and verification 10. tests Then add: 11. interactive REPL 12. file_write with approval 13. shell with blocking rules 14. HTTP GET tool 15. OpenAI-compatible provider 16. optional gateway 17. optional SOP engine 18. optional external-process plugins Quality requirements: - Keep the implementation idiomatic for LANGUAGE. - Do not quietly substitute another implementation language. - Do not use Python, JavaScript, Rust, or C as the primary implementation language. - Shell scripts are acceptable only for setup convenience. - Prefer simple, boring dependencies. - Write tests for denied actions, not just successful actions. - Keep secrets out of logs. - Keep workspace path handling strict and well-tested. - Use deterministic mock fixtures so tests do not require network access. - Update README.md with architecture, config, security policy, commands, and test instructions. Do not stop after creating stubs. Implement the core behavior. If a feature is not practical in LANGUAGE, document the limitation and implement the closest useful equivalent.