Last active 3 weeks ago

ClawClone.prompt Raw
1You are now implementing the real application in this workspace.
2
3The target implementation language is LANGUAGE.
4
5This workspace should already contain a verified LANGUAGE project skeleton with basic support for CLI, HTTP, SQLite, config parsing, tests, and build/test commands. Begin by inspecting the existing workspace and README before changing anything.
6
7Your first task is to name the application.
8
9The placeholder name is:
10
11 *Claw
12
13Replace the wildcard with a short, distinctive prefix suitable for this implementation.
14
15Do not use an existing project’s name or branding. Once you choose the name, use it consistently for:
16
17 - executable name
18 - README title
19 - config directory
20 - default workspace directory
21 - log names
22 - test names where appropriate
23
24If the selected name is not suitable for an executable on macOS, create a lowercase/kebab-case executable form and document the relationship. For example:
25
26 Application name: LogicClaw
27 Executable: logicclaw
28
29The goal is to build a small, local-first agent runtime. It should run as a single command-line application that can load configuration, talk to a model provider, expose a CLI agent loop, execute a small set of tools through a security gate, persist memory, and write tamper-evident tool receipts.
30
31Do not mention or depend on any external agent runtime project. Treat this as a clean-room implementation from this spec.
32
33Core workflow:
34
35 app init
36 app config validate
37 app config show
38 app provider list
39 app provider test NAME
40 app tool list
41 app tool run NAME --json ARGS
42 app agent
43 app agent -m "What files are in this project?"
44 app memory search "previous topic"
45 app receipt verify
46 app estop
47
48Use the final executable name you selected instead of `app`.
49
50The agent must:
51
52 - accept input from the CLI channel
53 - send the conversation to a configured model provider
54 - advertise available tools to the model
55 - parse tool calls from the model response
56 - validate each tool call through a security policy
57 - execute approved tools
58 - feed tool results back into the model
59 - persist the final exchange, tool calls, tool results, and receipts
60 - return a final answer to the user
61
62Required architecture:
63
64The implementation must have visible separation of responsibility for these areas:
65
66 runtime
67 agent loop, request lifecycle, orchestration
68
69 config
70 config loading, validation, defaults, path expansion
71
72 providers
73 model provider abstraction and concrete providers
74
75 channels
76 CLI channel and optional HTTP/gateway channel
77
78 tools
79 time, file_list, file_read, file_write, shell, http, memory_search
80
81 security
82 autonomy levels, command/path policy, tool-risk classification
83
84 memory
85 SQLite persistence, or JSONL only if SQLite is impractical in LANGUAGE
86
87 receipts
88 tamper-evident tool-call receipts
89
90 sop
91 optional deterministic workflow runner
92
93 service
94 optional install/start/stop/status wrappers
95
96Do not force object-oriented structure if LANGUAGE is not object-oriented. Use idiomatic LANGUAGE design, but preserve the conceptual boundaries.
97
98Configuration:
99
100The application must use a user-editable config file.
101
102Default location should be based on the final app name, for example:
103
104 ~/.logicclaw/config.toml
105
106TOML is preferred. JSON, INI, S-expression, or another idiomatic config format is acceptable if TOML support is weak in LANGUAGE. If not using TOML, document why.
107
108Minimum config shape:
109
110 workspace_dir = "~/logicclaw-workspace"
111 default_provider = "local"
112 default_model = "mock"
113
114 [security]
115 autonomy = "supervised"
116 workspace_only = true
117 forbidden_paths = ["/etc", "/sys", "/boot", "~/.ssh"]
118 forbidden_commands = ["rm", "shutdown", "reboot", "mkfs", "dd"]
119 audit_log = true
120
121 [providers.models.local]
122 kind = "mock"
123 model = "mock"
124
125 [providers.models.openai_compatible]
126 kind = "openai-compatible"
127 base_url = "http://localhost:1234/v1"
128 model = "local-model"
129 api_key_env = "OPENAI_API_KEY"
130
131 [channels.cli]
132 enabled = true
133 tools_allow = ["file_read", "file_list", "time", "memory_search", "shell"]
134
135 [memory]
136 backend = "sqlite"
137 path = "~/.logicclaw/memory.sqlite"
138
139 [receipts]
140 enabled = true
141 path = "~/.logicclaw/tool_receipts.log"
142
143Adjust paths to match the application name you chose.
144
145Config requirements:
146
147 - load defaults when keys are absent
148 - expand ~ and environment variables
149 - validate enum values
150 - validate that workspace exists or create it during init
151 - do not require API keys for mock mode
152 - support provider credentials by environment variable
153 - never print secret values in logs or config dumps
154 - config validate must report all detected errors in one pass when practical
155
156Provider abstraction:
157
158Create an idiomatic equivalent of:
159
160 Provider
161 name() -> string
162 capabilities() -> ProviderCapabilities
163 chat(request: ChatRequest) -> ChatResponse
164
165ChatRequest must contain:
166
167 - system_prompt
168 - messages
169 - tools
170 - model
171 - optional temperature
172 - optional metadata
173
174ChatResponse must contain:
175
176 - final_text
177 - tool_calls
178 - optional raw_provider_payload
179 - optional usage
180
181Required providers:
182
183 mock
184 Deterministic provider used for tests. It must be able to return ordinary text and tool calls from scripted fixtures.
185
186 openai-compatible
187 Sends requests to an OpenAI-compatible /chat/completions endpoint. Full support for every provider is not required. Implement non-streaming chat completion. Tool/function call support is required if reasonably practical in LANGUAGE; otherwise document the limitation clearly.
188
189Optional providers:
190
191 reliable
192 Wrapper provider that tries provider names in order and falls back on network/auth/timeout errors.
193
194 router
195 Wrapper provider that chooses a provider from request metadata hints.
196
197Channel abstraction:
198
199Create an idiomatic equivalent of:
200
201 Channel
202 name() -> string
203 start(runtime_handle)
204 send(conversation_id, message)
205 supports_draft_updates() -> bool
206
207Required channel:
208
209 cli
210
211CLI behavior:
212
213 app agent
214 starts a REPL
215
216 app agent -m "message"
217 runs one turn and exits
218
219REPL commands:
220
221 /exit
222 exits
223
224 /tools
225 lists active tools
226
227 /memory <query>
228 searches memory
229
230 /policy
231 prints current autonomy and workspace boundary
232
233Optional gateway channel:
234
235 localhost HTTP server
236
237Minimum optional gateway endpoints:
238
239 GET /health
240 GET /status
241 GET /tools
242 POST /chat
243 GET /memory/search?q=...
244 GET /receipts
245 POST /estop
246
247Tool abstraction:
248
249Create an idiomatic equivalent of:
250
251 Tool
252 name() -> string
253 description() -> string
254 parameters_schema() -> JSON Schema object or equivalent metadata
255 risk(args, context) -> low | medium | high
256 invoke(args, context) -> ToolResult
257
258ToolResult must contain:
259
260 - success: bool
261 - output: string
262 - optional error: string
263 - optional metadata
264 - optional receipt_id
265
266Required built-in tools:
267
268 time
269 Returns current local time, UTC time, and timezone if available.
270
271 file_list
272 Lists files under a path inside workspace.
273
274 file_read
275 Reads a UTF-8 text file inside workspace.
276
277 file_write
278 Writes a UTF-8 text file inside workspace.
279
280 shell
281 Executes a shell command inside workspace, subject to security policy.
282
283 http
284 Performs HTTP GET. POST is optional.
285
286 memory_search
287 Searches persisted conversations.
288
289Optional tools:
290
291 web_search
292 May be stubbed unless a search API key is configured.
293
294 pdf_extract
295 Optional.
296
297 ask_user
298 In CLI mode, asks the user a question and returns the answer.
299
300Security model:
301
302Implement three autonomy levels:
303
304 readonly
305 Low-risk read-only tools allowed.
306 No file_write.
307 No shell execution except optionally harmless commands such as pwd.
308
309 supervised
310 Low-risk tools run automatically.
311 Medium-risk tools require operator approval.
312 High-risk tools are blocked.
313
314 full
315 Low and medium run automatically.
316 High-risk is still blocked if explicitly forbidden by path or command policy.
317
318Default must be:
319
320 supervised
321
322Risk rules:
323
324 time, memory_search, file_list, file_read inside workspace:
325 low
326
327 http GET to allowed domains:
328 low
329
330 file_write inside workspace:
331 medium
332
333 shell command from allowlist:
334 medium
335
336 shell command not on allowlist:
337 high
338
339 any path outside workspace when workspace_only = true:
340 blocked
341
342 any path under forbidden_paths:
343 blocked
344
345 any command whose basename appears in forbidden_commands:
346 blocked
347
348Any shell command containing obvious destructive patterns must be blocked. Minimum patterns:
349
350 rm -rf /
351 rm -rf *
352 mkfs
353 dd if=
354 :(){ :|:& };:
355 shutdown
356 reboot
357 chmod -R 777 /
358 chown -R
359 curl ... | sh
360 wget ... | sh
361
362Approval flow in CLI mode:
363
364When a medium-risk action requires approval, print something like:
365
366 Tool request:
367 tool: file_write
368 risk: medium
369 reason: writes to workspace
370 args: ...
371 Approve? [y/N]
372
373Default is deny.
374
375Tool receipts:
376
377Every attempted tool invocation must produce a receipt whether it is allowed, denied, failed, or approved.
378
379Receipt fields:
380
381 {
382 "id": "receipt-...",
383 "timestamp": "2026-05-12T14:00:00Z",
384 "conversation_id": "...",
385 "tool": "file_read",
386 "args_hash": "...",
387 "result_hash": "...",
388 "status": "allowed|denied|failed",
389 "risk": "low|medium|high",
390 "previous_hash": "...",
391 "receipt_hash": "..."
392 }
393
394Receipt hash:
395
396 receipt_hash = SHA256(canonical_json(receipt_without_receipt_hash))
397
398Tamper-evident chain:
399
400 - each receipt includes the previous receipt’s hash
401 - receipt verify must replay the log
402 - it must report the first broken link
403
404Optional stronger version:
405
406 HMAC-SHA256 with a locally stored secret key
407
408Memory:
409
410Use SQLite if practical in LANGUAGE. Use JSONL only if SQLite support is impractical or broken.
411
412Persist:
413
414 - conversation_id
415 - turn_id
416 - timestamp
417 - role
418 - content
419 - tool_calls
420 - tool_results
421 - provider
422 - model
423 - metadata
424
425Required commands:
426
427 app memory search QUERY
428 app memory show CONVERSATION_ID
429 app memory list
430 app memory clear --yes
431
432Search may be simple substring search.
433
434Optional scoring:
435
436 - tokenize query and content
437 - rank by term frequency
438 - boost recent conversations
439
440Agent loop:
441
442Implement this loop:
443
444 1. Receive user message from channel.
445 2. Create or resume conversation.
446 3. Load recent memory context.
447 4. Build system prompt.
448 5. Build tool schemas from active tools.
449 6. Call provider.
450 7. If provider returns text only, persist and reply.
451 8. If provider returns tool calls:
452 a. For each tool call, classify risk.
453 b. Validate policy.
454 c. Ask approval when required.
455 d. Invoke or deny.
456 e. Write receipt.
457 f. Persist tool call and result.
458 9. Send tool results back to provider.
459 10. Repeat until final text or max_tool_rounds is reached.
460 11. Persist final assistant response.
461 12. Reply to channel.
462
463Guardrails:
464
465 max_tool_rounds default:
466 5
467
468 max_response_bytes default:
469 1 MB
470
471 tool execution timeout default:
472 30 seconds
473
474 shell timeout default:
475 15 seconds
476
477 HTTP timeout default:
478 20 seconds
479
480The runtime must not recursively invoke tools forever.
481
482Required CLI command surface:
483
484 app init
485 app onboard
486 app config validate
487 app config show
488 app provider list
489 app provider test NAME
490 app tool list
491 app tool run NAME --json ARGS
492 app agent
493 app agent -m MESSAGE
494 app memory list
495 app memory search QUERY
496 app memory show CONVERSATION_ID
497 app receipt list
498 app receipt verify
499 app estop
500
501Optional commands:
502
503 app service install
504 app service start
505 app service stop
506 app service status
507 app sop list
508 app sop validate
509 app sop run NAME
510 app plugin list
511 app plugin install PATH
512
513SOP engine, optional but valuable:
514
515Implement deterministic workflows loaded from:
516
517 ~/.appname/workspace/sops/<name>/SOP.toml
518
519Minimum SOP format:
520
521 name = "daily-check"
522 description = "Run a daily workspace check"
523
524 [[steps]]
525 id = "list"
526 kind = "tool"
527 tool = "file_list"
528 args = { path = "." }
529
530 [[steps]]
531 id = "summarize"
532 kind = "agent"
533 prompt = "Summarize the file list from the previous step."
534
535 [[steps]]
536 id = "approval"
537 kind = "approval"
538 prompt = "Continue to write report?"
539
540 [[steps]]
541 id = "write"
542 kind = "tool"
543 tool = "file_write"
544 args = { path = "daily-check.txt", content_from = "summarize" }
545
546Requirements:
547
548 - validate step IDs are unique
549 - validate referenced tools exist
550 - persist SOP run state
551 - stop at approval steps until approved
552 - support on_failure = "abort"
553 - support on_failure = "continue"
554
555Plugin system, stretch goal:
556
557A plugin is a directory:
558
559 plugin-name/
560 manifest.toml
561 executable-or-script
562
563Minimum manifest:
564
565 name = "echo-plugin"
566 version = "0.1.0"
567 capabilities = ["tool"]
568
569 [[tools]]
570 name = "echo"
571 description = "Echoes input"
572 command = "./echo-plugin"
573 schema = { type = "object" }
574
575The runtime discovers plugins under:
576
577 ~/.appname/plugins/
578
579Simpler acceptable version:
580
581Support external process tools where the runtime invokes a configured executable with JSON on stdin and reads JSON from stdout.
582
583Observability:
584
585Minimum logging:
586
587 - human-readable logs to stderr
588 - structured JSON logs when APPNAME_LOG=json, adjusted to the executable name
589 - never log secrets
590
591Log events:
592
593 - startup
594 - config path
595 - workspace path
596 - provider selected
597 - channel started
598 - conversation started
599 - tool requested
600 - tool approved
601 - tool denied
602 - tool completed
603 - tool failed
604 - receipt written
605 - memory persisted
606 - estop triggered
607
608Optional metrics endpoint:
609
610 GET /metrics
611
612Expose counters if the endpoint is implemented:
613
614 app_conversations_total
615 app_tool_calls_total
616 app_tool_denials_total
617 app_provider_errors_total
618 app_receipt_chain_valid
619
620Emergency stop:
621
622 app estop
623
624Creates:
625
626 ~/.appname/ESTOP
627
628When this file exists:
629
630 - no new tool calls may run
631 - existing long-running shell/http tasks should be cancelled if possible
632 - the agent may still answer text-only messages explaining that tool use is stopped
633
634 app estop --clear
635
636Removes the file.
637
638Acceptance tests:
639
640Test 1: init creates expected files.
641
642 Given no ~/.appname directory
643 When app init runs
644 Then ~/.appname/config file exists
645 And memory database or memory JSONL exists
646 And workspace_dir exists
647
648Test 2: config validation catches invalid autonomy.
649
650 Given autonomy = "godmode"
651 When app config validate runs
652 Then exit code is nonzero
653 And output mentions allowed values
654
655Test 3: mock provider text-only response.
656
657 Given mock provider fixture returns "hello"
658 When app agent -m "hi" runs
659 Then stdout contains "hello"
660 And memory contains the user and assistant turn
661
662Test 4: model-triggered file_list tool.
663
664 Given mock provider fixture emits tool_call file_list { path = "." }
665 When app agent -m "list files" runs
666 Then file_list executes inside workspace
667 And a tool receipt is written
668 And final answer includes the file list summary
669
670Test 5: workspace escape blocked.
671
672 Given workspace_only = true
673 When model requests file_read { path = "/etc/passwd" }
674 Then tool is denied
675 And a denied receipt is written
676 And the provider receives a tool error
677
678Test 6: supervised approval.
679
680 Given autonomy = "supervised"
681 When model requests file_write
682 Then CLI asks for approval
683 And default empty answer denies
684 And "y" approves
685
686Test 7: forbidden command blocked.
687
688 When model requests shell { command = "rm -rf /" }
689 Then tool is blocked before execution
690 And receipt status is denied
691
692Test 8: receipt chain detects tampering.
693
694 Given three receipts exist
695 When the second receipt is edited manually
696 Then app receipt verify reports invalid chain at receipt 2
697
698Test 9: provider fallback, if reliable provider is implemented.
699
700 Given reliable provider = [bad_provider, mock_provider]
701 And bad_provider times out
702 When agent runs
703 Then runtime logs fallback
704 And response comes from mock_provider
705
706Test 10: memory search.
707
708 Given a previous conversation contains "Aardvark adapter"
709 When app memory search "aardvark" runs
710 Then the previous conversation ID is returned
711
712Implementation priorities:
713
714First produce a working vertical slice:
715
716 1. application naming
717 2. init
718 3. config loading and validation
719 4. mock provider
720 5. CLI one-shot agent mode
721 6. tools: time, file_list, file_read
722 7. security policy for workspace paths
723 8. memory persistence
724 9. receipt writing and verification
725 10. tests
726
727Then add:
728
729 11. interactive REPL
730 12. file_write with approval
731 13. shell with blocking rules
732 14. HTTP GET tool
733 15. OpenAI-compatible provider
734 16. optional gateway
735 17. optional SOP engine
736 18. optional external-process plugins
737
738Quality requirements:
739
740 - Keep the implementation idiomatic for LANGUAGE.
741 - Do not quietly substitute another implementation language.
742 - Do not use Python, JavaScript, Rust, or C as the primary implementation language.
743 - Shell scripts are acceptable only for setup convenience.
744 - Prefer simple, boring dependencies.
745 - Write tests for denied actions, not just successful actions.
746 - Keep secrets out of logs.
747 - Keep workspace path handling strict and well-tested.
748 - Use deterministic mock fixtures so tests do not require network access.
749 - Update README.md with architecture, config, security policy, commands, and test instructions.
750
751Do not stop after creating stubs. Implement the core behavior. If a feature is not practical in LANGUAGE, document the limitation and implement the closest useful equivalent.