Last active 3 weeks ago

cadwaladyr revised this gist 3 weeks ago. Go to revision

1 file changed, 1 insertion, 7 deletions

ClawClone.prompt

@@ -10,13 +10,7 @@ The placeholder name is:
10 10
11 11 *Claw
12 12
13 - Replace the wildcard with a short, distinctive prefix suitable for this LANGUAGE implementation. Examples of the naming style:
14 -
15 - LispClaw
16 - BeamClaw
17 - LogicClaw
18 - CrystalClaw
19 - CobolClaw
13 + Replace the wildcard with a short, distinctive prefix suitable for this implementation.
20 14
21 15 Do not use an existing project’s name or branding. Once you choose the name, use it consistently for:
22 16

cadwaladyr revised this gist 3 weeks ago. Go to revision

1 file changed, 757 insertions

ClawClone.prompt(file created)

@@ -0,0 +1,757 @@
1 + You are now implementing the real application in this workspace.
2 +
3 + The target implementation language is LANGUAGE.
4 +
5 + This workspace should already contain a verified LANGUAGE project skeleton with basic support for CLI, HTTP, SQLite, config parsing, tests, and build/test commands. Begin by inspecting the existing workspace and README before changing anything.
6 +
7 + Your first task is to name the application.
8 +
9 + The placeholder name is:
10 +
11 + *Claw
12 +
13 + Replace the wildcard with a short, distinctive prefix suitable for this LANGUAGE implementation. Examples of the naming style:
14 +
15 + LispClaw
16 + BeamClaw
17 + LogicClaw
18 + CrystalClaw
19 + CobolClaw
20 +
21 + Do not use an existing project’s name or branding. Once you choose the name, use it consistently for:
22 +
23 + - executable name
24 + - README title
25 + - config directory
26 + - default workspace directory
27 + - log names
28 + - test names where appropriate
29 +
30 + If the selected name is not suitable for an executable on macOS, create a lowercase/kebab-case executable form and document the relationship. For example:
31 +
32 + Application name: LogicClaw
33 + Executable: logicclaw
34 +
35 + The goal is to build a small, local-first agent runtime. It should run as a single command-line application that can load configuration, talk to a model provider, expose a CLI agent loop, execute a small set of tools through a security gate, persist memory, and write tamper-evident tool receipts.
36 +
37 + Do not mention or depend on any external agent runtime project. Treat this as a clean-room implementation from this spec.
38 +
39 + Core workflow:
40 +
41 + app init
42 + app config validate
43 + app config show
44 + app provider list
45 + app provider test NAME
46 + app tool list
47 + app tool run NAME --json ARGS
48 + app agent
49 + app agent -m "What files are in this project?"
50 + app memory search "previous topic"
51 + app receipt verify
52 + app estop
53 +
54 + Use the final executable name you selected instead of `app`.
55 +
56 + The agent must:
57 +
58 + - accept input from the CLI channel
59 + - send the conversation to a configured model provider
60 + - advertise available tools to the model
61 + - parse tool calls from the model response
62 + - validate each tool call through a security policy
63 + - execute approved tools
64 + - feed tool results back into the model
65 + - persist the final exchange, tool calls, tool results, and receipts
66 + - return a final answer to the user
67 +
68 + Required architecture:
69 +
70 + The implementation must have visible separation of responsibility for these areas:
71 +
72 + runtime
73 + agent loop, request lifecycle, orchestration
74 +
75 + config
76 + config loading, validation, defaults, path expansion
77 +
78 + providers
79 + model provider abstraction and concrete providers
80 +
81 + channels
82 + CLI channel and optional HTTP/gateway channel
83 +
84 + tools
85 + time, file_list, file_read, file_write, shell, http, memory_search
86 +
87 + security
88 + autonomy levels, command/path policy, tool-risk classification
89 +
90 + memory
91 + SQLite persistence, or JSONL only if SQLite is impractical in LANGUAGE
92 +
93 + receipts
94 + tamper-evident tool-call receipts
95 +
96 + sop
97 + optional deterministic workflow runner
98 +
99 + service
100 + optional install/start/stop/status wrappers
101 +
102 + Do not force object-oriented structure if LANGUAGE is not object-oriented. Use idiomatic LANGUAGE design, but preserve the conceptual boundaries.
103 +
104 + Configuration:
105 +
106 + The application must use a user-editable config file.
107 +
108 + Default location should be based on the final app name, for example:
109 +
110 + ~/.logicclaw/config.toml
111 +
112 + TOML is preferred. JSON, INI, S-expression, or another idiomatic config format is acceptable if TOML support is weak in LANGUAGE. If not using TOML, document why.
113 +
114 + Minimum config shape:
115 +
116 + workspace_dir = "~/logicclaw-workspace"
117 + default_provider = "local"
118 + default_model = "mock"
119 +
120 + [security]
121 + autonomy = "supervised"
122 + workspace_only = true
123 + forbidden_paths = ["/etc", "/sys", "/boot", "~/.ssh"]
124 + forbidden_commands = ["rm", "shutdown", "reboot", "mkfs", "dd"]
125 + audit_log = true
126 +
127 + [providers.models.local]
128 + kind = "mock"
129 + model = "mock"
130 +
131 + [providers.models.openai_compatible]
132 + kind = "openai-compatible"
133 + base_url = "http://localhost:1234/v1"
134 + model = "local-model"
135 + api_key_env = "OPENAI_API_KEY"
136 +
137 + [channels.cli]
138 + enabled = true
139 + tools_allow = ["file_read", "file_list", "time", "memory_search", "shell"]
140 +
141 + [memory]
142 + backend = "sqlite"
143 + path = "~/.logicclaw/memory.sqlite"
144 +
145 + [receipts]
146 + enabled = true
147 + path = "~/.logicclaw/tool_receipts.log"
148 +
149 + Adjust paths to match the application name you chose.
150 +
151 + Config requirements:
152 +
153 + - load defaults when keys are absent
154 + - expand ~ and environment variables
155 + - validate enum values
156 + - validate that workspace exists or create it during init
157 + - do not require API keys for mock mode
158 + - support provider credentials by environment variable
159 + - never print secret values in logs or config dumps
160 + - config validate must report all detected errors in one pass when practical
161 +
162 + Provider abstraction:
163 +
164 + Create an idiomatic equivalent of:
165 +
166 + Provider
167 + name() -> string
168 + capabilities() -> ProviderCapabilities
169 + chat(request: ChatRequest) -> ChatResponse
170 +
171 + ChatRequest must contain:
172 +
173 + - system_prompt
174 + - messages
175 + - tools
176 + - model
177 + - optional temperature
178 + - optional metadata
179 +
180 + ChatResponse must contain:
181 +
182 + - final_text
183 + - tool_calls
184 + - optional raw_provider_payload
185 + - optional usage
186 +
187 + Required providers:
188 +
189 + mock
190 + Deterministic provider used for tests. It must be able to return ordinary text and tool calls from scripted fixtures.
191 +
192 + openai-compatible
193 + Sends requests to an OpenAI-compatible /chat/completions endpoint. Full support for every provider is not required. Implement non-streaming chat completion. Tool/function call support is required if reasonably practical in LANGUAGE; otherwise document the limitation clearly.
194 +
195 + Optional providers:
196 +
197 + reliable
198 + Wrapper provider that tries provider names in order and falls back on network/auth/timeout errors.
199 +
200 + router
201 + Wrapper provider that chooses a provider from request metadata hints.
202 +
203 + Channel abstraction:
204 +
205 + Create an idiomatic equivalent of:
206 +
207 + Channel
208 + name() -> string
209 + start(runtime_handle)
210 + send(conversation_id, message)
211 + supports_draft_updates() -> bool
212 +
213 + Required channel:
214 +
215 + cli
216 +
217 + CLI behavior:
218 +
219 + app agent
220 + starts a REPL
221 +
222 + app agent -m "message"
223 + runs one turn and exits
224 +
225 + REPL commands:
226 +
227 + /exit
228 + exits
229 +
230 + /tools
231 + lists active tools
232 +
233 + /memory <query>
234 + searches memory
235 +
236 + /policy
237 + prints current autonomy and workspace boundary
238 +
239 + Optional gateway channel:
240 +
241 + localhost HTTP server
242 +
243 + Minimum optional gateway endpoints:
244 +
245 + GET /health
246 + GET /status
247 + GET /tools
248 + POST /chat
249 + GET /memory/search?q=...
250 + GET /receipts
251 + POST /estop
252 +
253 + Tool abstraction:
254 +
255 + Create an idiomatic equivalent of:
256 +
257 + Tool
258 + name() -> string
259 + description() -> string
260 + parameters_schema() -> JSON Schema object or equivalent metadata
261 + risk(args, context) -> low | medium | high
262 + invoke(args, context) -> ToolResult
263 +
264 + ToolResult must contain:
265 +
266 + - success: bool
267 + - output: string
268 + - optional error: string
269 + - optional metadata
270 + - optional receipt_id
271 +
272 + Required built-in tools:
273 +
274 + time
275 + Returns current local time, UTC time, and timezone if available.
276 +
277 + file_list
278 + Lists files under a path inside workspace.
279 +
280 + file_read
281 + Reads a UTF-8 text file inside workspace.
282 +
283 + file_write
284 + Writes a UTF-8 text file inside workspace.
285 +
286 + shell
287 + Executes a shell command inside workspace, subject to security policy.
288 +
289 + http
290 + Performs HTTP GET. POST is optional.
291 +
292 + memory_search
293 + Searches persisted conversations.
294 +
295 + Optional tools:
296 +
297 + web_search
298 + May be stubbed unless a search API key is configured.
299 +
300 + pdf_extract
301 + Optional.
302 +
303 + ask_user
304 + In CLI mode, asks the user a question and returns the answer.
305 +
306 + Security model:
307 +
308 + Implement three autonomy levels:
309 +
310 + readonly
311 + Low-risk read-only tools allowed.
312 + No file_write.
313 + No shell execution except optionally harmless commands such as pwd.
314 +
315 + supervised
316 + Low-risk tools run automatically.
317 + Medium-risk tools require operator approval.
318 + High-risk tools are blocked.
319 +
320 + full
321 + Low and medium run automatically.
322 + High-risk is still blocked if explicitly forbidden by path or command policy.
323 +
324 + Default must be:
325 +
326 + supervised
327 +
328 + Risk rules:
329 +
330 + time, memory_search, file_list, file_read inside workspace:
331 + low
332 +
333 + http GET to allowed domains:
334 + low
335 +
336 + file_write inside workspace:
337 + medium
338 +
339 + shell command from allowlist:
340 + medium
341 +
342 + shell command not on allowlist:
343 + high
344 +
345 + any path outside workspace when workspace_only = true:
346 + blocked
347 +
348 + any path under forbidden_paths:
349 + blocked
350 +
351 + any command whose basename appears in forbidden_commands:
352 + blocked
353 +
354 + Any shell command containing obvious destructive patterns must be blocked. Minimum patterns:
355 +
356 + rm -rf /
357 + rm -rf *
358 + mkfs
359 + dd if=
360 + :(){ :|:& };:
361 + shutdown
362 + reboot
363 + chmod -R 777 /
364 + chown -R
365 + curl ... | sh
366 + wget ... | sh
367 +
368 + Approval flow in CLI mode:
369 +
370 + When a medium-risk action requires approval, print something like:
371 +
372 + Tool request:
373 + tool: file_write
374 + risk: medium
375 + reason: writes to workspace
376 + args: ...
377 + Approve? [y/N]
378 +
379 + Default is deny.
380 +
381 + Tool receipts:
382 +
383 + Every attempted tool invocation must produce a receipt whether it is allowed, denied, failed, or approved.
384 +
385 + Receipt fields:
386 +
387 + {
388 + "id": "receipt-...",
389 + "timestamp": "2026-05-12T14:00:00Z",
390 + "conversation_id": "...",
391 + "tool": "file_read",
392 + "args_hash": "...",
393 + "result_hash": "...",
394 + "status": "allowed|denied|failed",
395 + "risk": "low|medium|high",
396 + "previous_hash": "...",
397 + "receipt_hash": "..."
398 + }
399 +
400 + Receipt hash:
401 +
402 + receipt_hash = SHA256(canonical_json(receipt_without_receipt_hash))
403 +
404 + Tamper-evident chain:
405 +
406 + - each receipt includes the previous receipt’s hash
407 + - receipt verify must replay the log
408 + - it must report the first broken link
409 +
410 + Optional stronger version:
411 +
412 + HMAC-SHA256 with a locally stored secret key
413 +
414 + Memory:
415 +
416 + Use SQLite if practical in LANGUAGE. Use JSONL only if SQLite support is impractical or broken.
417 +
418 + Persist:
419 +
420 + - conversation_id
421 + - turn_id
422 + - timestamp
423 + - role
424 + - content
425 + - tool_calls
426 + - tool_results
427 + - provider
428 + - model
429 + - metadata
430 +
431 + Required commands:
432 +
433 + app memory search QUERY
434 + app memory show CONVERSATION_ID
435 + app memory list
436 + app memory clear --yes
437 +
438 + Search may be simple substring search.
439 +
440 + Optional scoring:
441 +
442 + - tokenize query and content
443 + - rank by term frequency
444 + - boost recent conversations
445 +
446 + Agent loop:
447 +
448 + Implement this loop:
449 +
450 + 1. Receive user message from channel.
451 + 2. Create or resume conversation.
452 + 3. Load recent memory context.
453 + 4. Build system prompt.
454 + 5. Build tool schemas from active tools.
455 + 6. Call provider.
456 + 7. If provider returns text only, persist and reply.
457 + 8. If provider returns tool calls:
458 + a. For each tool call, classify risk.
459 + b. Validate policy.
460 + c. Ask approval when required.
461 + d. Invoke or deny.
462 + e. Write receipt.
463 + f. Persist tool call and result.
464 + 9. Send tool results back to provider.
465 + 10. Repeat until final text or max_tool_rounds is reached.
466 + 11. Persist final assistant response.
467 + 12. Reply to channel.
468 +
469 + Guardrails:
470 +
471 + max_tool_rounds default:
472 + 5
473 +
474 + max_response_bytes default:
475 + 1 MB
476 +
477 + tool execution timeout default:
478 + 30 seconds
479 +
480 + shell timeout default:
481 + 15 seconds
482 +
483 + HTTP timeout default:
484 + 20 seconds
485 +
486 + The runtime must not recursively invoke tools forever.
487 +
488 + Required CLI command surface:
489 +
490 + app init
491 + app onboard
492 + app config validate
493 + app config show
494 + app provider list
495 + app provider test NAME
496 + app tool list
497 + app tool run NAME --json ARGS
498 + app agent
499 + app agent -m MESSAGE
500 + app memory list
501 + app memory search QUERY
502 + app memory show CONVERSATION_ID
503 + app receipt list
504 + app receipt verify
505 + app estop
506 +
507 + Optional commands:
508 +
509 + app service install
510 + app service start
511 + app service stop
512 + app service status
513 + app sop list
514 + app sop validate
515 + app sop run NAME
516 + app plugin list
517 + app plugin install PATH
518 +
519 + SOP engine, optional but valuable:
520 +
521 + Implement deterministic workflows loaded from:
522 +
523 + ~/.appname/workspace/sops/<name>/SOP.toml
524 +
525 + Minimum SOP format:
526 +
527 + name = "daily-check"
528 + description = "Run a daily workspace check"
529 +
530 + [[steps]]
531 + id = "list"
532 + kind = "tool"
533 + tool = "file_list"
534 + args = { path = "." }
535 +
536 + [[steps]]
537 + id = "summarize"
538 + kind = "agent"
539 + prompt = "Summarize the file list from the previous step."
540 +
541 + [[steps]]
542 + id = "approval"
543 + kind = "approval"
544 + prompt = "Continue to write report?"
545 +
546 + [[steps]]
547 + id = "write"
548 + kind = "tool"
549 + tool = "file_write"
550 + args = { path = "daily-check.txt", content_from = "summarize" }
551 +
552 + Requirements:
553 +
554 + - validate step IDs are unique
555 + - validate referenced tools exist
556 + - persist SOP run state
557 + - stop at approval steps until approved
558 + - support on_failure = "abort"
559 + - support on_failure = "continue"
560 +
561 + Plugin system, stretch goal:
562 +
563 + A plugin is a directory:
564 +
565 + plugin-name/
566 + manifest.toml
567 + executable-or-script
568 +
569 + Minimum manifest:
570 +
571 + name = "echo-plugin"
572 + version = "0.1.0"
573 + capabilities = ["tool"]
574 +
575 + [[tools]]
576 + name = "echo"
577 + description = "Echoes input"
578 + command = "./echo-plugin"
579 + schema = { type = "object" }
580 +
581 + The runtime discovers plugins under:
582 +
583 + ~/.appname/plugins/
584 +
585 + Simpler acceptable version:
586 +
587 + Support external process tools where the runtime invokes a configured executable with JSON on stdin and reads JSON from stdout.
588 +
589 + Observability:
590 +
591 + Minimum logging:
592 +
593 + - human-readable logs to stderr
594 + - structured JSON logs when APPNAME_LOG=json, adjusted to the executable name
595 + - never log secrets
596 +
597 + Log events:
598 +
599 + - startup
600 + - config path
601 + - workspace path
602 + - provider selected
603 + - channel started
604 + - conversation started
605 + - tool requested
606 + - tool approved
607 + - tool denied
608 + - tool completed
609 + - tool failed
610 + - receipt written
611 + - memory persisted
612 + - estop triggered
613 +
614 + Optional metrics endpoint:
615 +
616 + GET /metrics
617 +
618 + Expose counters if the endpoint is implemented:
619 +
620 + app_conversations_total
621 + app_tool_calls_total
622 + app_tool_denials_total
623 + app_provider_errors_total
624 + app_receipt_chain_valid
625 +
626 + Emergency stop:
627 +
628 + app estop
629 +
630 + Creates:
631 +
632 + ~/.appname/ESTOP
633 +
634 + When this file exists:
635 +
636 + - no new tool calls may run
637 + - existing long-running shell/http tasks should be cancelled if possible
638 + - the agent may still answer text-only messages explaining that tool use is stopped
639 +
640 + app estop --clear
641 +
642 + Removes the file.
643 +
644 + Acceptance tests:
645 +
646 + Test 1: init creates expected files.
647 +
648 + Given no ~/.appname directory
649 + When app init runs
650 + Then ~/.appname/config file exists
651 + And memory database or memory JSONL exists
652 + And workspace_dir exists
653 +
654 + Test 2: config validation catches invalid autonomy.
655 +
656 + Given autonomy = "godmode"
657 + When app config validate runs
658 + Then exit code is nonzero
659 + And output mentions allowed values
660 +
661 + Test 3: mock provider text-only response.
662 +
663 + Given mock provider fixture returns "hello"
664 + When app agent -m "hi" runs
665 + Then stdout contains "hello"
666 + And memory contains the user and assistant turn
667 +
668 + Test 4: model-triggered file_list tool.
669 +
670 + Given mock provider fixture emits tool_call file_list { path = "." }
671 + When app agent -m "list files" runs
672 + Then file_list executes inside workspace
673 + And a tool receipt is written
674 + And final answer includes the file list summary
675 +
676 + Test 5: workspace escape blocked.
677 +
678 + Given workspace_only = true
679 + When model requests file_read { path = "/etc/passwd" }
680 + Then tool is denied
681 + And a denied receipt is written
682 + And the provider receives a tool error
683 +
684 + Test 6: supervised approval.
685 +
686 + Given autonomy = "supervised"
687 + When model requests file_write
688 + Then CLI asks for approval
689 + And default empty answer denies
690 + And "y" approves
691 +
692 + Test 7: forbidden command blocked.
693 +
694 + When model requests shell { command = "rm -rf /" }
695 + Then tool is blocked before execution
696 + And receipt status is denied
697 +
698 + Test 8: receipt chain detects tampering.
699 +
700 + Given three receipts exist
701 + When the second receipt is edited manually
702 + Then app receipt verify reports invalid chain at receipt 2
703 +
704 + Test 9: provider fallback, if reliable provider is implemented.
705 +
706 + Given reliable provider = [bad_provider, mock_provider]
707 + And bad_provider times out
708 + When agent runs
709 + Then runtime logs fallback
710 + And response comes from mock_provider
711 +
712 + Test 10: memory search.
713 +
714 + Given a previous conversation contains "Aardvark adapter"
715 + When app memory search "aardvark" runs
716 + Then the previous conversation ID is returned
717 +
718 + Implementation priorities:
719 +
720 + First produce a working vertical slice:
721 +
722 + 1. application naming
723 + 2. init
724 + 3. config loading and validation
725 + 4. mock provider
726 + 5. CLI one-shot agent mode
727 + 6. tools: time, file_list, file_read
728 + 7. security policy for workspace paths
729 + 8. memory persistence
730 + 9. receipt writing and verification
731 + 10. tests
732 +
733 + Then add:
734 +
735 + 11. interactive REPL
736 + 12. file_write with approval
737 + 13. shell with blocking rules
738 + 14. HTTP GET tool
739 + 15. OpenAI-compatible provider
740 + 16. optional gateway
741 + 17. optional SOP engine
742 + 18. optional external-process plugins
743 +
744 + Quality requirements:
745 +
746 + - Keep the implementation idiomatic for LANGUAGE.
747 + - Do not quietly substitute another implementation language.
748 + - Do not use Python, JavaScript, Rust, or C as the primary implementation language.
749 + - Shell scripts are acceptable only for setup convenience.
750 + - Prefer simple, boring dependencies.
751 + - Write tests for denied actions, not just successful actions.
752 + - Keep secrets out of logs.
753 + - Keep workspace path handling strict and well-tested.
754 + - Use deterministic mock fixtures so tests do not require network access.
755 + - Update README.md with architecture, config, security policy, commands, and test instructions.
756 +
757 + Do not stop after creating stubs. Implement the core behavior. If a feature is not practical in LANGUAGE, document the limitation and implement the closest useful equivalent.
Newer Older