Revision of ClawClone.prompt

1

+

You are now implementing the real application in this workspace.

2

+

3

+

The target implementation language is LANGUAGE.

4

+

5

+

This workspace should already contain a verified LANGUAGE project skeleton with basic support for CLI, HTTP, SQLite, config parsing, tests, and build/test commands. Begin by inspecting the existing workspace and README before changing anything.

6

+

7

+

Your first task is to name the application.

8

+

9

+

The placeholder name is:

10

+

11

+

*Claw

12

+

13

+

Replace the wildcard with a short, distinctive prefix suitable for this LANGUAGE implementation. Examples of the naming style:

14

+

15

+

LispClaw

16

+

BeamClaw

17

+

LogicClaw

18

+

CrystalClaw

19

+

CobolClaw

20

+

21

+

Do not use an existing project’s name or branding. Once you choose the name, use it consistently for:

22

+

23

+

- executable name

24

+

- README title

25

+

- config directory

26

+

- default workspace directory

27

+

- log names

28

+

- test names where appropriate

29

+

30

+

If the selected name is not suitable for an executable on macOS, create a lowercase/kebab-case executable form and document the relationship. For example:

31

+

32

+

Application name: LogicClaw

33

+

Executable: logicclaw

34

+

35

+

The goal is to build a small, local-first agent runtime. It should run as a single command-line application that can load configuration, talk to a model provider, expose a CLI agent loop, execute a small set of tools through a security gate, persist memory, and write tamper-evident tool receipts.

36

+

37

+

Do not mention or depend on any external agent runtime project. Treat this as a clean-room implementation from this spec.

38

+

39

+

Core workflow:

40

+

41

+

app init

42

+

app config validate

43

+

app config show

44

+

app provider list

45

+

app provider test NAME

46

+

app tool list

47

+

app tool run NAME --json ARGS

48

+

app agent

49

+

app agent -m "What files are in this project?"

50

+

app memory search "previous topic"

51

+

app receipt verify

52

+

app estop

53

+

54

+

Use the final executable name you selected instead of `app`.

55

+

56

+

The agent must:

57

+

58

+

- accept input from the CLI channel

59

+

- send the conversation to a configured model provider

60

+

- advertise available tools to the model

61

+

- parse tool calls from the model response

62

+

- validate each tool call through a security policy

63

+

- execute approved tools

64

+

- feed tool results back into the model

65

+

- persist the final exchange, tool calls, tool results, and receipts

66

+

- return a final answer to the user

67

+

68

+

Required architecture:

69

+

70

+

The implementation must have visible separation of responsibility for these areas:

71

+

72

+

runtime

73

+

agent loop, request lifecycle, orchestration

74

+

75

+

config

76

+

config loading, validation, defaults, path expansion

77

+

78

+

providers

79

+

model provider abstraction and concrete providers

80

+

81

+

channels

82

+

CLI channel and optional HTTP/gateway channel

83

+

84

+

tools

85

+

time, file_list, file_read, file_write, shell, http, memory_search

86

+

87

+

security

88

+

autonomy levels, command/path policy, tool-risk classification

89

+

90

+

memory

91

+

SQLite persistence, or JSONL only if SQLite is impractical in LANGUAGE

92

+

93

+

receipts

94

+

tamper-evident tool-call receipts

95

+

96

+

sop

97

+

optional deterministic workflow runner

98

+

99

+

service

100

+

optional install/start/stop/status wrappers

101

+

102

+

Do not force object-oriented structure if LANGUAGE is not object-oriented. Use idiomatic LANGUAGE design, but preserve the conceptual boundaries.

103

+

104

+

Configuration:

105

+

106

+

The application must use a user-editable config file.

107

+

108

+

Default location should be based on the final app name, for example:

109

+

110

+

~/.logicclaw/config.toml

111

+

112

+

TOML is preferred. JSON, INI, S-expression, or another idiomatic config format is acceptable if TOML support is weak in LANGUAGE. If not using TOML, document why.

113

+

114

+

Minimum config shape:

115

+

116

+

workspace_dir = "~/logicclaw-workspace"

117

+

default_provider = "local"

118

+

default_model = "mock"

119

+

120

+

[security]

121

+

autonomy = "supervised"

122

+

workspace_only = true

123

+

forbidden_paths = ["/etc", "/sys", "/boot", "~/.ssh"]

124

+

forbidden_commands = ["rm", "shutdown", "reboot", "mkfs", "dd"]

125

+

audit_log = true

126

+

127

+

[providers.models.local]

128

+

kind = "mock"

129

+

model = "mock"

130

+

131

+

[providers.models.openai_compatible]

132

+

kind = "openai-compatible"

133

+

base_url = "http://localhost:1234/v1"

134

+

model = "local-model"

135

+

api_key_env = "OPENAI_API_KEY"

136

+

137

+

[channels.cli]

138

+

enabled = true

139

+

tools_allow = ["file_read", "file_list", "time", "memory_search", "shell"]

140

+

141

+

[memory]

142

+

backend = "sqlite"

143

+

path = "~/.logicclaw/memory.sqlite"

144

+

145

+

[receipts]

146

+

enabled = true

147

+

path = "~/.logicclaw/tool_receipts.log"

148

+

149

+

Adjust paths to match the application name you chose.

150

+

151

+

Config requirements:

152

+

153

+

- load defaults when keys are absent

154

+

- expand ~ and environment variables

155

+

- validate enum values

156

+

- validate that workspace exists or create it during init

157

+

- do not require API keys for mock mode

158

+

- support provider credentials by environment variable

159

+

- never print secret values in logs or config dumps

160

+

- config validate must report all detected errors in one pass when practical

161

+

162

+

Provider abstraction:

163

+

164

+

Create an idiomatic equivalent of:

165

+

166

+

Provider

167

+

name() -> string

168

+

capabilities() -> ProviderCapabilities

169

+

chat(request: ChatRequest) -> ChatResponse

170

+

171

+

ChatRequest must contain:

172

+

173

+

- system_prompt

174

+

- messages

175

+

- tools

176

+

- model

177

+

- optional temperature

178

+

- optional metadata

179

+

180

+

ChatResponse must contain:

181

+

182

+

- final_text

183

+

- tool_calls

184

+

- optional raw_provider_payload

185

+

- optional usage

186

+

187

+

Required providers:

188

+

189

+

mock

190

+

Deterministic provider used for tests. It must be able to return ordinary text and tool calls from scripted fixtures.

191

+

192

+

openai-compatible

193

+

Sends requests to an OpenAI-compatible /chat/completions endpoint. Full support for every provider is not required. Implement non-streaming chat completion. Tool/function call support is required if reasonably practical in LANGUAGE; otherwise document the limitation clearly.

194

+

195

+

Optional providers:

196

+

197

+

reliable

198

+

Wrapper provider that tries provider names in order and falls back on network/auth/timeout errors.

199

+

200

+

router

201

+

Wrapper provider that chooses a provider from request metadata hints.

202

+

203

+

Channel abstraction:

204

+

205

+

Create an idiomatic equivalent of:

206

+

207

+

Channel

208

+

name() -> string

209

+

start(runtime_handle)

210

+

send(conversation_id, message)

211

+

supports_draft_updates() -> bool

212

+

213

+

Required channel:

214

+

215

+

cli

216

+

217

+

CLI behavior:

218

+

219

+

app agent

220

+

starts a REPL

221

+

222

+

app agent -m "message"

223

+

runs one turn and exits

224

+

225

+

REPL commands:

226

+

227

+

/exit

228

+

exits

229

+

230

+

/tools

231

+

lists active tools

232

+

233

+

/memory <query>

234

+

searches memory

235

+

236

+

/policy

237

+

prints current autonomy and workspace boundary

238

+

239

+

Optional gateway channel:

240

+

241

+

localhost HTTP server

242

+

243

+

Minimum optional gateway endpoints:

244

+

245

+

GET /health

246

+

GET /status

247

+

GET /tools

248

+

POST /chat

249

+

GET /memory/search?q=...

250

+

GET /receipts

251

+

POST /estop

252

+

253

+

Tool abstraction:

254

+

255

+

Create an idiomatic equivalent of:

256

+

257

+

Tool

258

+

name() -> string

259

+

description() -> string

260

+

parameters_schema() -> JSON Schema object or equivalent metadata

261

+

risk(args, context) -> low | medium | high

262

+

invoke(args, context) -> ToolResult

263

+

264

+

ToolResult must contain:

265

+

266

+

- success: bool

267

+

- output: string

268

+

- optional error: string

269

+

- optional metadata

270

+

- optional receipt_id

271

+

272

+

Required built-in tools:

273

+

274

+

time

275

+

Returns current local time, UTC time, and timezone if available.

276

+

277

+

file_list

278

+

Lists files under a path inside workspace.

279

+

280

+

file_read

281

+

Reads a UTF-8 text file inside workspace.

282

+

283

+

file_write

284

+

Writes a UTF-8 text file inside workspace.

285

+

286

+

shell

287

+

Executes a shell command inside workspace, subject to security policy.

288

+

289

+

http

290

+

Performs HTTP GET. POST is optional.

291

+

292

+

memory_search

293

+

Searches persisted conversations.

294

+

295

+

Optional tools:

296

+

297

+

web_search

298

+

May be stubbed unless a search API key is configured.

299

+

300

+

pdf_extract

301

+

Optional.

302

+

303

+

ask_user

304

+

In CLI mode, asks the user a question and returns the answer.

305

+

306

+

Security model:

307

+

308

+

Implement three autonomy levels:

309

+

310

+

readonly

311

+

Low-risk read-only tools allowed.

312

+

No file_write.

313

+

No shell execution except optionally harmless commands such as pwd.

314

+

315

+

supervised

316

+

Low-risk tools run automatically.

317

+

Medium-risk tools require operator approval.

318

+

High-risk tools are blocked.

319

+

320

+

full

321

+

Low and medium run automatically.

322

+

High-risk is still blocked if explicitly forbidden by path or command policy.

323

+

324

+

Default must be:

325

+

326

+

supervised

327

+

328

+

Risk rules:

329

+

330

+

time, memory_search, file_list, file_read inside workspace:

331

+

low

332

+

333

+

http GET to allowed domains:

334

+

low

335

+

336

+

file_write inside workspace:

337

+

medium

338

+

339

+

shell command from allowlist:

340

+

medium

341

+

342

+

shell command not on allowlist:

343

+

high

344

+

345

+

any path outside workspace when workspace_only = true:

346

+

blocked

347

+

348

+

any path under forbidden_paths:

349

+

blocked

350

+

351

+

any command whose basename appears in forbidden_commands:

352

+

blocked

353

+

354

+

Any shell command containing obvious destructive patterns must be blocked. Minimum patterns:

355

+

356

+

rm -rf /

357

+

rm -rf *

358

+

mkfs

359

+

dd if=

360

+

:(){ :|:& };:

361

+

shutdown

362

+

reboot

363

+

chmod -R 777 /

364

+

chown -R

365

+

curl ... | sh

366

+

wget ... | sh

367

+

368

+

Approval flow in CLI mode:

369

+

370

+

When a medium-risk action requires approval, print something like:

371

+

372

+

Tool request:

373

+

tool: file_write

374

+

risk: medium

375

+

reason: writes to workspace

376

+

args: ...

377

+

Approve? [y/N]

378

+

379

+

Default is deny.

380

+

381

+

Tool receipts:

382

+

383

+

Every attempted tool invocation must produce a receipt whether it is allowed, denied, failed, or approved.

384

+

385

+

Receipt fields:

386

+

387

+

{

388

+

"id": "receipt-...",

389

+

"timestamp": "2026-05-12T14:00:00Z",

390

+

"conversation_id": "...",

391

+

"tool": "file_read",

392

+

"args_hash": "...",

393

+

"result_hash": "...",

394

+

"status": "allowed|denied|failed",

395

+

"risk": "low|medium|high",

396

+

"previous_hash": "...",

397

+

"receipt_hash": "..."

398

+

}

399

+

400

+

Receipt hash:

401

+

402

+

receipt_hash = SHA256(canonical_json(receipt_without_receipt_hash))

403

+

404

+

Tamper-evident chain:

405

+

406

+

- each receipt includes the previous receipt’s hash

407

+

- receipt verify must replay the log

408

+

- it must report the first broken link

409

+

410

+

Optional stronger version:

411

+

412

+

HMAC-SHA256 with a locally stored secret key

413

+

414

+

Memory:

415

+

416

+

Use SQLite if practical in LANGUAGE. Use JSONL only if SQLite support is impractical or broken.

417

+

418

+

Persist:

419

+

420

+

- conversation_id

421

+

- turn_id

422

+

- timestamp

423

+

- role

424

+

- content

425

+

- tool_calls

426

+

- tool_results

427

+

- provider

428

+

- model

429

+

- metadata

430

+

431

+

Required commands:

432

+

433

+

app memory search QUERY

434

+

app memory show CONVERSATION_ID

435

+

app memory list

436

+

app memory clear --yes

437

+

438

+

Search may be simple substring search.

439

+

440

+

Optional scoring:

441

+

442

+

- tokenize query and content

443

+

- rank by term frequency

444

+

- boost recent conversations

445

+

446

+

Agent loop:

447

+

448

+

Implement this loop:

449

+

450

+

1. Receive user message from channel.

451

+

2. Create or resume conversation.

452

+

3. Load recent memory context.

453

+

4. Build system prompt.

454

+

5. Build tool schemas from active tools.

455

+

6. Call provider.

456

+

7. If provider returns text only, persist and reply.

457

+

8. If provider returns tool calls:

458

+

a. For each tool call, classify risk.

459

+

b. Validate policy.

460

+

c. Ask approval when required.

461

+

d. Invoke or deny.

462

+

e. Write receipt.

463

+

f. Persist tool call and result.

464

+

9. Send tool results back to provider.

465

+

10. Repeat until final text or max_tool_rounds is reached.

466

+

11. Persist final assistant response.

467

+

12. Reply to channel.

468

+

469

+

Guardrails:

470

+

471

+

max_tool_rounds default:

472

+

5

473

+

474

+

max_response_bytes default:

475

+

1 MB

476

+

477

+

tool execution timeout default:

478

+

30 seconds

479

+

480

+

shell timeout default:

481

+

15 seconds

482

+

483

+

HTTP timeout default:

484

+

20 seconds

485

+

486

+

The runtime must not recursively invoke tools forever.

487

+

488

+

Required CLI command surface:

489

+

490

+

app init

491

+

app onboard

492

+

app config validate

493

+

app config show

494

+

app provider list

495

+

app provider test NAME

496

+

app tool list

497

+

app tool run NAME --json ARGS

498

+

app agent

499

+

app agent -m MESSAGE

500

+

app memory list

501

+

app memory search QUERY

502

+

app memory show CONVERSATION_ID

503

+

app receipt list

504

+

app receipt verify

505

+

app estop

506

+

507

+

Optional commands:

508

+

509

+

app service install

510

+

app service start

511

+

app service stop

512

+

app service status

513

+

app sop list

514

+

app sop validate

515

+

app sop run NAME

516

+

app plugin list

517

+

app plugin install PATH

518

+

519

+

SOP engine, optional but valuable:

520

+

521

+

Implement deterministic workflows loaded from:

522

+

523

+

~/.appname/workspace/sops/<name>/SOP.toml

524

+

525

+

Minimum SOP format:

526

+

527

+

name = "daily-check"

528

+

description = "Run a daily workspace check"

529

+

530

+

[[steps]]

531

+

id = "list"

532

+

kind = "tool"

533

+

tool = "file_list"

534

+

args = { path = "." }

535

+

536

+

[[steps]]

537

+

id = "summarize"

538

+

kind = "agent"

539

+

prompt = "Summarize the file list from the previous step."

540

+

541

+

[[steps]]

542

+

id = "approval"

543

+

kind = "approval"

544

+

prompt = "Continue to write report?"

545

+

546

+

[[steps]]

547

+

id = "write"

548

+

kind = "tool"

549

+

tool = "file_write"

550

+

args = { path = "daily-check.txt", content_from = "summarize" }

551

+

552

+

Requirements:

553

+

554

+

- validate step IDs are unique

555

+

- validate referenced tools exist

556

+

- persist SOP run state

557

+

- stop at approval steps until approved

558

+

- support on_failure = "abort"

559

+

- support on_failure = "continue"

560

+

561

+

Plugin system, stretch goal:

562

+

563

+

A plugin is a directory:

564

+

565

+

plugin-name/

566

+

manifest.toml

567

+

executable-or-script

568

+

569

+

Minimum manifest:

570

+

571

+

name = "echo-plugin"

572

+

version = "0.1.0"

573

+

capabilities = ["tool"]

574

+

575

+

[[tools]]

576

+

name = "echo"

577

+

description = "Echoes input"

578

+

command = "./echo-plugin"

579

+

schema = { type = "object" }

580

+

581

+

The runtime discovers plugins under:

582

+

583

+

~/.appname/plugins/

584

+

585

+

Simpler acceptable version:

586

+

587

+

Support external process tools where the runtime invokes a configured executable with JSON on stdin and reads JSON from stdout.

588

+

589

+

Observability:

590

+

591

+

Minimum logging:

592

+

593

+

- human-readable logs to stderr

594

+

- structured JSON logs when APPNAME_LOG=json, adjusted to the executable name

595

+

- never log secrets

596

+

597

+

Log events:

598

+

599

+

- startup

600

+

- config path

601

+

- workspace path

602

+

- provider selected

603

+

- channel started

604

+

- conversation started

605

+

- tool requested

606

+

- tool approved

607

+

- tool denied

608

+

- tool completed

609

+

- tool failed

610

+

- receipt written

611

+

- memory persisted

612

+

- estop triggered

613

+

614

+

Optional metrics endpoint:

615

+

616

+

GET /metrics

617

+

618

+

Expose counters if the endpoint is implemented:

619

+

620

+

app_conversations_total

621

+

app_tool_calls_total

622

+

app_tool_denials_total

623

+

app_provider_errors_total

624

+

app_receipt_chain_valid

625

+

626

+

Emergency stop:

627

+

628

+

app estop

629

+

630

+

Creates:

631

+

632

+

~/.appname/ESTOP

633

+

634

+

When this file exists:

635

+

636

+

- no new tool calls may run

637

+

- existing long-running shell/http tasks should be cancelled if possible

638

+

- the agent may still answer text-only messages explaining that tool use is stopped

639

+

640

+

app estop --clear

641

+

642

+

Removes the file.

643

+

644

+

Acceptance tests:

645

+

646

+

Test 1: init creates expected files.

647

+

648

+

Given no ~/.appname directory

649

+

When app init runs

650

+

Then ~/.appname/config file exists

651

+

And memory database or memory JSONL exists

652

+

And workspace_dir exists

653

+

654

+

Test 2: config validation catches invalid autonomy.

655

+

656

+

Given autonomy = "godmode"

657

+

When app config validate runs

658

+

Then exit code is nonzero

659

+

And output mentions allowed values

660

+

661

+

Test 3: mock provider text-only response.

662

+

663

+

Given mock provider fixture returns "hello"

664

+

When app agent -m "hi" runs

665

+

Then stdout contains "hello"

666

+

And memory contains the user and assistant turn

667

+

668

+

Test 4: model-triggered file_list tool.

669

+

670

+

Given mock provider fixture emits tool_call file_list { path = "." }

671

+

When app agent -m "list files" runs

672

+

Then file_list executes inside workspace

673

+

And a tool receipt is written

674

+

And final answer includes the file list summary

675

+

676

+

Test 5: workspace escape blocked.

677

+

678

+

Given workspace_only = true

679

+

When model requests file_read { path = "/etc/passwd" }

680

+

Then tool is denied

681

+

And a denied receipt is written

682

+

And the provider receives a tool error

683

+

684

+

Test 6: supervised approval.

685

+

686

+

Given autonomy = "supervised"

687

+

When model requests file_write

688

+

Then CLI asks for approval

689

+

And default empty answer denies

690

+

And "y" approves

691

+

692

+

Test 7: forbidden command blocked.

693

+

694

+

When model requests shell { command = "rm -rf /" }

695

+

Then tool is blocked before execution

696

+

And receipt status is denied

697

+

698

+

Test 8: receipt chain detects tampering.

699

+

700

+

Given three receipts exist

701

+

When the second receipt is edited manually

702

+

Then app receipt verify reports invalid chain at receipt 2

703

+

704

+

Test 9: provider fallback, if reliable provider is implemented.

705

+

706

+

Given reliable provider = [bad_provider, mock_provider]

707

+

And bad_provider times out

708

+

When agent runs

709

+

Then runtime logs fallback

710

+

And response comes from mock_provider

711

+

712

+

Test 10: memory search.

713

+

714

+

Given a previous conversation contains "Aardvark adapter"

715

+

When app memory search "aardvark" runs

716

+

Then the previous conversation ID is returned

717

+

718

+

Implementation priorities:

719

+

720

+

First produce a working vertical slice:

721

+

722

+

1. application naming

723

+

2. init

724

+

3. config loading and validation

725

+

4. mock provider

726

+

5. CLI one-shot agent mode

727

+

6. tools: time, file_list, file_read

728

+

7. security policy for workspace paths

729

+

8. memory persistence

730

+

9. receipt writing and verification

731

+

10. tests

732

+

733

+

Then add:

734

+

735

+

11. interactive REPL

736

+

12. file_write with approval

737

+

13. shell with blocking rules

738

+

14. HTTP GET tool

739

+

15. OpenAI-compatible provider

740

+

16. optional gateway

741

+

17. optional SOP engine

742

+

18. optional external-process plugins

743

+

744

+

Quality requirements:

745

+

746

+

- Keep the implementation idiomatic for LANGUAGE.

747

+

- Do not quietly substitute another implementation language.

748

+

- Do not use Python, JavaScript, Rust, or C as the primary implementation language.

749

+

- Shell scripts are acceptable only for setup convenience.

750

+

- Prefer simple, boring dependencies.

751

+

- Write tests for denied actions, not just successful actions.

752

+

- Keep secrets out of logs.

753

+

- Keep workspace path handling strict and well-tested.

754

+

- Use deterministic mock fixtures so tests do not require network access.

755

+

- Update README.md with architecture, config, security policy, commands, and test instructions.

756

+

757

+

Do not stop after creating stubs. Implement the core behavior. If a feature is not practical in LANGUAGE, document the limitation and implement the closest useful equivalent.

cadwaladyr / ClawClone.prompt

cadwaladyr revised this gist 3 weeks ago. Go to revision

cadwaladyr revised this gist 3 weeks ago. Go to revision

			@@ -10,13 +10,7 @@ The placeholder name is:
10	10
11	11		*Claw
12	12
13		-	Replace the wildcard with a short, distinctive prefix suitable for this LANGUAGE implementation. Examples of the naming style:
14		-
15		-	LispClaw
16		-	BeamClaw
17		-	LogicClaw
18		-	CrystalClaw
19		-	CobolClaw
	13	+	Replace the wildcard with a short, distinctive prefix suitable for this implementation.
20	14
21	15		Do not use an existing project’s name or branding. Once you choose the name, use it consistently for:
22	16

		@@ -0,0 +1,757 @@
1	+	You are now implementing the real application in this workspace.
2	+
3	+	The target implementation language is LANGUAGE.
4	+
5	+	This workspace should already contain a verified LANGUAGE project skeleton with basic support for CLI, HTTP, SQLite, config parsing, tests, and build/test commands. Begin by inspecting the existing workspace and README before changing anything.
6	+
7	+	Your first task is to name the application.
8	+
9	+	The placeholder name is:
10	+
11	+	*Claw
12	+
13	+	Replace the wildcard with a short, distinctive prefix suitable for this LANGUAGE implementation. Examples of the naming style:
14	+
15	+	LispClaw
16	+	BeamClaw
17	+	LogicClaw
18	+	CrystalClaw
19	+	CobolClaw
20	+
21	+	Do not use an existing project’s name or branding. Once you choose the name, use it consistently for:
22	+
23	+	- executable name
24	+	- README title
25	+	- config directory
26	+	- default workspace directory
27	+	- log names
28	+	- test names where appropriate
29	+
30	+	If the selected name is not suitable for an executable on macOS, create a lowercase/kebab-case executable form and document the relationship. For example:
31	+
32	+	Application name: LogicClaw
33	+	Executable: logicclaw
34	+
35	+	The goal is to build a small, local-first agent runtime. It should run as a single command-line application that can load configuration, talk to a model provider, expose a CLI agent loop, execute a small set of tools through a security gate, persist memory, and write tamper-evident tool receipts.
36	+
37	+	Do not mention or depend on any external agent runtime project. Treat this as a clean-room implementation from this spec.
38	+
39	+	Core workflow:
40	+
41	+	app init
42	+	app config validate
43	+	app config show
44	+	app provider list
45	+	app provider test NAME
46	+	app tool list
47	+	app tool run NAME --json ARGS
48	+	app agent
49	+	app agent -m "What files are in this project?"
50	+	app memory search "previous topic"
51	+	app receipt verify
52	+	app estop
53	+
54	+	Use the final executable name you selected instead of `app`.
55	+
56	+	The agent must:
57	+
58	+	- accept input from the CLI channel
59	+	- send the conversation to a configured model provider
60	+	- advertise available tools to the model
61	+	- parse tool calls from the model response
62	+	- validate each tool call through a security policy
63	+	- execute approved tools
64	+	- feed tool results back into the model
65	+	- persist the final exchange, tool calls, tool results, and receipts
66	+	- return a final answer to the user
67	+
68	+	Required architecture:
69	+
70	+	The implementation must have visible separation of responsibility for these areas:
71	+
72	+	runtime
73	+	agent loop, request lifecycle, orchestration
74	+
75	+	config
76	+	config loading, validation, defaults, path expansion
77	+
78	+	providers
79	+	model provider abstraction and concrete providers
80	+
81	+	channels
82	+	CLI channel and optional HTTP/gateway channel
83	+
84	+	tools
85	+	time, file_list, file_read, file_write, shell, http, memory_search
86	+
87	+	security
88	+	autonomy levels, command/path policy, tool-risk classification
89	+
90	+	memory
91	+	SQLite persistence, or JSONL only if SQLite is impractical in LANGUAGE
92	+
93	+	receipts
94	+	tamper-evident tool-call receipts
95	+
96	+	sop
97	+	optional deterministic workflow runner
98	+
99	+	service
100	+	optional install/start/stop/status wrappers
101	+
102	+	Do not force object-oriented structure if LANGUAGE is not object-oriented. Use idiomatic LANGUAGE design, but preserve the conceptual boundaries.
103	+
104	+	Configuration:
105	+
106	+	The application must use a user-editable config file.
107	+
108	+	Default location should be based on the final app name, for example:
109	+
110	+	~/.logicclaw/config.toml
111	+
112	+	TOML is preferred. JSON, INI, S-expression, or another idiomatic config format is acceptable if TOML support is weak in LANGUAGE. If not using TOML, document why.
113	+
114	+	Minimum config shape:
115	+
116	+	workspace_dir = "~/logicclaw-workspace"
117	+	default_provider = "local"
118	+	default_model = "mock"
119	+
120	+	[security]
121	+	autonomy = "supervised"
122	+	workspace_only = true
123	+	forbidden_paths = ["/etc", "/sys", "/boot", "~/.ssh"]
124	+	forbidden_commands = ["rm", "shutdown", "reboot", "mkfs", "dd"]
125	+	audit_log = true
126	+
127	+	[providers.models.local]
128	+	kind = "mock"
129	+	model = "mock"
130	+
131	+	[providers.models.openai_compatible]
132	+	kind = "openai-compatible"
133	+	base_url = "http://localhost:1234/v1"
134	+	model = "local-model"
135	+	api_key_env = "OPENAI_API_KEY"
136	+
137	+	[channels.cli]
138	+	enabled = true
139	+	tools_allow = ["file_read", "file_list", "time", "memory_search", "shell"]
140	+
141	+	[memory]
142	+	backend = "sqlite"
143	+	path = "~/.logicclaw/memory.sqlite"
144	+
145	+	[receipts]
146	+	enabled = true
147	+	path = "~/.logicclaw/tool_receipts.log"
148	+
149	+	Adjust paths to match the application name you chose.
150	+
151	+	Config requirements:
152	+
153	+	- load defaults when keys are absent
154	+	- expand ~ and environment variables
155	+	- validate enum values
156	+	- validate that workspace exists or create it during init
157	+	- do not require API keys for mock mode
158	+	- support provider credentials by environment variable
159	+	- never print secret values in logs or config dumps
160	+	- config validate must report all detected errors in one pass when practical
161	+
162	+	Provider abstraction:
163	+
164	+	Create an idiomatic equivalent of:
165	+
166	+	Provider
167	+	name() -> string
168	+	capabilities() -> ProviderCapabilities
169	+	chat(request: ChatRequest) -> ChatResponse
170	+
171	+	ChatRequest must contain:
172	+
173	+	- system_prompt
174	+	- messages
175	+	- tools
176	+	- model
177	+	- optional temperature
178	+	- optional metadata
179	+
180	+	ChatResponse must contain:
181	+
182	+	- final_text
183	+	- tool_calls
184	+	- optional raw_provider_payload
185	+	- optional usage
186	+
187	+	Required providers:
188	+
189	+	mock
190	+	Deterministic provider used for tests. It must be able to return ordinary text and tool calls from scripted fixtures.
191	+
192	+	openai-compatible
193	+	Sends requests to an OpenAI-compatible /chat/completions endpoint. Full support for every provider is not required. Implement non-streaming chat completion. Tool/function call support is required if reasonably practical in LANGUAGE; otherwise document the limitation clearly.
194	+
195	+	Optional providers:
196	+
197	+	reliable
198	+	Wrapper provider that tries provider names in order and falls back on network/auth/timeout errors.
199	+
200	+	router
201	+	Wrapper provider that chooses a provider from request metadata hints.
202	+
203	+	Channel abstraction:
204	+
205	+	Create an idiomatic equivalent of:
206	+
207	+	Channel
208	+	name() -> string
209	+	start(runtime_handle)
210	+	send(conversation_id, message)
211	+	supports_draft_updates() -> bool
212	+
213	+	Required channel:
214	+
215	+	cli
216	+
217	+	CLI behavior:
218	+
219	+	app agent
220	+	starts a REPL
221	+
222	+	app agent -m "message"
223	+	runs one turn and exits
224	+
225	+	REPL commands:
226	+
227	+	/exit
228	+	exits
229	+
230	+	/tools
231	+	lists active tools
232	+
233	+	/memory <query>
234	+	searches memory
235	+
236	+	/policy
237	+	prints current autonomy and workspace boundary
238	+
239	+	Optional gateway channel:
240	+
241	+	localhost HTTP server
242	+
243	+	Minimum optional gateway endpoints:
244	+
245	+	GET /health
246	+	GET /status
247	+	GET /tools
248	+	POST /chat
249	+	GET /memory/search?q=...
250	+	GET /receipts
251	+	POST /estop
252	+
253	+	Tool abstraction:
254	+
255	+	Create an idiomatic equivalent of:
256	+
257	+	Tool
258	+	name() -> string
259	+	description() -> string
260	+	parameters_schema() -> JSON Schema object or equivalent metadata
261	+	risk(args, context) -> low \| medium \| high
262	+	invoke(args, context) -> ToolResult
263	+
264	+	ToolResult must contain:
265	+
266	+	- success: bool
267	+	- output: string
268	+	- optional error: string
269	+	- optional metadata
270	+	- optional receipt_id
271	+
272	+	Required built-in tools:
273	+
274	+	time
275	+	Returns current local time, UTC time, and timezone if available.
276	+
277	+	file_list
278	+	Lists files under a path inside workspace.
279	+
280	+	file_read
281	+	Reads a UTF-8 text file inside workspace.
282	+
283	+	file_write
284	+	Writes a UTF-8 text file inside workspace.
285	+
286	+	shell
287	+	Executes a shell command inside workspace, subject to security policy.
288	+
289	+	http
290	+	Performs HTTP GET. POST is optional.
291	+
292	+	memory_search
293	+	Searches persisted conversations.
294	+
295	+	Optional tools:
296	+
297	+	web_search
298	+	May be stubbed unless a search API key is configured.
299	+
300	+	pdf_extract
301	+	Optional.
302	+
303	+	ask_user
304	+	In CLI mode, asks the user a question and returns the answer.
305	+
306	+	Security model:
307	+
308	+	Implement three autonomy levels:
309	+
310	+	readonly
311	+	Low-risk read-only tools allowed.
312	+	No file_write.
313	+	No shell execution except optionally harmless commands such as pwd.
314	+
315	+	supervised
316	+	Low-risk tools run automatically.
317	+	Medium-risk tools require operator approval.
318	+	High-risk tools are blocked.
319	+
320	+	full
321	+	Low and medium run automatically.
322	+	High-risk is still blocked if explicitly forbidden by path or command policy.
323	+
324	+	Default must be:
325	+
326	+	supervised
327	+
328	+	Risk rules:
329	+
330	+	time, memory_search, file_list, file_read inside workspace:
331	+	low
332	+
333	+	http GET to allowed domains:
334	+	low
335	+
336	+	file_write inside workspace:
337	+	medium
338	+
339	+	shell command from allowlist:
340	+	medium
341	+
342	+	shell command not on allowlist:
343	+	high
344	+
345	+	any path outside workspace when workspace_only = true:
346	+	blocked
347	+
348	+	any path under forbidden_paths:
349	+	blocked
350	+
351	+	any command whose basename appears in forbidden_commands:
352	+	blocked
353	+
354	+	Any shell command containing obvious destructive patterns must be blocked. Minimum patterns:
355	+
356	+	rm -rf /
357	+	rm -rf *
358	+	mkfs
359	+	dd if=
360	+	:(){ :\|:& };:
361	+	shutdown
362	+	reboot
363	+	chmod -R 777 /
364	+	chown -R
365	+	curl ... \| sh
366	+	wget ... \| sh
367	+
368	+	Approval flow in CLI mode:
369	+
370	+	When a medium-risk action requires approval, print something like:
371	+
372	+	Tool request:
373	+	tool: file_write
374	+	risk: medium
375	+	reason: writes to workspace
376	+	args: ...
377	+	Approve? [y/N]
378	+
379	+	Default is deny.
380	+
381	+	Tool receipts:
382	+
383	+	Every attempted tool invocation must produce a receipt whether it is allowed, denied, failed, or approved.
384	+
385	+	Receipt fields:
386	+
387	+	{
388	+	"id": "receipt-...",
389	+	"timestamp": "2026-05-12T14:00:00Z",
390	+	"conversation_id": "...",
391	+	"tool": "file_read",
392	+	"args_hash": "...",
393	+	"result_hash": "...",
394	+	"status": "allowed\|denied\|failed",
395	+	"risk": "low\|medium\|high",
396	+	"previous_hash": "...",
397	+	"receipt_hash": "..."
398	+	}
399	+
400	+	Receipt hash:
401	+
402	+	receipt_hash = SHA256(canonical_json(receipt_without_receipt_hash))
403	+
404	+	Tamper-evident chain:
405	+
406	+	- each receipt includes the previous receipt’s hash
407	+	- receipt verify must replay the log
408	+	- it must report the first broken link
409	+
410	+	Optional stronger version:
411	+
412	+	HMAC-SHA256 with a locally stored secret key
413	+
414	+	Memory:
415	+
416	+	Use SQLite if practical in LANGUAGE. Use JSONL only if SQLite support is impractical or broken.
417	+
418	+	Persist:
419	+
420	+	- conversation_id
421	+	- turn_id
422	+	- timestamp
423	+	- role
424	+	- content
425	+	- tool_calls
426	+	- tool_results
427	+	- provider
428	+	- model
429	+	- metadata
430	+
431	+	Required commands:
432	+
433	+	app memory search QUERY
434	+	app memory show CONVERSATION_ID
435	+	app memory list
436	+	app memory clear --yes
437	+
438	+	Search may be simple substring search.
439	+
440	+	Optional scoring:
441	+
442	+	- tokenize query and content
443	+	- rank by term frequency
444	+	- boost recent conversations
445	+
446	+	Agent loop:
447	+
448	+	Implement this loop:
449	+
450	+	1. Receive user message from channel.
451	+	2. Create or resume conversation.
452	+	3. Load recent memory context.
453	+	4. Build system prompt.
454	+	5. Build tool schemas from active tools.
455	+	6. Call provider.
456	+	7. If provider returns text only, persist and reply.
457	+	8. If provider returns tool calls:
458	+	a. For each tool call, classify risk.
459	+	b. Validate policy.
460	+	c. Ask approval when required.
461	+	d. Invoke or deny.
462	+	e. Write receipt.
463	+	f. Persist tool call and result.
464	+	9. Send tool results back to provider.
465	+	10. Repeat until final text or max_tool_rounds is reached.
466	+	11. Persist final assistant response.
467	+	12. Reply to channel.
468	+
469	+	Guardrails:
470	+
471	+	max_tool_rounds default:
472	+	5
473	+
474	+	max_response_bytes default:
475	+	1 MB
476	+
477	+	tool execution timeout default:
478	+	30 seconds
479	+
480	+	shell timeout default:
481	+	15 seconds
482	+
483	+	HTTP timeout default:
484	+	20 seconds
485	+
486	+	The runtime must not recursively invoke tools forever.
487	+
488	+	Required CLI command surface:
489	+
490	+	app init
491	+	app onboard
492	+	app config validate
493	+	app config show
494	+	app provider list
495	+	app provider test NAME
496	+	app tool list
497	+	app tool run NAME --json ARGS
498	+	app agent
499	+	app agent -m MESSAGE
500	+	app memory list
501	+	app memory search QUERY
502	+	app memory show CONVERSATION_ID
503	+	app receipt list
504	+	app receipt verify
505	+	app estop
506	+
507	+	Optional commands:
508	+
509	+	app service install
510	+	app service start
511	+	app service stop
512	+	app service status
513	+	app sop list
514	+	app sop validate
515	+	app sop run NAME
516	+	app plugin list
517	+	app plugin install PATH
518	+
519	+	SOP engine, optional but valuable:
520	+
521	+	Implement deterministic workflows loaded from:
522	+
523	+	~/.appname/workspace/sops/<name>/SOP.toml
524	+
525	+	Minimum SOP format:
526	+
527	+	name = "daily-check"
528	+	description = "Run a daily workspace check"
529	+
530	+	[[steps]]
531	+	id = "list"
532	+	kind = "tool"
533	+	tool = "file_list"
534	+	args = { path = "." }
535	+
536	+	[[steps]]
537	+	id = "summarize"
538	+	kind = "agent"
539	+	prompt = "Summarize the file list from the previous step."
540	+
541	+	[[steps]]
542	+	id = "approval"
543	+	kind = "approval"
544	+	prompt = "Continue to write report?"
545	+
546	+	[[steps]]
547	+	id = "write"
548	+	kind = "tool"
549	+	tool = "file_write"
550	+	args = { path = "daily-check.txt", content_from = "summarize" }
551	+
552	+	Requirements:
553	+
554	+	- validate step IDs are unique
555	+	- validate referenced tools exist
556	+	- persist SOP run state
557	+	- stop at approval steps until approved
558	+	- support on_failure = "abort"
559	+	- support on_failure = "continue"
560	+
561	+	Plugin system, stretch goal:
562	+
563	+	A plugin is a directory:
564	+
565	+	plugin-name/
566	+	manifest.toml
567	+	executable-or-script
568	+
569	+	Minimum manifest:
570	+
571	+	name = "echo-plugin"
572	+	version = "0.1.0"
573	+	capabilities = ["tool"]
574	+
575	+	[[tools]]
576	+	name = "echo"
577	+	description = "Echoes input"
578	+	command = "./echo-plugin"
579	+	schema = { type = "object" }
580	+
581	+	The runtime discovers plugins under:
582	+
583	+	~/.appname/plugins/
584	+
585	+	Simpler acceptable version:
586	+
587	+	Support external process tools where the runtime invokes a configured executable with JSON on stdin and reads JSON from stdout.
588	+
589	+	Observability:
590	+
591	+	Minimum logging:
592	+
593	+	- human-readable logs to stderr
594	+	- structured JSON logs when APPNAME_LOG=json, adjusted to the executable name
595	+	- never log secrets
596	+
597	+	Log events:
598	+
599	+	- startup
600	+	- config path
601	+	- workspace path
602	+	- provider selected
603	+	- channel started
604	+	- conversation started
605	+	- tool requested
606	+	- tool approved
607	+	- tool denied
608	+	- tool completed
609	+	- tool failed
610	+	- receipt written
611	+	- memory persisted
612	+	- estop triggered
613	+
614	+	Optional metrics endpoint:
615	+
616	+	GET /metrics
617	+
618	+	Expose counters if the endpoint is implemented:
619	+
620	+	app_conversations_total
621	+	app_tool_calls_total
622	+	app_tool_denials_total
623	+	app_provider_errors_total
624	+	app_receipt_chain_valid
625	+
626	+	Emergency stop:
627	+
628	+	app estop
629	+
630	+	Creates:
631	+
632	+	~/.appname/ESTOP
633	+
634	+	When this file exists:
635	+
636	+	- no new tool calls may run
637	+	- existing long-running shell/http tasks should be cancelled if possible
638	+	- the agent may still answer text-only messages explaining that tool use is stopped
639	+
640	+	app estop --clear
641	+
642	+	Removes the file.
643	+
644	+	Acceptance tests:
645	+
646	+	Test 1: init creates expected files.
647	+
648	+	Given no ~/.appname directory
649	+	When app init runs
650	+	Then ~/.appname/config file exists
651	+	And memory database or memory JSONL exists
652	+	And workspace_dir exists
653	+
654	+	Test 2: config validation catches invalid autonomy.
655	+
656	+	Given autonomy = "godmode"
657	+	When app config validate runs
658	+	Then exit code is nonzero
659	+	And output mentions allowed values
660	+
661	+	Test 3: mock provider text-only response.
662	+
663	+	Given mock provider fixture returns "hello"
664	+	When app agent -m "hi" runs
665	+	Then stdout contains "hello"
666	+	And memory contains the user and assistant turn
667	+
668	+	Test 4: model-triggered file_list tool.
669	+
670	+	Given mock provider fixture emits tool_call file_list { path = "." }
671	+	When app agent -m "list files" runs
672	+	Then file_list executes inside workspace
673	+	And a tool receipt is written
674	+	And final answer includes the file list summary
675	+
676	+	Test 5: workspace escape blocked.
677	+
678	+	Given workspace_only = true
679	+	When model requests file_read { path = "/etc/passwd" }
680	+	Then tool is denied
681	+	And a denied receipt is written
682	+	And the provider receives a tool error
683	+
684	+	Test 6: supervised approval.
685	+
686	+	Given autonomy = "supervised"
687	+	When model requests file_write
688	+	Then CLI asks for approval
689	+	And default empty answer denies
690	+	And "y" approves
691	+
692	+	Test 7: forbidden command blocked.
693	+
694	+	When model requests shell { command = "rm -rf /" }
695	+	Then tool is blocked before execution
696	+	And receipt status is denied
697	+
698	+	Test 8: receipt chain detects tampering.
699	+
700	+	Given three receipts exist
701	+	When the second receipt is edited manually
702	+	Then app receipt verify reports invalid chain at receipt 2
703	+
704	+	Test 9: provider fallback, if reliable provider is implemented.
705	+
706	+	Given reliable provider = [bad_provider, mock_provider]
707	+	And bad_provider times out
708	+	When agent runs
709	+	Then runtime logs fallback
710	+	And response comes from mock_provider
711	+
712	+	Test 10: memory search.
713	+
714	+	Given a previous conversation contains "Aardvark adapter"
715	+	When app memory search "aardvark" runs
716	+	Then the previous conversation ID is returned
717	+
718	+	Implementation priorities:
719	+
720	+	First produce a working vertical slice:
721	+
722	+	1. application naming
723	+	2. init
724	+	3. config loading and validation
725	+	4. mock provider
726	+	5. CLI one-shot agent mode
727	+	6. tools: time, file_list, file_read
728	+	7. security policy for workspace paths
729	+	8. memory persistence
730	+	9. receipt writing and verification
731	+	10. tests
732	+
733	+	Then add:
734	+
735	+	11. interactive REPL
736	+	12. file_write with approval
737	+	13. shell with blocking rules
738	+	14. HTTP GET tool
739	+	15. OpenAI-compatible provider
740	+	16. optional gateway
741	+	17. optional SOP engine
742	+	18. optional external-process plugins
743	+
744	+	Quality requirements:
745	+
746	+	- Keep the implementation idiomatic for LANGUAGE.
747	+	- Do not quietly substitute another implementation language.
748	+	- Do not use Python, JavaScript, Rust, or C as the primary implementation language.
749	+	- Shell scripts are acceptable only for setup convenience.
750	+	- Prefer simple, boring dependencies.
751	+	- Write tests for denied actions, not just successful actions.
752	+	- Keep secrets out of logs.
753	+	- Keep workspace path handling strict and well-tested.
754	+	- Use deterministic mock fixtures so tests do not require network access.
755	+	- Update README.md with architecture, config, security policy, commands, and test instructions.
756	+
757	+	Do not stop after creating stubs. Implement the core behavior. If a feature is not practical in LANGUAGE, document the limitation and implement the closest useful equivalent.