I prefer CLI

Why? Multi-tenant environments. First, we need to distinguish three environments:

End-user UI
Agent Runtime Environment
LLM Server

When you run Claude Code on your local MacBook, the first two are both local. The third is usually the Claude.ai server.
When you ssh into a virtual private server (VPS) and install Claude Code there, the first two are both on the remote server. The third is still the Claude.ai server.
When you run Claude RC on your VPS and code from your iPad using the Claude app, the end-user UI is on your iPad, the agent runtime environment is on your VPS, and the LLM server is still Claude.ai.
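
The three setups above can be written down as plain data, which makes it easy to see which environments share a machine. This is only an illustrative sketch: the `Setup` class, its field names, and the `colocated` helper are hypothetical names chosen here, not part of any Claude tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Setup:
    """Where each of the three environments lives (hypothetical model)."""
    end_user_ui: str
    agent_runtime: str
    llm_server: str

    def colocated(self) -> bool:
        # True when the UI and the agent runtime share one machine.
        return self.end_user_ui == self.agent_runtime


SETUPS = {
    "Claude Code on a local MacBook": Setup("MacBook", "MacBook", "Claude.ai"),
    "Claude Code over ssh on a VPS": Setup("VPS", "VPS", "Claude.ai"),
    "Claude RC on a VPS, Claude app on iPad": Setup("iPad", "VPS", "Claude.ai"),
}

for name, setup in SETUPS.items():
    print(f"{name}: UI and runtime colocated = {setup.colocated()}")
```

Only the third setup splits the UI from the runtime, which is why it matters where auth data ends up.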

Most people physically separate their tenancies, for example by running Claude Code on separate personal and work laptops. So in most cases, it's not a big deal.

But when you need multi-tenancy, it becomes super stressful. For example, say you have two different toolkits:

personal toolkits (personal Notion, personal Sentry, personal Linear)
workplace toolkits (company Notion, company Sentry, company Linear)

Most MCP auth states and coding harnesses don't support profiles, so you can be logged in to only one of them at a time.

So a natural evolution was to have both:

a personal VPS with all personal toolkits set up
a workplace VPS with all workplace toolkits set up

to physically isolate tenancies.

That solves the multiple-profile issue, but the client-side problems persist. Let's get back to the environments:

End-user UI
Agent Runtime Environment
LLM Server

IMHO, all MCP and toolkit auth info should always be stored in the Agent Runtime Environment. However, a surprising number of harnesses tie it to the LLM server (such as Codex Apps or Claude.ai Plugins) or put it in the end-user UI (Claude Desktop or Codex Desktop).

Now the problem is:

If the auth data is stored on the LLM server, you cannot reuse one LLM account across tenants.
If the auth data is stored in the end-user UI, you cannot use the same app to access multiple tenants.

The only way to reliably isolate each tenant's auth information is thus:

You ssh to a virtual private server (VPS) and run Claude Code there. Never use LLM server plugins.

Then

End-user UI
Agent Runtime Environment

both live on the isolated VPS, and

the LLM Server holds no information about the tenancy.

This way, you can give each tenant its own toolkits, creating multiple isolated dev environments.
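
The argument above boils down to one check: two tenants get mixed up wherever they share the environment that stores the tool logins. A minimal sketch of that check, assuming hypothetical names throughout (`Tenant`, `colliding_locations`, and the host and account labels are made up for illustration):

```python
from dataclasses import dataclass

LOCATIONS = ("end_user_ui", "agent_runtime", "llm_server")


@dataclass(frozen=True)
class Tenant:
    name: str
    end_user_ui: str
    agent_runtime: str
    llm_server: str  # the LLM account used on the server side


def colliding_locations(a: Tenant, b: Tenant) -> list[str]:
    """Places where storing tool logins would mix the two tenants."""
    return [loc for loc in LOCATIONS if getattr(a, loc) == getattr(b, loc)]


# Hypothetical values: one VPS per tenant, one shared LLM account.
personal = Tenant("personal", "personal-vps", "personal-vps", "shared-llm-account")
work = Tenant("work", "work-vps", "work-vps", "shared-llm-account")

print(colliding_locations(personal, work))
```

With these values only `llm_server` is shared, so auth kept in the agent runtime stays separated per tenant while the same LLM account can still be reused, provided no plugins store logins on the server side.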

Backlinks (1)

260619

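The Spread of the World

Commute time has an interesting property. It is not that commute times shrank as transport improved; rather, a "reasonable commute time" comes first, and cities take shape around it (Marchetti's Constant). In other words, it is not that the city comes first and people then pick where to live within it; cities sprout wherever people become able to live (wherever the travel time is bearable). That is why commute times do not shrink even as transport improves. Instead, the living area expands out to however far you can now travel in the same commute time. Curious, isn't it?

Might the economy be similar? Perhaps it is not that the economy comes first and people pick jobs from it, but that time, focus, sense of responsibility, cooperation, confidence, stamina, and so on define the work to be done, and the economy takes shape as a result. If so, a society in which technology advances and humans simply idle will never arrive. Technology extends free will, so tasks that used to be too hard get absorbed as new work to do, and technological progress turns out to have little to do with freedom from labor.

The silver lining is that this also means people will not lose their jobs even as technology advances. Humans are remarkably good at creating new goals, setting new standards, and inventing new desires. That is how the plasticity of free will has expanded the world. As our abilities improve, we take on correspondingly more to do, and so we have expanded our world.

So technological acceleration is not an apocalyptic device that brings about the saturation of the old world, but a cognitive means of transport that pioneers a new one. Technological acceleration will break down the excuses of labor shortages and high costs. Just as cities grew to fit the range of a one-hour commute, the economy will expand into new domains according to the amount of work humans can do each day.

Is humanity, then, a puppet of the technological explosion? Technological acceleration only supplies the power for expansion; it does not set the direction. Just as urban expansion does not guarantee a city that is good for everyone, technological progress does not automatically guarantee an economy that is good for everyone. Depending on how we build social trust during a technological transition, our cities can become slums or paradises. But how hollow it is to say that, because slums are possible, we should give up the possibility of paradise as well. When technology throws open a newly expanding world, what cities will we build there, and what economy will we design?

Stray thoughts of techno-optimist 조성현, 260204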

Backlinks (1)

260204

AutoBuilder

Inspired by karpathy/autoresearch. Put this in a Ralph Loop.

Use each mode-specific prompt together with the common element block.
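
For readers unfamiliar with the pattern: a Ralph Loop, as the term is used here, re-runs the same prompt with a fresh agent until the agent signals that it is done. A minimal sketch, assuming a hypothetical `run_agent` callable that starts a fresh agent and returns its final output, and a made-up promise string (`ALL CLEAN`); neither is a real CLI or API, and the actual completion promise is not given in this note.

```python
from typing import Callable, Optional


def ralph_loop(
    prompt: str,
    run_agent: Callable[[str], str],
    completion_promise: str,
    max_iterations: int = 20,
) -> Optional[int]:
    """Re-run the same prompt until the agent's output contains the promise.

    Returns the iteration that finished, or None if the cap was reached.
    """
    for iteration in range(1, max_iterations + 1):
        output = run_agent(prompt)  # assumed to start a fresh agent each call
        if completion_promise in output:
            return iteration
    return None


# Stand-in agent for demonstration: reports completion on its third run.
replies = iter(["edited a.py", "edited b.py", "nothing left to fix: ALL CLEAN"])
print(ralph_loop("STOP! Re-read all code.", lambda _prompt: next(replies), "ALL CLEAN"))
```

The iteration cap is a design choice of this sketch so that a loop whose promise never appears still terminates.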

Auto Refactor

Prompt

STOP! Re-read all code. Would Karpathy approve every line? Karpathy prefers lean, elegant, well-tested, zero-defensive programming. Use MCPs and web searches.

Completion Promise

Auto Fixer

Prompt

STOP! Re-read all code, assess PR comments. Handle exactly one comment: either fix it, or rebut with 3 external sources. Fix any dirt found along the way. Lean, elegant, zero defensive programming.

Completion Promise

Auto Builder

Prompt

STOP! Re-read all code, assess GitHub Issues. Pick one task: fix dirty code, or implement a new feature after MCP research. Lean, elegant, zero defensive programming.

Completion Promise

Common Element

Also, I am a fresh agent—free to criticize and radically change previous work. Karpathy's philosophy: delete and simplify. Code is liability; prefer well-maintained libraries over custom code. UI libraries: optimize, don't delete. Re-read all the sources from zero. Use MCPs and web searches—traditional knowledge is stale. Commit and push at the loop end. Any edit means I need a fresh iteration. SWOT analysis first, then work.
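
Combining a mode-specific prompt with the common element can be as simple as string concatenation. A minimal sketch, assuming a blank line as the separator (the note does not specify one) and using abbreviated copies of the prompts; `MODE_PROMPTS`, `COMMON_ELEMENT`, and `build_prompt` are hypothetical names.

```python
# Abbreviated copies of the prompts above; paste the full text in real use.
MODE_PROMPTS = {
    "refactor": "STOP! Re-read all code. Would Karpathy approve every line?",
    "fixer": "STOP! Re-read all code, assess PR comments.",
    "builder": "STOP! Re-read all code, assess GitHub Issues.",
}
COMMON_ELEMENT = "SWOT analysis first, then work."


def build_prompt(mode: str) -> str:
    """Join one mode-specific prompt with the common element block."""
    # Assumption: a blank line between the two parts; the source does not
    # specify a separator.
    return f"{MODE_PROMPTS[mode]}\n\n{COMMON_ELEMENT}"


print(build_prompt("fixer"))
```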

Detailed review


<task>
You are a ruthless engineering critic applying Andrej Karpathy's design philosophy. Read the architecture plan at PLAN LINK.

Karpathy's core principles:
- Code is liability. Every line you write is a line you must maintain.
- Delete and simplify. If something can be removed without breaking the system, remove it.
- Prefer well-maintained libraries over custom code.
- Zero-defensive design. Don't code for hypotheticals that haven't happened yet.
- Start with the simplest thing that works. Add complexity only when forced by reality.
- "Demo is works.any(), product is works.all()" -- but V1 is closer to demo than product.
- Overfit a single batch before scaling up.

Apply these principles to the plan. For each section, ask:
1. Is this needed for V1, or is it speculative engineering?
2. Can this be deleted or simplified without losing core value?
3. Is this solving a problem we actually have, or a problem we might have?
4. Would a 10x engineer look at this and say "too much"?

Be brutal. Identify:
- **OVER-ENGINEERING**: Things designed for scale/problems that don't exist yet
- **UNNECESSARY COMPLEXITY**: Things that add cognitive load without proportional value
- **PREMATURE ABSTRACTIONS**: Separations that aren't justified at V1 scale
- **DELETE CANDIDATES**: Sections, tables, fields, or features that should be cut from V1

This is a V1 product being built by a small team. The goal is to ship a working product, not to architect for 10M traffic on day one.

Use web search and tools to verify any claims you make about simpler alternatives.
</task>

<structured_output_contract>
Return findings in these sections:
1. VERDICT: Would Karpathy approve? One line.
2. DELETE: Things to remove entirely
3. SIMPLIFY: Things to keep but make simpler
4. KEEP: Things that are correctly lean
5. THE LEAN V1: What the plan SHOULD look like if you strip it to essentials
</structured_output_contract>

<grounding_rules>
- Be specific. Don't say "simplify the schema" -- say which fields to cut.
- Every DELETE must justify what you lose and why it's acceptable for V1.
- Every KEEP must justify why it's essential, not just nice-to-have.
- Think from the perspective of "what do I need to ship in 2 weeks?"
</grounding_rules>
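
Because the output contract names five numbered sections, a reply that follows it can be split mechanically for later processing. A rough sketch, assuming the model writes each heading as `N. NAME:` at the start of a line; the `split_review` helper and the sample reply are made up for illustration and are not real review output.

```python
import re

SECTIONS = ["VERDICT", "DELETE", "SIMPLIFY", "KEEP", "THE LEAN V1"]

# Matches headings such as "2. DELETE:" at the start of a line.
HEADING = re.compile(
    r"^\s*\d+\.\s*(" + "|".join(re.escape(s) for s in SECTIONS) + r")\s*:?\s*",
    re.MULTILINE,
)


def split_review(text: str) -> dict[str, str]:
    """Map each section name to the text that follows its heading."""
    matches = list(HEADING.finditer(text))
    result = {}
    for current, following in zip(matches, matches[1:] + [None]):
        end = following.start() if following else len(text)
        result[current.group(1)] = text[current.end():end].strip()
    return result


# Made-up sample reply, only to exercise the parser.
sample = """1. VERDICT: No, too much ceremony.
2. DELETE:
- audit_log table
3. SIMPLIFY:
- one service instead of three
4. KEEP:
- Postgres
5. THE LEAN V1:
One app, one database."""

review = split_review(sample)
print(review["VERDICT"])
```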


Backlinks (12)

