The Indie Developer's AI Toolbox, 2026: What I Actually Use
AI tool roundups online share two problems: they are either paid promotion, or the author never used the tools in any depth.
This one is different: every tool here is something I use daily, pay for, and have hit real problems with. For each, I explain why I use it and when I don't.
Coding
Claude Code (primary workhorse)
What for: the execution layer for almost every coding task.
My workflow: Claude Code runs as a resident terminal session, paired with a CLAUDE.md memory system so context carries over between sessions. It's not just "write this code for me"; it's "help me manage an engineering project."
What really makes it work isn't how smart Claude is; it's the Hooks system. Once the AI has an audit layer, I'm genuinely comfortable letting it write files and run commands.
Cost: $20/month (Claude Pro). Worth it.
When not to use it: purely exploratory conversations, or questions that never touch the filesystem. The Claude.ai web app handles those without burning Claude Code quota.
Cursor (backup)
What for: when I need to see the effect of a change directly in an IDE.
Cursor's advantage is deep editor integration: you can accept or reject changes inline, which is visually more intuitive.
When not to use it: most of my work happens in the terminal, so Cursor's GUI is extra cognitive load for me. If you're a GUI-first developer, it will feel smoother than Claude Code.
GitHub Copilot (abandoned)
I used it for two years. Since Claude Code came out, I've barely opened it.
Why: Copilot's completion model fits the "I know where I'm going, I just need speed" scenario; Claude Code's conversational model fits "I need the AI to help me decide." These days I'm mostly in the second.
Model Layer
Claude Sonnet 4.6 (daily driver)
What for: 90% of coding tasks, content generation, and analytical reasoning.
The best balance of speed and quality. Opus is smarter but three times slower; Haiku is faster but noticeably weaker. Sonnet is my default.
Qwen3-235B (local, high-quality reasoning)
What for: high-quality reasoning tasks that have to run locally.
Runs on a Mac Studio via omlx, 4-bit quantized. 8.5/10 (my subjective score), close to Claude-level quality. I use it whenever I don't want data leaving the machine: internal document analysis, local agent tasks.
See: MLX Local Models in Practice.
When not to use it: when response speed matters; local model latency is clearly higher than a cloud API's.
MiniMax M2.5 (cost-effective cloud)
What for: a cloud-side supplement for medium-complexity tasks, at about 1/5 the cost of Claude.
I use it to run my Liaison Agent (Feishu message handling). Responses are fast enough, its Chinese comprehension is good, and it suits high-frequency, low-complexity agent tasks.
Productivity
Raycast AI (everyday Q&A)
What for: quick answers to small questions when I don't want to open a browser.
⌘ Space → type → answer. Ten seconds faster than opening Claude.ai. Don't underestimate those ten seconds; across a day they add up.
Context7 MCP (live documentation)
What for: pulling up-to-date library docs inside Claude Code.
Once registered over the MCP protocol, Claude automatically fetches current documentation when it's unsure about an API, instead of leaning on stale knowledge from its training data.
Astro 5, Tailwind v4, the latest Vercel configuration: it's what keeps all of them accurate.
Perplexity (real-time information)
What for: real-time lookups that need cited sources.
The difference from Claude: Perplexity is for "what's the current state of this" (real-time + citations); Claude is for "help me analyze this" (reasoning + generation).
Content
Claude.ai (content generation)
What for: first drafts of blog posts, structural design, copy polishing.
Most articles on this site go through the same pipeline: I write an outline → Claude expands it into a draft → I rewrite heavily and inject my own experience → publish.
The key point: Claude's drafts are usually too complete, too correct, and too boring. My job is to put my real voice back in.
Notion AI (note organization)
What for: turning messy notes into structured content.
I never use it to write anything new, only to organize what already exists. Its integration inside the Notion environment is excellent.
A Few Principles for Choosing Tools
1. Evaluate the tool layer and the model layer separately
Claude Code being good doesn't mean the Claude model is the best. The tool layer (workflow integration) and the model layer (reasoning quality) are two different dimensions; choose each on its own.
2. Deep use of two or three tools beats shallow use of ten
I've tried a lot of tools; no more than eight are in my daily rotation now. Each one I use deeply enough to understand where its limits are.
3. Local vs cloud: let data sensitivity decide
Financial data, private files, internal code: local models. Public content, routine tasks: cloud APIs. Don't be dogmatic about it; classify by data sensitivity.
4. Tool costs belong in the overall ROI calculation
I spend about $40/month on AI tools (Claude Pro + Perplexity Pro). Against the time they save, the ROI easily exceeds 10x. But only if you actually use them deeply, not if the subscriptions sit idle.
Directions Worth Watching in 2026
Agent infrastructure: not single AI tools, but systems of cooperating agents. The 30+ microservice setup I'm already running is one such direction.
Local models becoming practical: Qwen3-235B at 4-bit reaches close to Claude-level quality on a Mac, and the trend will continue. Data sovereignty will matter to more and more people.
Tool consolidation: the AI tool landscape will shift from explosive growth to consolidation, and the tools with real depth will win.
Related: Claude Code Memory Architecture · MCP Protocol in Practice · MLX Local Models · A One-Person AI Company
Every few months someone publishes another “Top 10 AI Tools” roundup. This isn’t that.
This is the stack I actually use, as of early 2026 — tools that survived real work, not demos. I’ll explain why each one made the cut, what I use it for, and what I dropped.
Coding
Claude Code (primary)
My main coding environment. What keeps me here: the CLAUDE.md memory system. I can write project-specific context — architecture decisions, naming conventions, constraints — and Claude Code reads it every session. The agent doesn’t start fresh every time.
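For concreteness, CLAUDE.md is plain markdown that the agent reads at the start of a session. A minimal hypothetical example (the project details below are invented, not my actual file):

```markdown
# Project context for Claude Code

## Architecture
- Astro 5 static site, deployed on Vercel
- Content lives in `src/content/`; never edit generated files in `dist/`

## Conventions
- TypeScript strict mode; no default exports
- Commit messages follow Conventional Commits

## Constraints
- Never touch `.env*` files or run publish commands without asking
```

The value isn't any single line; it's that decisions you'd otherwise re-explain every session are written down once.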
The Hooks system adds a safety layer: I can define pre/post rules for file writes and commands. That matters when you’re running an agent that executes code autonomously.
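What such a guard can look like: a PreToolUse hook is an external command that receives a JSON payload describing the pending tool call on stdin, and a specific non-zero exit code vetoes the call (check Anthropic's current hooks documentation for the exact schema; the field names and the `ALLOWED_ROOTS` policy below are assumptions for illustration). A minimal sketch in Python:

```python
import json
import sys
from pathlib import Path

# Hypothetical policy: the agent may only write inside these project roots.
ALLOWED_ROOTS = [Path("src"), Path("tests")]

def should_block(payload: dict) -> bool:
    """Return True if the pending tool call writes outside the allowed roots.

    Assumes file-writing tool calls carry a `tool_input.file_path` field,
    which is how Claude Code's hook payloads are commonly described.
    """
    file_path = payload.get("tool_input", {}).get("file_path")
    if file_path is None:
        return False  # not a file write; let other hooks decide
    target = Path(file_path)
    return not any(target.is_relative_to(root) for root in ALLOWED_ROOTS)

def run_hook() -> int:
    """Entry point when installed as a hook: read payload, veto or allow.

    Wire it up with `sys.exit(run_hook())`; exit code 2 is the veto,
    and anything printed to stderr is surfaced back to the model.
    """
    payload = json.load(sys.stdin)
    if should_block(payload):
        print("blocked: write outside allowed project roots", file=sys.stderr)
        return 2
    return 0
```

The point is not this particular rule but that the policy is code you wrote, running outside the model, on every single tool call.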
Cost: $20/mo on the Pro plan. For the hours it saves, that’s not a debate.
Cursor (backup)
Better if you’re IDE-first. The inline editing flow feels more natural if you spend most of your time in a file tree. I keep it installed; I use it when Claude Code is overkill for a quick edit.
GitHub Copilot (abandoned)
I used it for a year. Claude Code replaced it entirely. The autocomplete is fine, but the whole-session context that Claude Code provides is a different category of tool.
Model Layer
The tool layer and the model layer are separate decisions. Same tool, different model = different results.
Claude Sonnet 4.6 (default)
Handles 90% of my tasks. Best speed-to-quality ratio I’ve found. Fast enough for interactive use, good enough for complex reasoning. I don’t reach for anything else unless I have a specific reason.
Qwen3-235B (local, via omlx)
Running 4-bit quantized on a Mac Studio. Quality benchmark: 8.5/10 — genuinely competitive with cloud frontier models for most tasks.
Why local: data sensitivity. Anything involving private business data, client information, or unreleased code goes through the local model. Not because I distrust the cloud providers, but because clean data hygiene is easier to maintain if it’s a rule, not a judgment call.
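Making it a rule rather than a judgment call can be literal: the routing decision lives in code. A minimal sketch, assuming the local model sits behind an OpenAI-compatible endpoint (the URLs, tag names, and backend names here are all hypothetical):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Backend:
    """A model endpoint; only the routing rule matters for this sketch."""
    name: str
    base_url: str

LOCAL = Backend("qwen3-235b-4bit", "http://localhost:8080/v1")   # hypothetical local server
CLOUD = Backend("claude-sonnet", "https://api.anthropic.com")

# Anything carrying one of these tags never leaves the machine.
SENSITIVE_TAGS = {"financial", "private", "internal-code"}

def pick_backend(tags: set[str]) -> Backend:
    """Route by data sensitivity: any sensitive tag -> local model, else cloud."""
    if tags & SENSITIVE_TAGS:
        return LOCAL
    return CLOUD
```

Because the check is a set intersection on explicit tags, a task is either sensitive or it isn't; there is no "probably fine, send it to the cloud" middle ground to rationalize under deadline pressure.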
MiniMax M2.5 (cloud)
About 1/5 the cost of Claude at comparable quality for simpler tasks. I use it for high-frequency, low-complexity agent work — things like routing messages, summarizing logs, triaging inboxes. Running these through Claude would work fine; it would also cost 5x more.
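The "5x more" claim is easy to sanity-check with back-of-the-envelope numbers. The prices below are illustrative placeholders, not either vendor's actual price sheet; only the 1/5 ratio comes from above:

```python
# Illustrative blended prices per 1M tokens; the 1/5 ratio is the point.
CLAUDE_USD_PER_MTOK = 5.0
MINIMAX_USD_PER_MTOK = CLAUDE_USD_PER_MTOK / 5

def monthly_cost(calls_per_day: int, tokens_per_call: int,
                 usd_per_mtok: float, days: int = 30) -> float:
    """Rough monthly spend for a high-frequency agent workload."""
    total_tokens = calls_per_day * tokens_per_call * days
    return total_tokens / 1_000_000 * usd_per_mtok

# A hypothetical message-routing agent: 2,000 small calls/day, ~1,500 tokens each.
claude_cost = monthly_cost(2000, 1500, CLAUDE_USD_PER_MTOK)
minimax_cost = monthly_cost(2000, 1500, MINIMAX_USD_PER_MTOK)
```

At this call volume the absolute gap, not the ratio, is what decides: for ten calls a day nobody cares about 5x, but for thousands it's the difference between a rounding error and a real line item.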
Productivity
Raycast AI
Cmd+Space, ask a question, get an answer, keep working. No context switching to a browser tab. I use it for quick lookups, unit conversions, regex checks — things that used to cost 30 seconds of tab-switching that now cost 3 seconds.
Context7 MCP
Plugs into Claude Code via the MCP protocol. Pulls real-time library documentation into the coding session. When I’m working with a framework I don’t know well, this is the difference between AI that hallucinates outdated API signatures and AI that gives me the current one.
Perplexity
My replacement for passive news consumption. I use it for directed searches with citations — “what changed in X in the last 3 months,” “compare approaches A and B.” I’m looking for specific information, not a feed.
Content
Claude.ai
Article drafts. I describe the structure and key points; Claude produces a draft; I rewrite heavily. The draft is a scaffold, not a finished product. The final text is usually 60%+ original by the time I’m done.
What Claude is bad at: knowing what I actually think. It will produce plausible-sounding opinions that aren’t mine. That’s the thing I edit out.
Notion AI
Organizing existing notes, not generating new content. Good at pulling structure out of a messy brain dump. I don’t use it for writing because the output sounds like Notion AI.
How I Think About Tool Selection
Evaluate tool layer and model layer separately. Claude Code + Qwen3-235B is a valid combination. The interface question and the model question have different answers.
Go deep on 2-3 tools, not shallow on 10. Every tool has a learning curve before you hit the productivity payoff. Spreading across too many tools means you never get past the curve on any of them.
Local vs cloud: decide by data sensitivity, not by capability. For most tasks, the cloud models are good enough. The question is what’s in the prompt.
Calculate real ROI. Claude Code at $20/mo saves me several hours a week. That math is easy. A tool that costs $0 but adds friction isn’t free.
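Spelled out with stand-in numbers (only the $20 subscription comes from above; the hours saved and the hourly rate are assumptions you should replace with your own):

```python
def monthly_roi(subscription_usd: float, hours_saved_per_week: float,
                usd_per_hour: float, weeks_per_month: float = 4.3) -> float:
    """Value of time saved per month divided by what the tool costs."""
    value_saved = hours_saved_per_week * weeks_per_month * usd_per_hour
    return value_saved / subscription_usd

# "Several hours a week": take 4 h at a modest $50/h as a stand-in.
roi = monthly_roi(subscription_usd=20.0, hours_saved_per_week=4.0, usd_per_hour=50.0)
```

Even halving both assumptions leaves a large multiple, which is why the subscription price is rarely the interesting variable; the hours actually saved are.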
What I’m Watching in 2026
Agent infrastructure is maturing. Running agents that operate autonomously across sessions — with memory, hooks, and tool access — is becoming practical. Not just chatbots.
Local models are actually usable now. Qwen3-235B at 4-bit would have been implausible 18 months ago. The gap between local and cloud is closing faster than expected.
Tool consolidation is coming. The current landscape has too much overlap. Some of these tools won’t exist in their current form in two years. I’m betting on tools with strong infrastructure bets (MCP, memory systems) over tools with clever UI.
The stack will look different in a year. That’s fine. The point isn’t to find the permanent answer — it’s to make deliberate choices and update them when the evidence changes.