If you’ve been following my Joshua8.AI experiments, you know I’ve been humiliating frontier LLMs with thinking games. It started with a blog post showing top models failing miserably to guess a 3-digit PIN—a simple brute-force puzzle any child could solve. That post exploded and sparked hundreds of conversations about AI hype versus reality.
Then came Battleship.
After five full games against Qwen3-30B-VL running locally on my RTX 5090, the score stood at Human 5 – AI 0.
The AI improved when I had Claude Opus 4.5 coach it on strategy—finish sinking a ship before hunting elsewhere—but it still clustered shots inefficiently, abandoned partial hits, and never fully covered the board.
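For readers who want to see what that coaching advice looks like in code, here is a minimal hunt-and-target sketch in Python. It is an illustration only, not the code the app or the model actually ran: the function name `next_shot`, the `shots` dictionary, and its "miss" / "hit" / "sunk" labels are all assumptions made for this example.

```python
from typing import Dict, Optional, Tuple

Cell = Tuple[int, int]
SIZE = 10  # 10x10 board, as in the app


def next_shot(shots: Dict[Cell, str]) -> Optional[Cell]:
    """Pick the next cell to fire at.

    `shots` maps already-fired cells to "miss", "hit" (ship not yet sunk),
    or "sunk". The name and the format are assumptions for this sketch.
    """
    def untried(cell: Cell) -> bool:
        r, c = cell
        return 0 <= r < SIZE and 0 <= c < SIZE and cell not in shots

    # Target mode: an open hit means a wounded ship, so probe its neighbours first.
    for (r, c), result in sorted(shots.items()):
        if result == "hit":
            for cand in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if untried(cand):
                    return cand

    # Hunt mode: sweep one checkerboard colour, then the other,
    # so the whole board is eventually covered.
    cells = [(r, c) for r in range(SIZE) for c in range(SIZE)]
    for parity in (0, 1):
        for cell in cells:
            if (cell[0] + cell[1]) % 2 == parity and untried(cell):
                return cell
    return None  # every cell has been tried
```

The point of the sketch is the ordering: open hits are always resolved before the search moves on, which is exactly the habit the model kept dropping.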
Entertaining? Absolutely. But here’s the bombshell:
The AI lost every thinking game. But it won at coding.
The entire Battleship experience—React web app with 10×10 grids, saved games, resume functionality, sunk notifications, dynamic visuals, and a tool-augmented AI opponent—was one-shotted by Claude Opus 4.5 and its swarm of agents. One prompt. A polished, functional application in minutes. No coding from me. No debugging.
This is agentic coding: AI that doesn’t just generate text but autonomously builds real software using tools, reasoning, iteration, and specialized sub-agents. For small and medium-sized businesses (SMBs), it’s a game-changer.
Why Agentic Coding Matters for SMBs
Traditional custom software development is slow, expensive, and risky. Months of work. $30k–$100k budgets. Endless scope creep.
With agentic AI, you can prototype production-grade tools in hours. My Battleship app included:
- Clean React frontend with interactive grids
- Persistent game state (save and resume)
- Automatic sunk-ship detection and visual feedback
- A local LLM opponent that could actually find ships (sinking them was another story)
All from one prompt. That’s not incremental progress—it’s a 10× leap in speed-to-value.
The Secret: Systems-Level Specification
One-shot success requires crystal-clear specifications, written with something close to systems-engineering rigor. You need to define:
- Inputs: What data will the AI receive?
- Behavior: What should it do? What are the edge cases?
- Output format: File structure, code style, documentation standards
Without this precision, even the best models hallucinate or miss critical requirements. Tools like GitHub Spec Kit help create structured prompts. A well-defined system spec with precise schemas, error-handling rules, and acceptance criteria turns a vague request into a reliable one-shot win.
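One way to picture this is to write the spec down as structured data before it ever becomes a prompt. The sketch below uses a plain Python dataclass; the class name `Spec`, its field names, and the rendered layout are hypothetical choices for illustration and are not Spec Kit's format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Spec:
    """A hypothetical one-shot spec; every field name here is an assumption."""
    goal: str
    inputs: List[str]
    behavior: List[str]
    edge_cases: List[str]
    output_format: List[str]
    acceptance_criteria: List[str] = field(default_factory=list)

    def missing_sections(self) -> List[str]:
        """Return the names of sections that were left empty."""
        return [name for name, value in vars(self).items() if not value]

    def to_prompt(self) -> str:
        """Render the spec as prompt text, refusing to render an incomplete one."""
        gaps = self.missing_sections()
        if gaps:
            raise ValueError(f"spec is incomplete: {', '.join(gaps)}")
        lines = [f"Goal: {self.goal}"]
        for name, value in vars(self).items():
            if name == "goal":
                continue
            lines.append(f"\n{name.replace('_', ' ').title()}:")
            lines.extend(f"- {item}" for item in value)
        return "\n".join(lines)
```

Refusing to render an incomplete spec is the useful part: a missing edge case or acceptance criterion gets caught before the prompt is sent, not after the agent has built the wrong thing.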
What This Means for Small Business
Rapid Prototyping Without Dev Teams
A boutique retailer could prompt: “Build a local RAG system that reads my inventory Excel file and predicts stockouts.” With a tight spec, Claude generates the code, wires in pandas to read the spreadsheet, and hands you a tool that runs offline on your own machine. No cloud fees. No data leaks. No vendor lock-in.
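To make the stockout half of that prompt concrete, here is a rough pandas sketch. It is a simple run-rate projection, not a RAG system or a real forecasting model, and the column names `sku`, `on_hand`, and `avg_daily_sales`, the function names, and the 7-day lead time are all assumptions for the example.

```python
import pandas as pd


def days_until_stockout(inventory: pd.DataFrame) -> pd.DataFrame:
    """Estimate days of stock left per item from average daily sales.

    Expects columns `sku`, `on_hand`, and `avg_daily_sales`; these names
    are assumptions for this sketch, not a real store's schema.
    """
    out = inventory.copy()
    # Items with no recent sales get NaN: they never run out under this model.
    sales = out["avg_daily_sales"].where(out["avg_daily_sales"] > 0)
    out["days_left"] = out["on_hand"] / sales
    return out.sort_values("days_left", na_position="last").reset_index(drop=True)


def at_risk(inventory: pd.DataFrame, lead_time_days: int = 7) -> list:
    """List SKUs projected to run out within the supplier lead time."""
    report = days_until_stockout(inventory)
    return report.loc[report["days_left"] <= lead_time_days, "sku"].tolist()
```

In practice the DataFrame would come from the spreadsheet itself, for example via `pd.read_excel` on a file you name, which needs an Excel engine such as openpyxl installed.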
Custom Agents That Actually Finish
My Battleship opponent found ships but rarely sank them—exactly what most off-the-shelf AI agents do with leads or customer issues. They identify opportunities, then abandon follow-through.
Agentic coding lets you build custom loops: “If a lead engages, exhaust adjacent actions (email, call, text) before moving on.” One client, a small construction firm, now uses a bid-estimating agent that “hunts” material costs, “finishes” with risk-adjusted pricing, and outputs complete reports. Result: 30% faster bids, 15% higher win rates.
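The “exhaust adjacent actions before moving on” rule is just a loop with a stopping condition. A minimal sketch, assuming each contact channel is a function that returns True when the lead responds; `follow_up`, `work_leads`, and the channel names are hypothetical, and this is not the construction client's actual agent.

```python
from typing import Callable, Dict, List

# A channel function attempts contact and returns True if the lead responded.
Channel = Callable[[str], bool]


def follow_up(lead: str, channels: Dict[str, Channel]) -> List[str]:
    """Try each channel in order for one lead, stopping at the first response.

    Returns a log of every attempt, so nothing is silently abandoned.
    """
    log = []
    for name, attempt in channels.items():
        responded = attempt(lead)
        log.append(f"{name}: {'responded' if responded else 'no response'}")
        if responded:
            break
    return log


def work_leads(leads: List[str], channels: Dict[str, Channel]) -> Dict[str, List[str]]:
    """Finish one lead completely before starting the next."""
    return {lead: follow_up(lead, channels) for lead in leads}
```

The design choice mirrors the Battleship fix: the agent is not allowed to pick up a new lead until the current one has either responded or run out of channels.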
Cost and Risk Reduction
My entire machine cost ~$5k (mostly GPUs at inflated prices). That’s less than a single month of many enterprise cloud AI subscriptions. Local deployment means unmetered usage, zero token costs, and data that never leaves your own hardware.
Training and Knowledge Capture
Instead of expensive external courses, prompt Claude to create tailored bootcamps from your internal documents: “Generate a RAG tutorial for my team based on our CRM notes.” Done in one shot.
The Catch (and How to Mitigate It)
Agentic AI can introduce subtle bugs or produce messy code. That’s why I test everything locally in sandboxes and run production on private hardware. SMBs should start small:
- Identify one high-value workflow (inventory, lead follow-up, reporting)
- Prototype with Claude or similar agentic tools
- Deploy locally with vLLM and simple integrations
- Iterate with human oversight
The Bottom Line
AI loses at thinking games like Battleship. It clusters shots, forgets partial hits, and lacks the spatial intuition a ten-year-old develops naturally.
But that same AI built the entire game while I was still thinking about the rules.
Claude Opus 4.5 didn’t just write code—it built a complete, interactive application in one prompt. That level of acceleration is available to small businesses right now, especially when paired with rigorous system-level specifications.
The lesson isn’t that AI is smart or dumb. It’s that AI is differently capable: spectacular at structured generation, mediocre at open-ended reasoning. The businesses that thrive will be the ones that understand this distinction and deploy AI where it excels.
If you’re tired of slow, expensive tech and overhyped cloud AI, agentic coding is your unfair advantage. Build fast. Stay private. Keep control.