Claude vs. Codex vs. Human Dev: A Reality Check

Last Refreshed //

I have to admit I’m in awe of the Twitter/X developers who proudly announce they don’t write or review LLM-generated code—they just manage 10, 20, or 100 agents.

Meanwhile, I use LLMs for what they’re good at: bug fixing and wrestling with CSS.

For everything else, I still roll up my sleeves.

image

With some free time, I decided to test how these "magical" tools handle a greenfield project.
For existing projects with well-defined structures, they add features smoothly. But for something new? Let’s just say the results were educational.

The Challenge

Date: March 2026
Contenders: Claude Sonnet vs. Codex GPT 5.3
Task: Build a simple Angular + Spring Boot application with a database and role-based access.
Referee: Mistral AI (for neutral specs)
Rules: No custom instructions, no hand-holding—just raw, unfiltered LLM output.

You can find the specification used at the end of the post.

Building and Executing the Application

Both LLMs took 20–30 minutes to generate the code. The build failed immediately.

  • Claude struggled with Angular syntax and database naming conventions.
  • Codex had fewer issues but still hit snags with Spring AI compatibility.
  • Verdict: Codex was smoother, but neither LLM produced a working app out of the box.

Versioning: A Dumpster Fire

Java 25

  • Requested: Java 25
  • Codex: Followed instructions.
  • Claude: Ignored the request entirely. Bad LLM.

Spring Boot

  • Requested: Latest release (4.0.3 as of March 2026)
  • Codex: 3.5.0
  • Claude: 3.4.0
  • Reality: Both used outdated, unpatched versions—a security risk.

Angular

  • Requested: Angular 21
  • Codex: Angular 18 (with deprecated syntax like *ngIf)
  • Claude: Angular 19 (using modern @if)
  • At least they used standalone components.

Spring AI: The Biggest Pain Point

  • Requested: Spring AI 2 (SNAPSHOT, but compatible with Spring Boot 4)
  • Claude: Downgraded to Spring AI 1.0.0-M6 (a pre-release, not production-ready).
  • Codex: Settled on Spring AI 1.0.3 (stable but outdated).
  • Result: 20+ minutes wasted on dependency hell.

On the left you have Codex config, on the right Claude:

The POM for the backend:

image

Angular package.json:

image

Code Structure: Codex Wins (Mostly)

  • Codex: Used a feature-based structure (e.g., user/, order/). This aligns with professional projects.
  • Claude: Opted for "Clean Architecture" (package by function). Not scalable for large projects.

Claude, on the right, doesn't care about domains. No instruction was given (no need to complain).

image

Configuration: Claude Did Better

  • Format: Both used YAML (I’d have preferred .properties, but it wasn’t specified).
  • Claude: Created three application.yml files (dev, prod, generic).
  • Codex: Dumped everything into one file.
  • Frontend: Claude used a proxy; Codex relied on environment variables. Different approaches, same result.

image

Code Quality: A Mixed Bag

Backend

  • Claude: Used @Autowired (unacceptable even for juniors).
  • Both: Used @Value (discouraged by the Spring team).
  • Both: Avoided exposing entities (used DTOs/records). Good practice, surprisingly.

Frontend

  • Codex: Used Angular signals (modern).
  • Claude: Stuck to older patterns.

Claude had more difficulties with standard Angular and D3js, it had a lot of pain with its own code:

  • it cannot build it

image

  • it understood that it did a mess

image

Security: Basic but Flawed

  • Both: Implemented Spring Security correctly (basic auth, roles).
  • Codex: Allowed all REST methods (*).
  • Claude: Restricted to GET, POST, PUT, DELETE, OPTIONS.
    • Risk: If you use PATCH, Claude’s setup might break.

Tests: Minimalist (and Useless)

  • Both: Created basic, similar tests.
  • Codex: Left an empty test package.
  • Verdict: Not production-ready.

Database: Hardcoded Passwords? Really?

  • Claude: Initialized records directly in the code, including hardcoded passwords.
    • Excuse: "Password encoding required it."
    • Reality: Security 101 violation.

image

UI: Subjective, but Claude Wins

  • Claude: More polished, professional.
  • Codex: Felt like a basic prototype.
  • Note: This is subjective, but Claude’s UI was cleaner in past projects too.

Here Claude, professional:

image

image

image

Here Codex, from the beach on holiday:

image

image

Verdict: LLMs Are Assistants, Not Replacements

The Good

  • Prototyping: LLMs speed up initial setup and boilerplate.
  • Bug Fixing: They’re great for debugging and CSS tweaks.
  • Cost: $20/month is worth it for rapid iteration.

The Bad

  • Versioning: They default to outdated dependencies unless explicitly told otherwise.
  • Quality: Working ≠ production-ready. Senior devs must review every line.
  • Security: Hardcoded credentials and lax REST methods are red flags.

The Ugly

  • Left unsupervised, they produce "working garbage."
  • Codex’s structure was better for large projects.
  • Claude’s UI was more professional.

Lessons Learned

  1. Be Explicit:

    • Specify versions, dependencies, and patterns upfront.
    • Example: "Use Java 25, Spring Boot 4.0.3, and Angular 21."
  2. Review Everything:

    • LLMs hallucinate dependencies and ignore best practices.
    • Critical environments (finance, health): LLM-generated code must be manually verified—no exceptions. The cost of a bug isn’t just debugging time; it’s compliance violations, audits, or worse.
  3. Frontend vs. Backend Quality:

    • Frontend code tends to be higher quality than backend. Likely reason: modern frontend frameworks (Angular, React, Vue) have consistent, opinionated patterns, while backend Java code online is a graveyard of legacy architectures, deprecated practices, and "enterprise" anti-patterns from the last 20 years.
    • Result: LLMs regurgitate old Java habits (@Autowired, XML configs, monolithic structures) unless explicitly forbidden.
  4. Build Your Team:

    • Claude: Frontend developer. Better at UI/UX, modern Angular, and standalone components.
    • Codex: Backend developer. Better at structuring features, Spring Boot, and avoiding spaghetti code.
    • Next in line: Mistral and Alibaba models—benchmarks coming soon.
  5. Skills and Dedicated Agents Improve Quality (At a Cost):

    • Custom skills (e.g., "Use DTOs, avoid @Autowired") reduce garbage output.
    • Dedicated agents (e.g., one for backend, one for frontend) improve focus but add complexity.
    • Trade-off: Better code, but now you’re managing an LLM orchestra instead of writing it yourself.
  6. Price vs. Value:

    • $20/month is justified if it saves hours of boilerplate. In regulated industries, factor in review overhead—the "savings" might evaporate.
    • At the moment Codex offers more tokens that Claude, I tend to use Codex for more tasks, and I limit Claude to complicated UI interfaces. With Claude (even with the 'cheap' Sonnet, we hit the limits quickly).

Now, back to writing actual code.

--

Specification used

Specification created by Mistral AI
Here’s the **final, actionable specification** for your AI to build the application, incorporating all your requirements: 
 
 
## **Final AI Specification: Spring Boot + Angular App with D3.js and Ollama Chatbot** 
 
### **1. Overview** 
- **Purpose**: Professional web app with role-based access, an **interactive D3.js dashboard**, and a **chatbot** (Ollama for local LLM, OpenAI for cloud). 
- **Tech Stack**: 
   - **Backend**: Spring Boot (latest), Java 25, Spring Data JPA, Hibernate, REST API, **Spring AI 2.0 (SNAPSHOT)**. 
   - **Frontend**: Angular (latest), Bootstrap, **D3.js** (interactive chart). 
   - **Database**: H2 (dev), PostgreSQL (prod), **Liquibase** (for migrations and test data). 
   - **Security**: JWT (Spring Security defaults). 
   - **Deployment**: Docker (Angular + Spring Boot; PostgreSQL separate). 
 
 
### **2. Frontend (Angular)** 
#### **Dashboard** 
- **Interactive D3.js Chart**: 
   - Type: Bar/line chart with **tooltips**, **zoom**, and **click events**. 
   - Data: Fetched from /api/dashboard/chart. 
   - Example data structure: 
 
json 
[ 
{ "month": "Jan", "value": 100, "details": "Sample data" }, 
{ "month": "Feb", "value": 150, "details": "Sample data" } 
] 
- **Responsive**: Bootstrap 5 (mobile-friendly). 
 
#### **Chatbot UI** 
- **Features**: 
   - Text input/output. 
   - Toggle to switch between **Ollama (local)** and **OpenAI (cloud)**. 
   - **Conversation history**: Display past messages (fetched from /api/chat/history). 
- **Styling**: Bootstrap modal or sidebar. 
 
 
### **3. Backend (Spring Boot)** 
#### **Dependencies** 
- Use **Spring Initializr** with: 
   - Spring Web, Security, Data JPA, H2, PostgreSQL, Liquibase. 
   - **Spring AI 2.0 SNAPSHOT** (for Ollama/OpenAI integration): 
 
xml 
<dependency> 
<groupId>org.springframework.ai</groupId> 
<artifactId>spring-ai-core</artifactId> 
<version>2.0.0-SNAPSHOT</version> 
</dependency> 
<dependency> 
<groupId>org.springframework.ai</groupId> 
<artifactId>spring-ai-ollama-spring-boot-starter</artifactId> 
<version>2.0.0-SNAPSHOT</version> 
</dependency> 
<dependency> 
<groupId>org.springframework.ai</groupId> 
<artifactId>spring-ai-openai-spring-boot-starter</artifactId> 
<version>2.0.0-SNAPSHOT</version> 
</dependency> 
#### **API Endpoints** 
| Endpoint               | Method | Description                                                                 | 
|------------------------|--------|-----------------------------------------------------------------------------| 
| /api/dashboard/chart | GET    | Returns JSON data for D3.js chart.                                         | 
| /api/chat            | POST   | Accepts { "prompt": "user input", "useCloud": boolean }. Returns LLM response. | 
| /api/chat/history    | GET    | Returns conversation history for the current user.                        | 
 
#### **LLM Configuration** 
- **Ollama (Local)**: 
   - Default URL: http://localhost:11434. 
   - Model: llama3 (or user-specified). 
- **OpenAI (Cloud)**: 
   - API key required (spring.ai.openai.api-key). 
- **Fallback**: If no LLM is configured, return: 
 
json 
{ "response": "LLM not configured. Please check your settings." } 
#### **Database (Liquibase)** 
- **Changelog Example**: 
 
xml 
<changeSet id="1" author="ai"> 
<createTable tableName="chat_history"> 
<column name="id" type="bigint" autoIncrement="true" primaryKey="true"/> 
<column name="user_id" type="bigint"/> 
<column name="prompt" type="varchar(500)"/> 
<column name="response" type="varchar(2000)"/> 
<column name="timestamp" type="timestamp"/> 
<column name="is_cloud" type="boolean"/> 
</createTable> 
</changeSet> 
 
### **4. Sample Tests** 
#### **Liquibase Test Data** 
- Preload H2 with: 
   - 3 users (admin/user/guest). 
   - Sample dashboard data. 
   - Sample chat history. 
 
#### **Spring Boot Tests** 
- **Repository Layer**: @DataJpaTest for ChatHistoryRepository. 
- **Controller Layer**: @WebMvcTest for ChatController (mock LLM responses). 
- **Example Test**: 
 
java 
@SpringBootTest 
class ChatControllerTest { 
@Test 
void testChatEndpoint() { 
// Mock Ollama/OpenAI response 
// Test POST /api/chat 
} 
} 
 
### **5. Docker Deployment** 
#### **App Container** 
- **Multi-stage Dockerfile**: 
 
dockerfile 
# Stage 1: Build Angular 
FROM node:latest AS angular-build 
WORKDIR /app 
COPY frontend/package*.json ./ 
RUN npm install 
COPY frontend . 
RUN npm run build --prod 
 
# Stage 2: Build Spring Boot 
FROM openjdk:25-jdk-slim 
WORKDIR /app 
COPY backend/build/libs/*.jar app.jar 
COPY --from=angular-build /app/dist/frontend ./static 
EXPOSE 8080 
ENTRYPOINT ["java", "-jar", "app.jar"] 
- **Environment Variables**: 
   - SPRING_PROFILES_ACTIVE=prod 
   - OPENAI_API_KEY (optional, for cloud LLM) 
   - OLLAMA_BASE_URL=http://host.docker.internal:11434 (for local LLM) 
 
#### **PostgreSQL** 
- Separate container (not managed by Docker Compose in prod). 
 
 
### **6. AI Instructions** 
1. **Generate Project**: 
   - Use Spring Initializr and Angular CLI. 
   - Add D3.js (npm install d3). 
2. **Implement**: 
   - Interactive D3.js chart with tooltips/zoom. 
   - Chatbot with Ollama/OpenAI toggle and history. 
   - Liquibase migrations for all tables. 
3. **Dockerize**: 
   - Build Angular → Copy to Spring Boot static. 
   - Configure LLM via environment variables. 
 
### **7. Final Notes** 
- **Ollama**: Ensure it’s running locally for dev (docker run -p 11434:11434 ollama/ollama). 
- **OpenAI**: Requires API key for cloud mode. 
- **Conversation History**: Saved to PostgreSQL/H2.