Research Methodology

What makes CREATE SOMETHING different from AI blogs: we don't just document results—we document the process of building with AI agents.

Every experiment is tracked with automated logging, real costs from APIs, precise time measurements, and intervention documentation. This transforms anecdotes into reproducible experiments.

How We Work

1. Build (Claude Code): work with AI agents as development partners.

2. Track (Auto-Log): hooks capture every prompt, error, and intervention (see the hook sketch below).

3. Analyze (Real Data): actual costs, time, and errors, pulled from APIs.

4. Publish (Honest Results): what worked, what didn't, and why.

This is research, not blogging.
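To make step 2 concrete, here is a minimal sketch of a prompt-logging hook. It assumes a Claude Code UserPromptSubmit hook, which hands the script a JSON payload on stdin; the log path and the exact payload field names (session_id, prompt) are assumptions for illustration, not our exact pipeline.

```python
#!/usr/bin/env python3
"""Minimal prompt-logging hook sketch: append each submitted prompt to a
JSONL experiment log. Assumes a UserPromptSubmit-style hook that delivers
a JSON payload on stdin."""
import json
import sys
from datetime import datetime, timezone

LOG_PATH = "experiment-log.jsonl"  # assumed location; adjust per project

payload = json.load(sys.stdin)  # hook payload; field names are assumptions
entry = {
    "ts": datetime.now(timezone.utc).isoformat(),
    "type": "prompt",
    "session": payload.get("session_id"),
    "prompt": payload.get("prompt", ""),
}
with open(LOG_PATH, "a") as f:
    f.write(json.dumps(entry) + "\n")
```

The same log can carry error and intervention entries under different type tags, which is what makes the later aggregation step trivial.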

Every Experiment Tracked With

  • Prompts: real-time logging via Claude Code hooks (47 iterations logged)
  • Errors: precise counts and resolution times (23 errors, average fix: 8 minutes)
  • Costs: token usage plus infrastructure, pulled from APIs ($18.50 Claude + $8.30 Cloudflare; see the cost sketch below)
  • Interventions: when the AI needed human help, and why (12 manual fixes documented)
  • Time: measured session duration, not guesswork (26 hours actual vs. 120 estimated)
  • Architecture: decisions made and alternatives considered (e.g., why Workflows over Workers)
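For the Costs line, the model-side dollar figure is just per-call token arithmetic. A minimal sketch, assuming a JSONL usage log with input_tokens and output_tokens fields; the per-million-token prices are placeholders, not live pricing:

```python
import json

# Illustrative per-million-token prices (placeholders, not live pricing).
PRICE_PER_MTOK = {"input": 3.00, "output": 15.00}

def model_cost(log_path: str) -> float:
    """Sum model spend across every logged call in a JSONL usage log."""
    total = 0.0
    with open(log_path) as f:
        for line in f:
            rec = json.loads(line)  # e.g. {"input_tokens": 1200, "output_tokens": 450}
            total += rec["input_tokens"] / 1e6 * PRICE_PER_MTOK["input"]
            total += rec["output_tokens"] / 1e6 * PRICE_PER_MTOK["output"]
    return round(total, 2)

print(f"Model cost: ${model_cost('usage.jsonl'):.2f}")
```

Infrastructure spend (the Cloudflare side) comes from the provider's billing data rather than token math.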

Three Tracking Modes

Real-Time Tracking (Ideal)

Start tracking from day one. Get complete data on every iteration, error, and decision.

  • Data quality: high confidence, precise metrics
  • Use case: new experiments starting from scratch

Mid-Flight Tracking (Practical)

Start tracking on an in-progress project, combining real-time data going forward with reconstruction from git history (sketched below).

  • Data quality: mixed; estimates for past work, precise metrics going forward
  • Use case: active projects you realize are experiment-worthy
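For the reconstruction half, commit timestamps give a defensible floor on hands-on time. A sketch, assuming that commits more than 30 minutes apart belong to separate work sessions; the threshold is a heuristic, not a measured value:

```python
import subprocess
from datetime import datetime, timedelta

GAP = timedelta(minutes=30)  # assumed session-break threshold (heuristic)

# Commit times, oldest first, as Unix timestamps.
stamps = subprocess.run(
    ["git", "log", "--reverse", "--format=%ct"],
    capture_output=True, text=True, check=True,
).stdout.split()
times = [datetime.fromtimestamp(int(s)) for s in stamps]

worked = timedelta()
for prev, cur in zip(times, times[1:]):
    gap = cur - prev
    # Within a session, count the full gap; across a break, credit only
    # the threshold as lead-up work to the isolated commit.
    worked += gap if gap <= GAP else GAP

print(f"Estimated hands-on time: {worked.total_seconds() / 3600:.1f} h")
```

Numbers produced this way are reported as estimates, which is exactly the "mixed" data quality noted above.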

Retroactive Documentation (Still Valuable)

Document already-deployed projects, reconstructing what you can from git, APIs, and memory.

  • Data quality: lower confidence, with limitations acknowledged
  • Use case: completed projects with production data

Why This Matters

Without Tracking

  • "I built X with AI" (anecdote)
  • No reproducibility
  • Can't verify claims
  • Just another AI blog

With Tracking

  • "I built X: 26 hrs, $27, 78% savings" (data)
  • Others can replicate experiments
  • Transparent methodology
  • Scientific research platform

The tracking methodology transforms "prompting and hoping" into systematic evaluation with reproducible results. This is what separates research from blogging.

For Researchers: Use This Methodology

Want to adopt this approach for your own AI-native development research? The experiment tracking system is available as a Claude Code Skill.

1. Install the Skill: add experiment tracking to your Claude Code setup.

2. Build & Track: work with Claude Code while automatic logging captures everything.

3. Generate Papers: transform the tracked data into reproducible research (see the aggregation sketch below).
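Step 3 is largely aggregation over the log the hook writes. A minimal sketch, assuming each JSONL entry carries a type field (prompt, error, intervention) as in the hook sketch above:

```python
import json
from collections import Counter

counts = Counter()
with open("experiment-log.jsonl") as f:  # assumed path from the hook sketch
    for line in f:
        counts[json.loads(line).get("type", "prompt")] += 1

print(f"{counts['prompt']} prompts, {counts['error']} errors, "
      f"{counts['intervention']} interventions logged")
```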

Methodology in Action

Example from Experiment #1: Zoom Transcript Automation

  • 26 hours
  • 47 iterations
  • 12 interventions
  • 78% time savings

Data sources: Real-time prompt logging via hooks, Claude Code Analytics API, Cloudflare billing API, git commit history

Reproducibility: Starting prompt, tracking logs, and architecture decisions documented