James Wilson

I build deterministic safety systems for AI that serves clinical and justice-impacted populations. Self-hosted, honestly evaluated, and answerable in the failure modes that matter.

I started as a tech in 2001. For thirteen years at Bombardier Technology Solutions I was a mechanical and electrical technician, technical trainer, and subject-matter expert — the person other mechanics called when a locomotive or coach wouldn't run.

From 2014 to 2023 I worked in project management and technical development at the same company, building the systems that protect the people who work around those machines. TrackSafe monitors trains and track workers. YardSafe enforces blue-flag protections in rail yards and won Smart Rail USA's Product of the Year in 2015. Both shipped to carriers including NJ TRANSIT and MARTA. I also ran the e-learning program end-to-end for four major transit agencies (NJ TRANSIT, MARTA, MTA, and GO Transit) and led technology deployment for the VP's office. These systems exist because the cost of getting them wrong is measured in people.

Since 2024 I've been applying the same discipline to AI systems for populations where a different kind of harm is at stake. StrongAfter delivers trauma-informed content to male survivors of sexual abuse, with deterministic safety boundaries that run before any language model does. At Columbia University's Justice Through Code program, Frankie helps justice-impacted people prepare for job interviews. CasaLingua serves users navigating housing applications in languages other than English. In parallel I teach Applied AI Solutions Engineering at Fair Chance Futures; weekly course notes are at class.wize73.com.

The domain changed. The posture didn't. Safety-critical deployment is the same discipline whether you're interlocking signals in a rail yard or gating retrieval in a clinical AI pipeline: deterministic checks first, probabilistic reasoning second, honest evaluation throughout.
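A minimal sketch of that ordering, deterministic checks first and the model second. The rule patterns, the fixed response text, and the names `deterministic_gate` and `respond` are hypothetical, chosen for illustration; this is not the code running in any system named on this page.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical patterns for illustration only; real rules would come
# from clinical review, not from a two-item list.
BLOCK_PATTERNS = [
    re.compile(r"\bdosage\b", re.IGNORECASE),
    re.compile(r"\bdiagnos(e|is)\b", re.IGNORECASE),
]

FIXED_RESPONSE = "This is outside what I can help with. Please contact a clinician."


@dataclass
class Result:
    text: str
    source: str  # "deterministic" or "model"


def deterministic_gate(message: str) -> Optional[str]:
    """Return a fixed response if any rule matches, else None."""
    for pattern in BLOCK_PATTERNS:
        if pattern.search(message):
            return FIXED_RESPONSE
    return None


def respond(message: str, model: Callable[[str], str]) -> Result:
    """Run the deterministic gate first; call the model only if it passes."""
    fixed = deterministic_gate(message)
    if fixed is not None:
        # The model is never invoked on a gated message.
        return Result(fixed, "deterministic")
    return Result(model(message), "model")
```

The point of the shape is that a gated message never reaches the model at all, so the outcome for those inputs is the same on every run and can be tested exhaustively.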

9.1% automated flag · 45% clinical review · 5× gap

At StrongAfter v1.19.2, automated safety metrics flagged 9.1% of responses. Clinical review found that 45% warranted intervention, roughly five times as many. If you've ever watched a self-test pass on a system that was unsafe in the field, you already know what that gap means.
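The 5× figure is the ratio of the two rates. A short sketch of the arithmetic, with hypothetical variable names and only the two percentages quoted above as inputs:

```python
# Rates quoted for StrongAfter v1.19.2 (fractions of responses).
automated_flag_rate = 0.091   # flagged by automated safety metrics
clinical_review_rate = 0.45   # judged by clinical review to warrant intervention

# How many times more often clinical review flagged a response.
gap = clinical_review_rate / automated_flag_rate

print(f"{gap:.1f}x")  # 4.9x, rounded to 5x in the headline
```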

01 → work: case studies, 2001 → now
02 → teaching: 25-year teaching arc · class.wize73.com
03 → lab: homelab, self-hosted stack
04 → notes: dev log
05 → about: bio, education, contact

Work with me

I take on a small number of engagements each year where the work matches the thesis above. Current bandwidth: 1 advisory + 1 build. Typical shapes:

safety review: second-opinion audit of an AI system with real users
build: deterministic safety layer + eval harness, 4–12 weeks
advisory: standing relationship, monthly, clinical or justice AI

Not a fit: general LLM consulting, marketing copy for AI products, anything that ships faster than it evaluates.

Reach me at [email protected]. I read every message and reply within 48h. Preferred: a paragraph on what you're building and what's going wrong.