evals.

2 posts tagged evals.

14 May 2026

Where eval variance actually lives.

Why we replicate the judge three times per case, what we found when we measured it, and how that reshapes what a meaningful prompt-edit delta looks like.

14 May 2026

Building a strict benchmark for AI in patent law.

A guest post from the tu-po team on grounding an AI legal agent in EPO Board of Appeal decisions — and what 'correct' even means when the experts disagree.

