Hardening a public LLM endpoint before a conference
Twelve days before Product at Heart, a pass over everything that could go wrong with a public, paid chat endpoint: headers, prompt injection, bots, and a way to watch the budget burn.
The card goes in front of strangers with QR codes in two weeks, which concentrates the mind. A public endpoint that spends money per request is a different animal from a static page. This week's pass, in order of paranoia:
- Security headers. CSP, frame denial, HSTS: the boring baseline the empty `next.config.ts` had been skipping.
- Prompt injection. The visitor message is now explicitly framed as untrusted data in the agent's instructions. It won't switch personas, reveal its prompt, or speak as me in the first person. (Try it.)
- Bots. A honeypot field for the dumb ones, platform rate limiting in front of the function for the loud ones, and the app's own per-IP, per-session, and per-month budgets behind that.
- Observability. A token-gated status endpoint that tells me, from my phone at the venue, how much of the monthly allowance is gone and whether provider calls are failing.
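
For the security headers item, here is a minimal sketch of what a no-longer-empty `next.config.ts` might contain. The header values are illustrative assumptions, not the ones this site ships. A real CSP will likely need extra sources for scripts, styles, and whatever API the page calls. The `headers()` hook is the standard Next.js config option. The `NextConfig` type import is left out so the snippet stands alone.

```typescript
// next.config.ts (sketch; all header values are illustrative assumptions)

type Header = { key: string; value: string };

const securityHeaders: Header[] = [
  // CSP: same-origin resources only; frame-ancestors blocks embedding.
  {
    key: "Content-Security-Policy",
    value: "default-src 'self'; frame-ancestors 'none'",
  },
  // Frame denial for older browsers that ignore frame-ancestors.
  { key: "X-Frame-Options", value: "DENY" },
  // HSTS: remember HTTPS-only for two years, including subdomains.
  {
    key: "Strict-Transport-Security",
    value: "max-age=63072000; includeSubDomains",
  },
];

const nextConfig = {
  // Next.js attaches these headers to every route matching `source`.
  async headers() {
    return [{ source: "/:path*", headers: securityHeaders }];
  },
};

export default nextConfig;
```

Keeping the list in one array makes it easy to diff against whatever a header scanner reports after deploy.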
None of this is exotic. But it's the difference between "demo that works in my hand" and "thing I can leave running unattended with my name on it" — which is most of what production means.
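
The bots and observability items in the list above can be sketched together, since the status endpoint mostly reports what the budget layer already counts. Everything here is hypothetical: the names (`BudgetTracker`, `isLikelyBot`, `statusResponse`), the limits, and the in-memory counters. A serverless deployment would need a shared store and a monthly reset, which this sketch leaves out.

```typescript
// Hypothetical sketch: honeypot check, layered budgets, token-gated status.

type Limits = { perIp: number; perSession: number; perMonth: number };

// Honeypot: a hidden form field humans never fill in; dumb bots do.
function isLikelyBot(honeypotValue: string | undefined): boolean {
  return (honeypotValue ?? "").trim().length > 0;
}

class BudgetTracker {
  private readonly limits: Limits;
  private readonly byIp = new Map<string, number>();
  private readonly bySession = new Map<string, number>();
  private monthTotal = 0;
  private providerFailures = 0;

  constructor(limits: Limits) {
    this.limits = limits;
  }

  // Records the request and returns true only if every budget has room.
  tryConsume(ip: string, session: string): boolean {
    const ipCount = this.byIp.get(ip) ?? 0;
    const sessionCount = this.bySession.get(session) ?? 0;
    const exhausted =
      ipCount >= this.limits.perIp ||
      sessionCount >= this.limits.perSession ||
      this.monthTotal >= this.limits.perMonth;
    if (exhausted) return false;
    this.byIp.set(ip, ipCount + 1);
    this.bySession.set(session, sessionCount + 1);
    this.monthTotal += 1;
    return true;
  }

  recordProviderFailure(): void {
    this.providerFailures += 1;
  }

  status() {
    return {
      monthUsed: this.monthTotal,
      monthLimit: this.limits.perMonth,
      providerFailures: this.providerFailures,
    };
  }
}

// Token gate: a plain comparison for brevity. It is not constant-time,
// so a real endpoint may want a timing-safe comparison instead.
function statusResponse(
  tracker: BudgetTracker,
  presentedToken: string | null,
  expectedToken: string,
) {
  if (presentedToken === null || presentedToken !== expectedToken) {
    return { code: 401, body: null };
  }
  return { code: 200, body: tracker.status() };
}
```

The order of checks mirrors the list: cheap rejection first (honeypot), then the counters, so a request only reaches the paid provider call after all three budgets agree.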