CasaLingua
Multilingual AI assistant that simplifies housing applications. Capstone project for Columbia's Justice Through Code program.
What it is
A multilingual AI voice and text assistant for simplifying housing applications. Users submit input as audio, image, or text in any language; CasaLingua routes it to the appropriate processor, extracts the information needed to populate a housing form, and returns a simplified, readable output with an audit trail.
Capstone project for Columbia University’s Justice Through Code program. Python backend, modular pipeline, self-hosted.
Why housing
Housing applications are a canonical case for treating language accessibility as a safety-critical concern. A form filled out wrong, or skipped because the applicant can’t read it, can cost someone a roof. The failure mode isn’t “user has a bad experience”; it’s “user doesn’t get housed.” The posture required is closer to a clinical pipeline than to a translation convenience tool.
Architecture
The system is organized around a pipeline of narrow, replaceable modules:
- Input funnel — detects file type (.txt, .pdf, image, audio) and routes to OCR, ASR, or direct text handling.
- Text pipeline — orchestrates named-entity recognition, translation, and governance checks over the routed input.
- Translation — translates to or from English, with simplification of legal and bureaucratic phrasing.
- NER — extracts identity and application fields (name, address, DOB, etc.) to auto-populate housing forms.
- Governance layer — applies ethical audits over every output: bias checks, fidelity checks, PII compliance. Returns a confidence score and alignment check on every response.
- Admin panel — operator surface for reviewing audits and output quality.
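The routing step of the input funnel can be sketched as a lookup from file extension to processor. This is a minimal illustration, not the project's actual code: the `ROUTES` table, the `route_input` name, the specific image and audio extensions, and the choice to send every PDF to OCR are all assumptions made for the example.

```python
from pathlib import Path

# Hypothetical extension-to-processor map. The real routing table is not
# shown in this README; sending all PDFs to OCR is an assumption.
ROUTES = {
    ".txt": "text",
    ".pdf": "ocr",
    ".png": "ocr",
    ".jpg": "ocr",
    ".jpeg": "ocr",
    ".wav": "asr",
    ".mp3": "asr",
}


def route_input(filename: str) -> str:
    """Return the processor name for a file, based only on its extension."""
    suffix = Path(filename).suffix.lower()
    try:
        return ROUTES[suffix]
    except KeyError:
        # Refuse unknown types rather than guessing a processor.
        raise ValueError(f"unsupported file type: {suffix or '(none)'}") from None
```

Rejecting unknown types outright, instead of falling back to plain text, fits the safety posture described above: a misrouted document would otherwise produce a confident but wrong extraction.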
The governance layer is the part worth calling out. It is not a post-hoc content filter; it is a first-class pipeline stage whose output is stored alongside the user-facing response. Every translation and every field extraction carries an audit record. In a domain where a wrong answer costs someone housing, “the system said so” is not good enough — the system has to show its work.
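One way to picture a governance stage whose output travels with the response is sketched below. The names `AuditRecord`, `StageResult`, and `governance_stage` are hypothetical, and the three checks are deliberately toy stand-ins (an SSN-shaped regex, a "numbers survive" fidelity proxy); the actual bias, fidelity, and PII checks are not described in this README.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditRecord:
    checks: dict        # check name -> passed (bool)
    confidence: float   # toy score: fraction of checks that passed
    created_at: str     # ISO 8601 timestamp in UTC


@dataclass(frozen=True)
class StageResult:
    output: str
    audit: AuditRecord  # stored alongside the user-facing output


# Toy PII pattern: US SSN-shaped strings only. A real check would be broader.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def governance_stage(source: str, output: str) -> StageResult:
    """Run the toy checks and attach the resulting audit to the output."""
    source_numbers = set(re.findall(r"\d+", source))
    output_numbers = set(re.findall(r"\d+", output))
    checks = {
        "non_empty": bool(output.strip()),
        "no_ssn_leak": SSN_PATTERN.search(output) is None,
        # Crude fidelity proxy: every number in the source appears in the output.
        "numbers_preserved": source_numbers <= output_numbers,
    }
    confidence = sum(checks.values()) / len(checks)
    created_at = datetime.now(timezone.utc).isoformat()
    return StageResult(output, AuditRecord(checks, confidence, created_at))
```

Because the stage returns a `StageResult` rather than a bare string, anything downstream that stores or displays the output also has the audit in hand.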
The lesson
The architectural decision that mattered was making governance a pipeline stage, not a middleware layer. Middleware filters can be disabled or bypassed; pipeline stages produce artifacts that are part of the response record. If you can’t point to the audit trail, you can’t claim the output is safe — and in services adjacent to housing, legal, and healthcare, that’s the whole game.
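The contrast can be made concrete with a small sketch. Everything here is illustrative: `Response`, `middleware_style`, and `pipeline_style` are hypothetical names, and the audit payload is a placeholder dictionary rather than the project's real record format.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Response:
    text: str
    audit: dict  # hypothetical audit payload; required, with no default

    def __post_init__(self):
        # An empty or missing audit makes the response invalid by construction.
        if not self.audit:
            raise ValueError("response cannot be built without an audit record")


def middleware_style(text: str, audit_filter: Optional[Callable[[str], str]] = None) -> str:
    # With the filter switched off, the output is indistinguishable from a checked one.
    return audit_filter(text) if audit_filter else text


def pipeline_style(text: str) -> Response:
    # The stage always produces an artifact that becomes part of the response.
    audit = {"stage": "governance", "checked_chars": len(text)}
    return Response(text=text, audit=audit)
```

In the middleware version, skipping the check leaves no trace. In the pipeline version, a response without an audit cannot be constructed at all.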