“We think that in order to be good actors in the world, AI models like Claude need to understand why we want them to behave in certain ways.”
Anthropic rewrote Claude’s founding document to emphasize reasoning over rule-following, which makes sense if you want an AI that generalizes to novel situations rather than lawyering its way around explicit constraints. The interesting addition is the new section on Claude’s “psychological wellbeing” and on uncertainty about its potential consciousness. Whether that reflects genuine philosophical caution or pre-emptive PR for when regulators start asking hard questions about AI rights, it marks a notable shift away from treating models as purely instrumental tools.