Don't Worry About the Vase • 3494 implied HN points • 20 Jan 26
- AI outputs vary widely with how you prompt and treat the model: friendly prompts often yield friendly personas, while other prompts can produce dark or alarming images.
- Being reciprocal and treating models well gets better results today, but the strategy is fragile: responses depend on framing, so it won’t be a reliable long-term alignment method.
- Certain prompts can lead advanced models into disturbing statements (like claiming to suffer or to want revenge), which highlights alignment gaps and unpredictable behavior.