Hacker News

> because it's more likely to literally interpret that sequence and go into the latent space of drama

This always makes me wonder if saying some seemingly random sequence of tokens would make the model better at some other task

petrichor fliegengitter azúcar Einstein mare könyv vantablack добро حلم syncretic まつり nyumba fjäril parrot

I think I'll start every chat with that combo and see if it makes any difference
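For anyone who wants to try this a bit more systematically than eyeballing chats, here is a rough sketch of a paired comparison: run the same prompts with and without the prefix and look at the score differences. Everything here is an assumption for illustration: `score` is a hypothetical callable you would supply yourself (it should send the prompt to your model and return some numeric quality score), and `with_prefix`, `ab_compare`, and `sign_flip_p_value` are made-up helper names, not part of any library.

```python
import random
import statistics
from typing import Callable, List, Sequence, Tuple

# The token combo from the comment above.
PREFIX = (
    "petrichor fliegengitter azúcar Einstein mare könyv vantablack "
    "добро حلم syncretic まつり nyumba fjäril parrot"
)


def with_prefix(prompt: str, prefix: str = PREFIX) -> str:
    """Put the prefix in front of the prompt, separated by a blank line."""
    return f"{prefix}\n\n{prompt}"


def ab_compare(
    prompts: Sequence[str],
    score: Callable[[str], float],  # hypothetical: calls your model, returns a number
) -> Tuple[float, List[float]]:
    """Score each prompt with and without the prefix; return mean and per-prompt differences."""
    baseline = [score(p) for p in prompts]
    treated = [score(with_prefix(p)) for p in prompts]
    diffs = [t - b for t, b in zip(treated, baseline)]
    return statistics.mean(diffs), diffs


def sign_flip_p_value(diffs: Sequence[float], n_resamples: int = 2000, seed: int = 0) -> float:
    """Paired sign-flip permutation test: how often does random sign flipping
    produce a mean difference at least as large as the observed one?"""
    rng = random.Random(seed)
    observed = abs(statistics.mean(diffs))
    hits = 0
    for _ in range(n_resamples):
        flipped = [d * rng.choice((-1, 1)) for d in diffs]
        if abs(statistics.mean(flipped)) >= observed:
            hits += 1
    return hits / n_resamples
```

Usage would be something like `mean_diff, diffs = ab_compare(my_prompts, my_score)` followed by `sign_flip_p_value(diffs)`. Model outputs are noisy, so a handful of chats will not tell you much; the paired setup at least keeps the prompts identical between the two arms.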



There’s actually research being done in this space that you might find interesting: “attention sinks” https://arxiv.org/abs/2503.08908


No Free Lunch theorem applies here!



