Hacker News
Automatically Jailbreaking Frontier Language Models with Investigator Agents (transluce.org)
2 points by simonpure 4 months ago | past
Automatically Jailbreaking Frontier Language Models with Investigator Agents (transluce.org)
2 points by piotrgrabowski 4 months ago | past
Investigating truthfulness in a pre-release o3 model (transluce.org)
4 points by boleary-gl 8 months ago | past
Investigating truthfulness in a pre-release o3 model (transluce.org)
4 points by Luc 8 months ago | past
Investigating truthfulness in a pre-release o3 model (transluce.org)
5 points by Philpax 8 months ago | past | 1 comment
Docent: A system for analyzing and intervening on agent behavior (transluce.org)
4 points by brimtown 9 months ago | past
Releasing AI-driven tools for understanding AI systems (transluce.org)
1 point by EvgeniyZh on Oct 24, 2024 | past
Monitor: An AI-Driven Observability Interface (transluce.org)
3 points by brimtown on Oct 23, 2024 | past | 1 comment
Show HN: Debugging LLM Failures Like "9.11 > 9.9" via Interpretability (transluce.org)
8 points by vvvhuang on Oct 23, 2024 | past | 1 comment
Scaling Automatic Neuron Description (Describing Every Neuron in Llama 3) (transluce.org)
8 points by ekzhang on Oct 23, 2024 | past | 1 comment
