Hacker News
ramraj07 | 43 days ago | on: Agent design is still hard
It's a 2-day project at best to create your own bespoke LLM-as-judge e2e eval framework. That's what we did. Works fine. Not great. Still need someone to write the evals, though.
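For a sense of scale, here is a rough sketch of what the core loop of such a framework might look like. This is not the setup described above; every name in it (EvalCase, run_evals, JUDGE_TEMPLATE, fake_agent, fake_judge) is made up for illustration, and the agent and judge are offline stand-ins where a real harness would call the system under test and an actual LLM.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EvalCase:
    name: str
    task: str    # prompt handed to the agent under test
    rubric: str  # what the judge should check for


@dataclass
class EvalResult:
    name: str
    passed: bool
    verdict: str  # raw judge output, kept for debugging


# Hypothetical prompt format; a real judge prompt would need tuning.
JUDGE_TEMPLATE = (
    "Task: {task}\n"
    "Rubric: {rubric}\n"
    "Answer: {answer}\n"
    "Reply with PASS or FAIL on the first line, then one sentence of reasoning."
)


def run_evals(
    cases: List[EvalCase],
    agent: Callable[[str], str],
    judge: Callable[[str], str],
) -> List[EvalResult]:
    """Run each case end to end: agent answers, judge grades the answer."""
    results = []
    for case in cases:
        answer = agent(case.task)
        verdict = judge(
            JUDGE_TEMPLATE.format(task=case.task, rubric=case.rubric, answer=answer)
        )
        # The first line of the verdict decides pass/fail.
        passed = verdict.strip().upper().startswith("PASS")
        results.append(EvalResult(case.name, passed, verdict))
    return results


# Stand-ins so the sketch runs offline (assumptions, not real model calls).
def fake_agent(task: str) -> str:
    return "4" if "2 + 2" in task else "I don't know"


def fake_judge(prompt: str) -> str:
    # Pull the answer line back out of the prompt and grade it trivially.
    answer = prompt.split("Answer: ", 1)[1].split("\n", 1)[0]
    return "PASS\nMatches the rubric." if answer == "4" else "FAIL\nDoes not match."


# The part that still takes a human: writing the cases and rubrics.
CASES = [
    EvalCase("arithmetic", "What is 2 + 2?", "Must state 4."),
    EvalCase("geography", "What is the capital of France?", "Must state Paris."),
]

if __name__ == "__main__":
    for r in run_evals(CASES, fake_agent, fake_judge):
        print(r.name, "PASS" if r.passed else "FAIL")
```

The loop itself is small; the CASES list is where the ongoing work goes, which matches the point that someone still has to write the evals.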