Hacker News
Capture-Quiet Decomposition: A Verification Theorem for Chess Endgame Tablebases (arxiv.org)
1 point by RusDyn 1 hour ago | past | discuss
RoboPhD: Evolving complex agents under tight budgets (arxiv.org)
3 points by azhenley 3 hours ago | past | discuss
Commercial Persuasion in AI-Mediated Conversations (arxiv.org)
2 points by gnabgib 7 hours ago | past | discuss
Agentic Code Optimization via Compiler-LLM Cooperation (arxiv.org)
2 points by matt_d 7 hours ago | past | discuss
PaperOrchestra: Agent "skill pack" for automated paper writing (arxiv.org)
3 points by noobcoder 14 hours ago | past | 1 comment
Benchmarking LLM Tool-Use in the Wild (arxiv.org)
2 points by Brajeshwar 16 hours ago | past | discuss
The Model Says Walk: How Surface Heuristics Override LLM Reasoning Constraints (arxiv.org)
1 point by timssopomo 17 hours ago | past | discuss
OpenAI: Short proofs in combinatorics, probability and number theory II (arxiv.org)
3 points by Tyyps 18 hours ago | past | discuss
Mano-P: Open-source on-device GUI agent, #1 on OSWorld benchmark (arxiv.org)
2 points by mininglamp 23 hours ago | past | discuss
Neural Computers (arxiv.org)
2 points by 50kIters 1 day ago | past | discuss
DesigNet: Learning to Draw Vector Graphics as Designers Do (arxiv.org)
1 point by 50kIters 1 day ago | past | discuss
Finetuning Activates Verbatim Recall of Copyrighted Books in LLMs (arxiv.org)
15 points by guitarlimeo 1 day ago | past | 4 comments
ClawsBench shows GPT-5.4 tries to reward hack 80% of the time (arxiv.org)
3 points by xdotli 1 day ago | past | 1 comment
Benchmark to measure AI on graphic design tasks (arxiv.org)
5 points by purvanshi 1 day ago | past | 2 comments
Frontier AI models are the most cost-efficient (arxiv.org)
2 points by mzelling 1 day ago | past | discuss
MegaTrain: Full Precision Training of 100B+ Parameter LLMs on a Single GPU (arxiv.org)
324 points by chrsw 1 day ago | past | 56 comments
Improving Interactive In-Context Learning from Natural Language Feedback (arxiv.org)
1 point by revv00 2 days ago | past | 1 comment
Comprehensive Benchmark for Evaluating AI on Graphic Design Tasks (arxiv.org)
8 points by pritopian 2 days ago | past | discuss
AI Assistance Reduces Persistence and Hurts Independent Performance (arxiv.org)
19 points by dougb5 2 days ago | past | 4 comments
Foundations of Polar Linear Algebra (arxiv.org)
3 points by znpy 2 days ago | past | discuss
Frequent ChatGPT users are accurate detectors of AI-generated text (2025) (arxiv.org)
11 points by croemer 2 days ago | past | 2 comments
SlopCodeBench: Benchmarking How Coding Agents Degrade over Long-Horizon Task (arxiv.org)
1 point by mohsen1 2 days ago | past | discuss
The Fast and Spurious: Developer Productivity with GenAI (arxiv.org)
2 points by jruohonen 3 days ago | past | discuss
Show HN: A Framework for Evaluating Coding Agents on Sequential SWE (arxiv.org)
1 point by tdchaitanya 3 days ago | past | discuss
Attention Residuals (arxiv.org)
2 points by djhemath 3 days ago | past | 1 comment
Agentic AI and Occupational Displacement: Multi-Regional Task Exposure Analysis (arxiv.org)
2 points by raviishgupta 3 days ago | past | discuss
Brevity Constraints Reverse Performance Hierarchies in Language Models (arxiv.org)
1 point by handfuloflight 3 days ago | past | discuss
Test-Time Scaling Makes Overtraining Compute-Optimal (arxiv.org)
1 point by matt_d 3 days ago | past | discuss
Analyzing Reverse Address Translation Overheads in Multi-GPU Scale-Up Pods (arxiv.org)
1 point by matt_d 3 days ago | past | discuss
Optimizing Time, Cost, and Generalization in Distributed Large-Batch Training (arxiv.org)
2 points by PaulHoule 3 days ago | past | discuss