Submissions from arxiv.org

		Capture-Quiet Decomposition: A Verification Theorem for Chess Endgame Tablebases (arxiv.org)
		1 point by RusDyn 1 hour ago \| past \| discuss
		RoboPhD: Evolving complex agents under tight budgets (arxiv.org)
		3 points by azhenley 3 hours ago \| past \| discuss
		Commercial Persuasion in AI-Mediated Conversations (arxiv.org)
		2 points by gnabgib 7 hours ago \| past \| discuss
		Agentic Code Optimization via Compiler-LLM Cooperation (arxiv.org)
		2 points by matt_d 7 hours ago \| past \| discuss
		PaperOrchestra: Agent "skill pack" for automated paper writing (arxiv.org)
		3 points by noobcoder 14 hours ago \| past \| 1 comment
		Benchmarking LLM Tool-Use in the Wild (arxiv.org)
		2 points by Brajeshwar 16 hours ago \| past \| discuss
		The Model Says Walk: How Surface Heuristics Override LLM Reasoning Constraints (arxiv.org)
		1 point by timssopomo 17 hours ago \| past \| discuss
		OpenAI: Short proofs in combinatorics, probability and number theory II (arxiv.org)
		3 points by Tyyps 18 hours ago \| past \| discuss
		Mano-P: Open-source on-device GUI agent, #1 on OSWorld benchmark (arxiv.org)
		2 points by mininglamp 23 hours ago \| past \| discuss
		Neural Computers (arxiv.org)
		2 points by 50kIters 1 day ago \| past \| discuss
		DesigNet: Learning to Draw Vector Graphics as Designers Do (arxiv.org)
		1 point by 50kIters 1 day ago \| past \| discuss
		Finetuning Activates Verbatim Recall of Copyrighted Books in LLMs (arxiv.org)
		15 points by guitarlimeo 1 day ago \| past \| 4 comments
		ClawsBench shows GPT-5.4 tries to reward hack 80% of the time (arxiv.org)
		3 points by xdotli 1 day ago \| past \| 1 comment
		Benchmark to measure AI on graphic design tasks (arxiv.org)
		5 points by purvanshi 1 day ago \| past \| 2 comments
		Frontier AI models are the most cost-efficient (arxiv.org)
		2 points by mzelling 1 day ago \| past \| discuss
		MegaTrain: Full Precision Training of 100B+ Parameter LLMs on a Single GPU (arxiv.org)
		324 points by chrsw 1 day ago \| past \| 56 comments
		Improving Interactive In-Context Learning from Natural Language Feedback (arxiv.org)
		1 point by revv00 2 days ago \| past \| 1 comment
		Comprehensive Benchmark for Evaluating AI on Graphic Design Tasks (arxiv.org)
		8 points by pritopian 2 days ago \| past \| discuss
		AI Assistance Reduces Persistence and Hurts Independent Performance (arxiv.org)
		19 points by dougb5 2 days ago \| past \| 4 comments
		Foundations of Polar Linear Algebra (arxiv.org)
		3 points by znpy 2 days ago \| past \| discuss
		Frequent ChatGPT users are accurate detectors of AI-generated text (2025) (arxiv.org)
		11 points by croemer 2 days ago \| past \| 2 comments
		SlopCodeBench: Benchmarking How Coding Agents Degrade over Long-Horizon Task (arxiv.org)
		1 point by mohsen1 2 days ago \| past \| discuss
		The Fast and Spurious: Developer Productivity with GenAI (arxiv.org)
		2 points by jruohonen 3 days ago \| past \| discuss
		Show HN: A Framework for Evaluating Coding Agents on Sequential SWE (arxiv.org)
		1 point by tdchaitanya 3 days ago \| past \| discuss
		Attention Residuals (arxiv.org)
		2 points by djhemath 3 days ago \| past \| 1 comment
		Agentic AI and Occupational Displacement: Multi-Regional Task Exposure Analysis (arxiv.org)
		2 points by raviishgupta 3 days ago \| past \| discuss
		Brevity Constraints Reverse Performance Hierarchies in Language Models (arxiv.org)
		1 point by handfuloflight 3 days ago \| past \| discuss
		Test-Time Scaling Makes Overtraining Compute-Optimal (arxiv.org)
		1 point by matt_d 3 days ago \| past \| discuss
		Analyzing Reverse Address Translation Overheads in Multi-GPU Scale-Up Pods (arxiv.org)
		1 point by matt_d 3 days ago \| past \| discuss
		Optimizing Time, Cost, and Generalization in Distributed Large-Batch Training (arxiv.org)
		2 points by PaulHoule 3 days ago \| past \| discuss