The skills that matter most to me are the ones I create myself (with the skill creator skill) that are very specific and proprietary. For instance, a skill on how to write a service in my back-testing framework.
I do also like to make skills for more niche tools, like marimo (a very nice Jupyter replacement). The model probably does know some stuff about it, but not enough, and the agent could find enough online or in context7, but it would waste a lot of time and context figuring it out every time. So instead I have a deep-thinking agent do all that research up front and build a skill from it. I might customize it to be more specific to my environment, but it's mostly the agent's condensed research, so I don't need to redo that every time.
That's where the humanizers come in. These are solutions that take LLM-generated text and make it sound human-written to avoid detection.
The principle of training them is quite simple. Take an LLM and reward it for revising text so that it doesn't get detected. Reinforcement learning takes care of the rest for you.
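A minimal sketch of what that reward looks like, under my own assumptions: `detector_score` stands in for whatever AI-text detector is being trained against, and the length penalty is just one guess at how you'd stop the policy from gaming the reward. In a real setup this reward would feed a PPO/GRPO-style trainer rather than the toy ranking below.

```python
# Hypothetical sketch: the reward signal an RL trainer would maximize when
# fine-tuning a "humanizer". detector_score is a stand-in for a real detector.

def detector_score(text: str) -> float:
    """Stub detector: returns P(text is AI-generated) in [0, 1].
    A real setup would call the actual classifier being evaded."""
    return 0.9 if "delve" in text.lower() else 0.3  # toy heuristic only

def reward(original: str, revision: str) -> float:
    """Reward revisions the detector no longer flags.
    The length penalty keeps the policy from 'winning' by deleting the text."""
    evasion = 1.0 - detector_score(revision)
    length_penalty = min(len(revision) / max(len(original), 1), 1.0)
    return evasion * length_penalty

if __name__ == "__main__":
    draft = "Let us delve into the topic."
    candidates = [
        "Let us delve into the topic.",       # unchanged, still flagged
        "Here's a quick look at the topic.",  # reworded, slips past the stub
    ]
    # With a real RL loop these rewards would drive gradient updates;
    # here we just rank candidate revisions by reward.
    for c in sorted(candidates, key=lambda c: reward(draft, c), reverse=True):
        print(f"{reward(draft, c):.2f}  {c}")
```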
It makes it look like the presentation was rushed or made at the last minute. Really bad to see this as the first plot in the whole presentation. Also, I would have loved to see comparisons with Opus 4.1.
Edit: Opus 4.1 scores 74.5% (https://www.anthropic.com/news/claude-opus-4-1). This makes it sound like Anthropic released the upgrade to remain the leader on this important benchmark.
You would think that Springer did its due diligence here, but what is the value of a brand like Springer if they let this AI slop slip through the cracks?
This is an opportunity for brands to sell verifiability, i.e., that the content they are selling has been properly vetted, which was obviously not the case here.
Back when I was doing academic publishing I'd use a regex to find all the hyperlinks, then a script (written by a co-worker, thanks again Dan!) to determine if they were working or not.
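Something along those lines is easy to reproduce. A rough stdlib-only sketch; the regex and the HEAD-request check are my own assumptions, not the original script:

```python
# Rough sketch of a link checker: pull URLs out of a manuscript with a regex,
# then hit each one to see whether it still resolves.
import re
import sys
import urllib.error
import urllib.request

URL_RE = re.compile(r'https?://[^\s)>\]"]+')

def find_links(text: str) -> list[str]:
    return URL_RE.findall(text)

def link_works(url: str, timeout: float = 10.0) -> bool:
    # HEAD keeps it cheap; some servers reject HEAD, so treat this as a heuristic.
    req = urllib.request.Request(url, method="HEAD",
                                 headers={"User-Agent": "link-checker/0.1"})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status < 400
    except (urllib.error.URLError, TimeoutError):
        return False

if __name__ == "__main__":
    text = open(sys.argv[1], encoding="utf-8").read()
    for url in find_links(text):
        print(("OK   " if link_works(url) else "DEAD ") + url)
```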
In the past I've had GPT-4 output references with valid DOIs. The problem was that the DOIs were for completely different (and unrelated) works. So you'd need to retrieve the canonical title and authors for the DOI and cross-check them.
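That cross-check can be scripted against the Crossref API; a sketch of the idea, where the 0.8 similarity threshold is an arbitrary choice of mine for what counts as "close enough":

```python
# Sketch of the cross-check: look up a DOI's canonical metadata via Crossref
# and compare it to the title the LLM cited. Stdlib only.
import json
import urllib.request
from difflib import SequenceMatcher

def crossref_metadata(doi: str) -> dict:
    url = f"https://api.crossref.org/works/{doi}"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)["message"]

def doi_matches_citation(doi: str, cited_title: str, threshold: float = 0.8) -> bool:
    meta = crossref_metadata(doi)
    canonical_title = (meta.get("title") or [""])[0]
    similarity = SequenceMatcher(None, cited_title.lower(),
                                 canonical_title.lower()).ratio()
    return similarity >= threshold

if __name__ == "__main__":
    # A DOI that resolves but whose real title differs wildly from the
    # citation's title is exactly the failure mode described above.
    print(doi_matches_citation("10.1038/nature14539", "Deep learning"))
```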
I work on Veracity https://groundedai.company/veracity/ which does citation checking for academic publishers. I see stuff like this all the time in paper submissions. Publishers are inundated.
Not all journals require a DOI link for each reference. Most good ones do seem to have a system to verify the reference exists and is complete; I assume there’s some automation to that process but I’d love to hear from journal editorial staff if that’s really the case.
Why would one think that? All of the big journal publishers have had paper mills, fraudsters, and endless "tortured phrases" published under their names for a long, long time.
Taurine deficiency has been claimed to be a driver of aging [1]. The claim from the news article about it possibly being related to cancer seems like it needs a much stronger justification.