Is it 0.003 per minute of audio uploaded, or "compute minute"?
For example, fal.ai has a Whisper API endpoint priced at "$0.00125 per compute second", which (at 10-25x realtime) is far cheaper than all the competitors.
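To make the two billing models comparable, here is a rough sketch that converts a per-compute-second price into an effective price per minute of audio. The function name is made up for illustration, and it assumes the realtime factor is simply audio seconds divided by compute seconds, with no minimum charge or startup overhead.

```python
def cost_per_audio_minute(price_per_compute_second: float, realtime_factor: float) -> float:
    """Effective price for one minute of audio under compute-second billing.

    Assumption: realtime_factor = audio seconds processed per compute second,
    and billing has no minimum charge or cold-start overhead.
    """
    if realtime_factor <= 0:
        raise ValueError("realtime_factor must be positive")
    compute_seconds = 60.0 / realtime_factor  # compute time needed for 60 s of audio
    return price_per_compute_second * compute_seconds


if __name__ == "__main__":
    # Figures quoted in the comment above: $0.00125 per compute second, 10-25x realtime.
    for factor in (10, 25):
        print(f"{factor}x realtime: ${cost_per_audio_minute(0.00125, factor):.5f} per audio minute")
```

The result depends heavily on the realtime factor, so which reading of "per minute" applies matters a lot for any comparison.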
If Voxtral can process rapid speech as well as it claims to, an obvious cost optimization would be to speed up normal laconic speech to the maximum speed the model can handle accurately.
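As a back-of-the-envelope sketch of that optimization: if billing is per minute of uploaded audio, speeding the audio up by some factor shrinks the billed duration by the same factor. The function name and the 1.5x factor below are hypothetical, and the sketch assumes accuracy does not degrade at that speed, which is exactly the part that would need testing.

```python
def billed_cost(audio_minutes: float, price_per_audio_minute: float, speedup: float = 1.0) -> float:
    """Cost of transcribing audio that was time-compressed before upload.

    Assumptions: billing is per minute of uploaded audio, and `speedup`
    (e.g. 1.5 means 1.5x playback speed) does not hurt transcription accuracy.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    uploaded_minutes = audio_minutes / speedup  # sped-up audio is shorter
    return uploaded_minutes * price_per_audio_minute


if __name__ == "__main__":
    # One hour of speech at the 0.003 per-minute figure mentioned above.
    print(billed_cost(60, 0.003))               # original speed
    print(billed_cost(60, 0.003, speedup=1.5))  # hypothetical 1.5x speed-up
```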
WER is slightly misleading, but Whisper Large v3 WER is classically around 10%, I think, and 12% with Turbo.
The thing that makes it particularly misleading is that models that do transcription to lowercase and then use inverse text normalization to restore structure and grammar end up making a very different class of mistakes than Whisper, which goes directly to final form text including punctuation and quotes and tone.
But nonetheless, they're claiming such a lower error rate than Whisper that it's almost not in the same bucket.
On the topic of things being misleading, GPT-4o transcriber is a very _different_ transcriber to Whisper. I would say not better or worse, despite characterizations as such. So it is a little difficult to compare on just the numbers.
There's a reason that quite a lot of good transcribers still use V2, not V3.
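To illustrate the point above about WER depending on output style, here is a minimal sketch of word error rate (word-level edit distance divided by reference length), scored with and without a simple normalization step. The example sentences and the `normalize` helper are invented for illustration. Real benchmarks use more elaborate normalizers, so treat the numbers as a toy.

```python
import string


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    prev = list(range(len(hyp) + 1))  # distances from an empty reference prefix
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, substitution))
        prev = cur
    return prev[-1] / len(ref)


def normalize(text: str) -> str:
    """Toy normalizer (an assumption): lowercase and strip ASCII punctuation."""
    return text.lower().translate(str.maketrans("", "", string.punctuation))


if __name__ == "__main__":
    reference = 'She said, "Stop!" and left.'  # final-form text with punctuation
    hypothesis = "she said stop and left"      # lowercase, unpunctuated output
    print("raw WER:       ", wer(reference, hypothesis))
    print("normalized WER:", wer(normalize(reference), normalize(hypothesis)))
```

The same transcript scores very differently depending on whether casing and punctuation count, which is why a single WER figure is hard to compare across models with different output styles.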
Maybe it's because I'm not used to the flow, but I prefer to work directly on the machine where I'm logged in via ssh, instead of working "somewhere in a git tree" and then having to deploy/test/etc.
Once this app (or a similar app by Anthropic) allows me to have the same level of "orchestration" but on a remote machine, I'll test it.
We have not released the weights, but it is fully available to use in your websites or applications. I can see how our wording there could be misconstrued -- sorry about that. You can absolutely create a vTuber persona. The link in the post is still live if you want to create one (as simple as uploading an image, selecting a voice, and defining the personality). We even have a prebuilt UI you can embed in a website, just like a YouTube video.
The few people looking at /new on HN are ridiculously overpowered. A few upvotes from them in the first few hours will get you to the front page, and just 1-2 downvotes will make your post never see the light of day.
You can't downvote a post, so that's not a factor.
Also it's not as powerful as you think. In the past I have spent a lot of time looking at /new, and upvoting stories that I think should be surfaced. The vast majority of them still never hit near the front page.
It's a real shame, because some of the best and most relevant submissions don't seem to make it.
If you are at a company like ClickHouse and share a new HN submission about ClickHouse in #general on the internal Slack, you easily get enough upvotes for the front page.
Why wouldn't I be able to fix these things? If I managed to build a thing from scratch (with Opus 4.5), I don't see why I wouldn't be able to fix it and maintain it in the future (maybe with Opus 4.7 or even better future models?).
Which is exactly why, whenever I have an idea, I just tinker with ClaudeCode for an hour or so until I have exactly what I need. It takes less time than trying to compare 10 similar products, none of which have the exact specifications or features that I need.
Tens of small/one-time apps or scripts that I needed done, and Claude provided them in seconds.
A few medium to big projects:
- a scraper of product pricing in shops near me, to track inflation over time
- a clone of Typeform, but more customized to my needs
- end-to-end automation of managing Facebook ad campaigns (create/track/scale)
- a dashboard to automate managing comments on multiple Facebook pages
- a classic polymarket bot
- an in-browser PDF editor, so all my data stays local
- a landing page generator for ecommerce, just give the product description
- a slideshow generator using nanobananapro
- an infinite canvas to work in to generate images, with nodes
- agent automations to test AI voice agents in calls
Anything that comes to mind I can set up and deploy in a few minutes.
Nothing groundbreaking, but it's all stuff that I didn't know how to do before, and now I know how to build/maintain/backup/upgrade with ClaudeCode. I know most senior devs would say "well this was all doable before" but they forget that not everyone had all the necessary skills to do all this stuff. Now it's a one-man job.