mdrzn · 2026-02-09T10:19:09 1770632349

Seems crazy to me that Y Combinator is funding these OpenClaw wrappers. Hype based on hype.

mdrzn · 2026-02-04T16:09:38 1770221378

Is it $0.003 per minute of audio uploaded, or per "compute minute"?

For example, fal.ai has a Whisper API endpoint priced at "$0.00125 per compute second", which (at 10-25x realtime) is far cheaper than all the competitors.
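To put the two billing units on the same footing, here is a rough sketch that converts a per-compute-second price into an effective price per minute of audio. The function name and the assumption that compute time is simply audio length divided by the realtime factor are mine, not anything a provider documents.

```python
def cost_per_audio_minute(price_per_compute_second: float, realtime_factor: float) -> float:
    """Effective price for one minute of audio when billing is per compute second.

    Assumes (hypothetically) that compute time = audio time / realtime_factor.
    """
    if realtime_factor <= 0:
        raise ValueError("realtime_factor must be positive")
    compute_seconds = 60.0 / realtime_factor  # seconds of compute per audio minute
    return price_per_compute_second * compute_seconds


if __name__ == "__main__":
    # $0.00125 per compute second and 10-25x realtime are the figures quoted above.
    for factor in (10, 25):
        price = cost_per_audio_minute(0.00125, factor)
        print(f"{factor}x realtime: ${price:.5f} per audio minute")
```

Under those assumptions the quoted rate comes out to roughly $0.0075 per audio minute at 10x and $0.0030 at 25x, so how it stacks up against a per-audio-minute price depends a lot on the speedup you actually get.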

Oras · 2026-02-04T16:30:51 1770222651

I think the point is having it for real-time; this is for conversations rather than transcribing audio files.

jamilton · 2026-02-04T17:52:51 1770227571

That quote was for the non-realtime model.

85392_school · 2026-02-05T22:33:32 1770330812

It can actually go much lower. Gemini costs around $0.01/hour of transcription last time I checked.

tgrowazay · 2026-02-05T05:08:45 1770268125

Both AWS and Mistral prices above are per minute of input audio.

Curiositry · 2026-02-06T19:27:51 1770406071

If Voxtral can process rapid speech as well as it claims to, an obvious cost optimization would be to speed up normal laconic speech to the maximum speed the model can handle accurately.
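A back-of-the-envelope sketch of that idea, assuming billing is strictly per minute of input audio and that accuracy is unaffected by the speedup (which is the open question). The $0.003 per minute figure is the one quoted earlier in the thread; the 1.5x factor is a made-up illustration.

```python
def billed_cost(audio_minutes: float, price_per_minute: float, speedup: float = 1.0) -> float:
    """Cost when audio is time-compressed by `speedup` before upload.

    Assumes the provider bills on the duration of the uploaded file.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return (audio_minutes / speedup) * price_per_minute


if __name__ == "__main__":
    original = billed_cost(60, 0.003)             # one hour at normal speed
    compressed = billed_cost(60, 0.003, 1.5)      # same hour played 1.5x faster
    print(f"normal: ${original:.3f}, sped up: ${compressed:.3f}")
```

The saving is just the inverse of the speedup factor, so it only pays off if the error rate does not climb at the faster tempo.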

mdrzn · 2026-02-04T16:03:32 1770221012

There's no comparison to Whisper Large v3 or other Whisper models.

Is it better? Worse? Why do they only compare to gpt4o mini transcribe?

tekacs · 2026-02-04T16:11:19 1770221479

WER is slightly misleading, but Whisper Large v3 WER is classically around 10%, I think, and 12% with Turbo.

The thing that makes it particularly misleading is that models that do transcription to lowercase and then use inverse text normalization to restore structure and grammar end up making a very different class of mistakes than Whisper, which goes directly to final form text including punctuation and quotes and tone.

But nonetheless, they're claiming such a lower error rate than Whisper that it's almost not in the same bucket.
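To illustrate why WER numbers are hard to compare across models with different output styles, here is a toy word error rate calculation using word-level edit distance, scored once on raw text and once after a simple normalization. The `normalize` step (lowercase, strip ASCII punctuation) is an assumption for illustration only, not how any benchmark mentioned here was scored.

```python
import string


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    prev = list(range(len(hyp) + 1))  # distances against an empty reference
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1] / len(ref)


def normalize(text: str) -> str:
    """Hypothetical normalization: lowercase and drop ASCII punctuation."""
    return text.lower().translate(str.maketrans("", "", string.punctuation))


if __name__ == "__main__":
    ref = "Hello, world. How are you?"
    hyp = "hello world how are you"
    print(wer(ref, hyp))                        # raw comparison
    print(wer(normalize(ref), normalize(hyp)))  # after normalization
```

The same transcript scores 0.8 on the raw strings and 0.0 after normalization, which is why the scoring pipeline matters as much as the model when reading a single WER figure.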

tekacs · 2026-02-04T16:12:00 1770221520

On the topic of things being misleading, GPT-4o transcriber is a very _different_ transcriber to Whisper. I would say not better or worse, despite being characterized as such. So it is a little difficult to compare on just the numbers.

There's a reason that quite a lot of good transcribers still use V2, not V3.

satvikpendem · 2026-02-04T16:41:08 1770223268

Different how?

GaggiX · 2026-02-04T16:07:35 1770221255

Gpt4o mini transcribe is better and actually realtime. Whisper is trained to encode the entire audio (or at least 30s chunks) and then decode it.

mdrzn · 2026-02-04T16:10:28 1770221428

So "gpt4o mini transcribe" is not just whisper v3 under the hood? Btw it's $0.006 / minute

For Whisper API online (with v3 large) I've found "$0.00125 per compute second", which is the absolute cheapest I've ever found.

GaggiX · 2026-02-04T16:13:00 1770221580

>So it's not just whisper v3 under the hood?

Why should it be Whisper v3? They even released an open model: https://huggingface.co/mistralai/Voxtral-Mini-4B-Realtime-26...

breisa · 2026-02-04T19:05:21 1770231921

Deepinfra offers Whisper V3 at $0.00045 / minute of transcribed audio.

emmettm · 2026-02-04T16:09:51 1770221391

The linked article claims the average word error rate for Voxtral mini v2 is lower than GPT-4o mini transcribe

GaggiX · 2026-02-04T16:11:11 1770221471

Gpt4o mini transcribe is better than whisper, the context is the parent comment.

mdrzn · 2026-02-02T18:33:09 1770057189

Maybe it's because I'm not used to the flow, but I prefer to work directly on the machine where I'm logged in via ssh, instead of working "somewhere in a git tree", and then have to deploy/test/etc.

Once this app (or a similar app by Anthropic) allows me to have the same level of "orchestration" but on a remote machine, I'll test it.

indigodaddy · 2026-02-03T02:02:48 1770084168

Not going to solve your exact problem but I started this project with this approach in mind

https://github.com/jgbrwn/vibebin

mdrzn · 2026-01-28T10:04:18 1769594658

"we're releasing our new model": is it downloadable and runnable locally? Could I create a "vTuber" persona with this model?

andrew-w · 2026-01-28T15:33:29 1769614409

We have not released the weights, but it is fully available to use in your websites or applications. I can see how our wording there could be misconstrued -- sorry about that. You can absolutely create a vTuber persona. The link in the post is still live if you want to create one (as simple as uploading an image, selecting a voice, and defining the personality). We even have a prebuilt UI you can embed in a website, just like a youtube video.

mdrzn · 2026-01-26T16:30:38 1769445038

Posted 5 times in the last 7 days, today it finally got 29 points with 0 comments? Weird.

mythz · 2026-01-26T16:35:55 1769445355

Most announcements slip through without notice; a post only picks up votes once it hits the main page.

v1 also took a while to make it to HN. v3 is a complete rewrite focused on extensibility with a lot more new features.

digiown · 2026-01-26T16:49:18 1769446158

The few people looking at /new on HN are ridiculously overpowered. A few upvotes from them in the first few hours will get you to the front page, and just 1-2 downvotes will make your post never see the light of day.

freedomben · 2026-01-26T17:03:10 1769446990

You can't downvote a post, so that's not a factor.

Also it's not as powerful as you think. In the past I have spent a lot of time looking at /new, and upvoting stories that I think should be surfaced. The vast majority of them still never hit near the front page.

It's a real shame, because some of the best and most relevant submissions don't seem to make it.

tuhgdetzhh · 2026-01-26T17:22:18 1769448138

If you are at a company like ClickHouse and share a new HN submission about ClickHouse in #general on the internal Slack, you easily get enough upvotes for the front page.

oceansweep · 2026-01-26T17:20:58 1769448058

You can absolutely downvote posts. You have to have a certain amount of karma before the option becomes available.

digiown · 2026-01-26T17:39:39 1769449179

No I was wrong. You can't downvote posts. Flags are used instead, apparently.

freedomben · 2026-01-27T17:12:55 1769533975

Yes, and I will fully agree with you that flags are overpowered. That system does need to be re-worked IMHO.

nebezb · 2026-01-26T21:12:00 1769461920

freedomben has 28k karma. I don’t think the downvote button is coming.

lukan · 2026-01-27T00:10:49 1769472649

What is stopping you from joining those "ridiculously overpowered people"?

mdrzn · 2026-01-22T10:57:42 1769079462

Why wouldn't I be able to fix these things? If I managed to build a thing from scratch (with Opus 4.5), I don't see why I wouldn't be able to fix it and maintain it in the future (maybe with Opus 4.7 or even better future models?).

mdrzn · 2026-01-22T10:56:45 1769079405

Which is exactly why whenever I have an idea I just tinker with ClaudeCode for an hour or so until I have exactly what I need. It takes less time than trying to compare 10 similar products, none of which have the exact specifications or features that I need.

List of projects mentioned before: https://news.ycombinator.com/item?id=46716805

mdrzn · 2026-01-22T09:01:41 1769072501

Tens of small/one-time apps or scripts that I needed done, and Claude provided them in seconds.

A few medium to big projects:

- a scraper of product pricing in shops near me, to track inflation over time

- a clone of Typeform, but more customized to my needs

- end-to-end automation of managing Facebook ad campaigns (create/track/scale)

- a dashboard to automate managing comments on multiple Facebook pages

- a classic polymarket bot

- an in-browser PDF editor so all my data stays local

- a landing page generator for ecommerce, just give the product description

- a slideshow generator using nanobananapro

- an infinite canvas to work in to generate images, with nodes

- agent automations to test AI voice agents in calls

Anything that comes to mind I can set up and deploy in a few minutes.

Nothing groundbreaking but it's all stuff that I didn't know how to do before, and now I know how to build/maintain/backup/upgrade with ClaudeCode. I know most senior devs would say "well this was all doable before" but they forget that not everyone had all the necessary skills to do all this stuff. Now it's a one man job.

Etcpasswd · 2026-01-22T11:28:47 1769081327

I am using an Opus -> Native app flow to build the apps I need quickly. It's crazy how much I don't know while still getting stuff done.

alexb87 · 2026-01-25T05:01:22 1769317282

Are you offering services in this area?

mdrzn · 2026-01-25T13:19:02 1769347142

Sure, my email is in my profile! Or you can find my LinkedIn from my username.

mdrzn · 2026-01-20T09:13:06 1768900386

"my experience from 5 years of coding with AI": I immediately disregarded the rest of TFA.