If language models take over text content, content creators will flee even quicker into creating video content. There's already a trend where younger people tend to prefer video for being more "genuine", and now it might become a sign of "human made" for a couple years. Also easier to monetize, and easier to build parasocial relationships, so all around a plus for creators. Too bad I prefer text.
I think the push to video and away from text is a net failure for accessibility and usability, at least for reference use cases.
My example: as a woodworker, I'm often curious about the details of a particular joint or the usage of a particular tool. The great amount of content on YouTube is helpful, but it's incredibly inefficient to have to seek through a bunch of filler or unrelated content to get the answer I need.
Of course, that's "increased engagement" so I'm not surprised it's what is commercially more viable.
That sounds remarkably similar to how recipes are shared in blogs. There's a huge amount of story, and then at the tail end there's the recipe. It's all for engagement, but I'm never engaged. If I'm looking for a recipe, I want to know the recipe so I can make it. I don't care about what the blogger did last weekend or in college.
> There's a huge amount of story, and then at the tail end there's the recipe. It's all for engagement, but I'm never engaged.
It's not about engagement, it's about copyright.
Recipes - in the form of lists of ingredients and the method - are not typically protected.
However, add a huge rambling story about how Grandma handed this recipe down to you when you were five and on holiday with her in $place, hey presto, it's protected.
It's not for engagement. Some sites now have a "Jump to recipe" button. It's for Google, which said that if you write normal text they will send you a ton of traffic. What people figured out was that unless you spam the recipe with keywords repeated at least 20 times, the Google bot will not understand what the text is about. Maybe Google was forced to do this, but that's how it works, and it contradicts how they said it works.
Google* how long to pressure cook white or brown rice and you’ll see widely differing answers. Like shots all over a dartboard. They can’t all be correct — it’s just rice.
I wonder if many of them care more about CPM rates and page visits than actual recipe accuracy.
*or Bing, DDG, Kagi, etc., if you prefer, although I haven’t tried them.
I would somewhat disagree with that.
My household eats rice on a daily basis and the timings for different kinds of rice vary wildly.
Basmati, Sona masuri, jasmine, risotto, jeera samba rice have very different water and rice measures. And that's just white rice!
Other rice variations are a whole different ball game.
I strongly recommend the books Cooking for Geeks and The Food Lab. In both books, the authors explore a variety of different approaches and show their math.
A second-order effect of this preference for video is how poorly video content gets indexed.
With text, searching for obscure things is cumbersome but possible. With video it's impossible.
Meaning I, as a user, cannot take the shortest path to my target content simply because of the medium.
I now default to looking for really old books on my topic of interest, or authoritative sources like textbooks and official documentation, and then skim and weed through them to get to a broader understanding. Very often this has led me on to better questions on that topic.
Online I prefer to look at search results from focussed communities: reddit, HN, StackOverflow, car forums, etc. I just never go to video for anything beyond recipes, quick fixes to broken appliances, and kids' videos.
I finally realized what actually bothers me about shopping physically vs online these days is (a) the lack of "sort by price, ascending" & (b) the lack of ability to get a reference or "fair" price for similar items.
Similarly, with video the key missing feature is deep search.
It's mind bogglingly sad YouTube didn't focus more on improving this after being acquired: they have all the components to build a solution! And it's a natural outgrowth of Google's dead tree book digitization efforts!
I assume it was harder than just relying on contextual signals (links and comment text) to classify for ad targeting purposes. That's probably also why they incentivized ~10 min videos over longer/shorter ones.
Which is sufficient for advertisers, but utterly useless for viewers.
It makes me cry that we're missing a future where I could actually get deep links to the portion of all videos that reference potatoes (or whatever).
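For what it's worth, the lookup side of this is not the hard part once timestamped transcripts exist. A rough Python sketch, assuming each video has already been transcribed into segments with a start time: the `Segment` type, the sample data, the `video.example` URL, and the `t=` offset parameter are all made up for illustration, not any platform's real API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One transcribed chunk of a video (hypothetical data model)."""
    video_id: str
    start_seconds: int
    text: str


def deep_links(segments, term, base_url="https://video.example/watch"):
    """Return a link to every segment whose transcript mentions `term`."""
    needle = term.lower()
    links = []
    for seg in segments:
        # Plain case-insensitive substring match; a real index would do far more.
        if needle in seg.text.lower():
            links.append(f"{base_url}?v={seg.video_id}&t={seg.start_seconds}s")
    return links


# Invented sample transcripts, purely for demonstration.
SEGMENTS = [
    Segment("abc123", 0, "Welcome back to the channel"),
    Segment("abc123", 95, "Now peel the potatoes and cut them evenly"),
    Segment("xyz789", 310, "Potatoes go in once the water boils"),
]

for link in deep_links(SEGMENTS, "potatoes"):
    print(link)
```

The matching here is deliberately naive. The point is only that a transcript with timestamps is enough to jump straight to the relevant part of a video.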
That actually seems like a great use case for AI; identify all videos about (topic), differentiate between high and low quality ones (as preferred by you or people similar to you), abstract the information into conceptual videos or schematic diagrams as you prefer.
May I suggest a simpler and smaller scope? An AI converting speech to text, extracting a bunch of still frames (or short video clips) as illustrations (where relevant), and turning it all into a good ol' readable article?
Then it can be fed to the search engines and those would do the rest of the job just fine.
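To make the smaller scope concrete, here is a minimal Python sketch of just the text half, assuming the speech-to-text step has already run and produced `(start_seconds, text)` pairs. The function names, the 60-second paragraph window, and the sample transcript are my own assumptions, and frame extraction is left out entirely.

```python
def format_timestamp(seconds):
    """Render a second count as mm:ss."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes:02d}:{secs:02d}"


def transcript_to_article(segments, window=60):
    """Group (start_seconds, text) pairs into paragraphs, one per time window."""
    paragraphs = []  # list of (paragraph_start, [sentences])
    current_bucket = None
    for start, text in sorted(segments):
        bucket = start // window
        if bucket != current_bucket:
            # First segment in a new window opens a new paragraph.
            paragraphs.append((start, [text]))
            current_bucket = bucket
        else:
            paragraphs[-1][1].append(text)
    return "\n\n".join(
        f"[{format_timestamp(start)}] {' '.join(parts)}"
        for start, parts in paragraphs
    )


# Invented transcript, purely for demonstration.
TRANSCRIPT = [
    (0, "Today we cut a dovetail joint."),
    (12, "Mark the tails first."),
    (75, "Now saw to the line."),
]

print(transcript_to_article(TRANSCRIPT))
```

Splitting on a fixed time window is crude; splitting on pauses or topic changes would read better, but even this output is plain text that an ordinary search engine can index.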
> That actually seems like a great use case for AI; identify all videos about (topic), differentiate between high and low quality ones (as preferred by you or people similar to you), abstract the information into conceptual videos or schematic diagrams as you prefer.
Q: Why would your $videoPlatformOfChoice allow a commercial AI bot to scrape boatloads of videos, abstract the information, then serve that information separately somewhere else .. possibly while serving their own ads(!)?
SponsorBlock is the response. It's a crowdsourced extension that labels parts of the video, like sponsor segments, highlights, intro/outro, etc. Very useful, you can skip through useless segments.
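The skipping logic itself is simple once the labels exist. A rough Python sketch of the idea, assuming labels arrive as `(start, end, category)` tuples in seconds: this is a made-up data shape and function for illustration, not SponsorBlock's actual API or code.

```python
def playable_ranges(duration, labels, skip=frozenset({"sponsor", "intro", "outro"})):
    """Return the (start, end) ranges left to play after dropping skipped categories."""
    skips = sorted((start, end) for start, end, category in labels if category in skip)
    keep = []
    cursor = 0
    for start, end in skips:
        if start > cursor:
            # Everything between the previous skip and this one is worth playing.
            keep.append((cursor, start))
        # max() handles overlapping or nested skip ranges.
        cursor = max(cursor, end)
    if cursor < duration:
        keep.append((cursor, duration))
    return keep


# Invented labels for a 10-minute video, purely for demonstration.
LABELS = [
    (0, 15, "intro"),
    (120, 180, "sponsor"),
    (300, 320, "highlight"),
    (570, 600, "outro"),
]

print(playable_ranges(600, LABELS))
```

Segments in categories that are not in `skip`, like the highlight above, stay inside the playable ranges.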
I prefer text too, but I feel like that's mostly because the videos are not information dense on purpose. They expand to whatever the YouTube algorithm prefers at the time, which is about 10 minutes now. Ironically, TikToks are more information dense, but the search is completely useless.
I think we're very close to the point that even video won't be confirmable to be genuine. If it could even really be said to be so now. (Instagram/TikTok are the most performative/contrived content platforms these days.)
Nope, there are already several services transcribing the audio content of video so expect that to be ingested too. You’ve seen the video suggestions with timestamps in google search right?
Oh, I'm aware of how well video transcription works. Once the lower-hanging fruit are dealt with, video content will absolutely flow into language models. But still, the video component is a key differentiator that AI can't easily mimic right now (at least not to a level where we can't tell). So users who want a personal opinion instead of GPT-generated text are likely to turn to consuming videos.
The digital world is the native environment for the AI race we're creating. In that world us biological humans are relatively slow and inferior. And if this "handing the intelligence baton to machines" trend continues then "regression" to our more native communication forms feels natural and inevitable.
That's some interesting insight. Thank you. When I read your comment, I was envisioning us all sitting around fires in caves with animal skin togas talking about the latest HN post (which presumably was Carl scribbling down something on the rock wall).
Good, the less I have to see of their clickbait and the more time my competitors waste watching videos the better. Video has its uses and when it's good it's very very good, but most of the time it's terrible dreck that steals people's time using cheap emotional manipulation.
I've been thinking about training an ML model to detect those 'Pick Me!' poster frames that highlight the e-celeb presenter making some kind of dramatic reaction face and just filter them out of search results. This is partly what happens when SEO types combine with black box algorithms; the lowest common denominator content starts to swamp everything else, a kind of weaponized reversion to the mean.
There are already custom AI avatars and text-to-speech, and there are already people using GPT to create text and then using other services to create the audio and dynamic videos at scale.
Exactly. Several of the highly ranked YouTube videos that were recommended to me recently were clearly made by some AI doing a mashup of imagery with some text spoken by some text-to-speech algorithm.
> "could it somehow get access to the subtitles and then use them to answer queries?"
It's not even necessary - computers are already excellent at understanding spoken words. Have you tried automatic captioning recently? Half the inputs to my phone are already voice, not text.
Video is a harder problem, but it's not too far behind.
Exactly, and many bots exist today to mine user videos for the automated subtitle information. In other words, there's no keeping GPT from learning from any kind of medium.