Hacker News

They analyse human perception too, in the form of videos.




But without any of the spatial and physical object perception that humans train from right after birth (watch toddlers playing), or the underlying wired infrastructure we are born with for understanding the physical world (there was an HN submission about that not long ago). Edit, found it: https://news.ucsc.edu/2025/11/sharf-preconfigured-brain/

They do not have a physical model of the world the way humans do. Ours is built on deep interaction with space and objects (which is why touching things is so important for babies), plus the preexisting wiring mentioned above.


Multimodal models have perception.

If a multimodal model were considered human, it would be diagnosed with multiple severe disabilities across its sensory systems.


