I've seen quite a few NeRF demos so far, but all of them appear to be little more than slightly distorted and remixed input images. In these examples, too, the water looks like plastic, and the NeRF clearly cannot reproduce the colorful grass, instead turning it into one large smooth surface.
But without transparency effects, all of this can be rendered fast and efficiently using geometric textures - i.e. what Unreal Engine 5 uses. Here's what "traditional" real-time rendering looks like:
https://www.youtube.com/watch?v=PBktSo0bXas
Compared to that, I find "AI rendering" which is blurry and much slower (15fps @ 800px) somewhat underwhelming.
In case anyone here is working on AI rendering, I'd suggest that you focus on transparent glass. That's where all game engines slow down 2x to 4x because of the need to render things in multiple passes. So if you can make fake glass look nice, that's when AI rendering becomes a useful improvement over the state of the art.
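The multi-pass cost comes from the fact that transparency blending is order-dependent: engines have to sort (or depth-peel) transparent surfaces and composite them back-to-front with the "over" operator. A minimal numpy sketch of that blending step, with made-up layer colors and alphas purely for illustration:

```python
import numpy as np

def composite_over(layers):
    """Blend transparent layers back-to-front with the 'over' operator.

    layers: list of (rgb, alpha) pairs ordered farthest to nearest.
    Each nearer layer is blended on top of the accumulated color:
        out = alpha * src + (1 - alpha) * dst
    """
    out = np.zeros(3)  # black background
    for rgb, alpha in layers:  # farthest first
        out = alpha * np.asarray(rgb, dtype=float) + (1.0 - alpha) * out
    return out

# Two panes of glass over a black background:
# a red pane (alpha 0.5) behind a blue pane (alpha 0.5).
color = composite_over([
    (np.array([1.0, 0.0, 0.0]), 0.5),  # far pane
    (np.array([0.0, 0.0, 1.0]), 0.5),  # near pane
])
```

Because the result changes if the two panes are swapped, the engine can't just rasterize transparent geometry in arbitrary order the way it can with opaque geometry - hence the extra sorting/peeling passes.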
As for the scenes in this demo, I'm pretty sure that structure from motion + mesh reduction + normal mapping can reconstruct the original images almost pixel-perfect, and at 200+ fps on modern GPUs. So there's simply not much benefit to having a neural raymarcher over a highly optimized raycaster.
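For the "mesh reduction" step in that pipeline, one simple classical technique is vertex clustering: snap vertices to a coarse grid and merge everything that lands in the same cell. A toy numpy sketch (the voxel size and the random point cloud are illustrative, not tuned for any real scan):

```python
import numpy as np

def decimate_by_clustering(vertices, voxel_size):
    """Crude mesh-reduction step via vertex clustering: vertices that fall
    into the same voxel cell are merged into their average position.

    vertices: (N, 3) float array. Returns an (M, 3) array with M <= N.
    """
    cells = np.floor(vertices / voxel_size).astype(np.int64)
    # One representative (cell-average) position per occupied cell.
    _, inverse = np.unique(cells, axis=0, return_inverse=True)
    inverse = inverse.ravel()
    m = inverse.max() + 1
    sums = np.zeros((m, 3))
    counts = np.zeros(m)
    np.add.at(sums, inverse, vertices)
    np.add.at(counts, inverse, 1)
    return sums / counts[:, None]

# 1000 random points in a unit cube collapse to at most 4^3 = 64 representatives.
pts = np.random.default_rng(0).random((1000, 3))
reduced = decimate_by_clustering(pts, voxel_size=0.25)
```

Production tools use smarter error-driven methods (e.g. quadric edge collapse), but the idea is the same: trade geometric detail for triangle count, then recover the lost detail with normal maps.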
> I'm pretty sure that structure from motion + mesh reduction + normal mapping can reconstruct the original images almost pixel-perfect
This is actually incorrect; NeRF is the SotA method on these novel view synthesis benchmarks. If you could get better results using meshes, you’d be able to publish a paper and blow all these methods out of the water!
That said, as you point out, there are still major limitations, particularly rendering speed. But the technique is very new and progressing rapidly.
I'm not going to mince words: NeRF is rubbish compared to traditional photogrammetry.
> NeRF is the SotA method for these novel view synthesis benchmarks
That's an interesting statement; uh, I don't know what 'novel view synthesis benchmarks' you're referring to, but the parent post didn't mention them, and like me, probably doesn't care what they are.
If the state of the art is an 800x800 pixel image... uh, well, bluntly, that's really very unimpressive.
> Compared to that, I find "AI rendering" which is blurry and much slower (15fps @ 800px) somewhat underwhelming.
^ This.
It's very much a 'watch this space' technology, because it does have some really interesting and promising features, and it's changing quickly, but I think finding it 'somewhat underwhelming' is a pretty fair response.
No, I'm sorry but this blows photogrammetry out of the water. The original NeRF paper photorealistically handled complex occlusions (like foliage) and reflective and refractive caustics. No other technique comes close. That is the entire reason it's interesting, and believe it or not there are practical applications for it right now. Forget gaming - this lets you capture lightfields for VR with a cell phone in 5 minutes. And if the NeRFs themselves can be rendered in real time, it solves the problem of compressed light field scene representation. Buckle up for photorealistic VR.
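For anyone wondering why NeRF handles foliage and semi-transparent structure where mesh reconstruction fails: it renders by integrating density along each ray rather than intersecting a single surface. A minimal numpy sketch of the volume-rendering quadrature from the NeRF paper, with made-up sample densities, colors, and spacings:

```python
import numpy as np

def render_ray(sigmas, rgbs, deltas):
    """Volume rendering quadrature as in the NeRF paper:
        alpha_i = 1 - exp(-sigma_i * delta_i)
        T_i     = prod_{j<i} (1 - alpha_j)      (transmittance so far)
        C       = sum_i T_i * alpha_i * rgb_i   (final pixel color)
    """
    sigmas = np.asarray(sigmas, dtype=float)
    deltas = np.asarray(deltas, dtype=float)
    rgbs = np.asarray(rgbs, dtype=float)
    alphas = 1.0 - np.exp(-sigmas * deltas)
    # Transmittance: how much light survives to reach each sample.
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))
    weights = trans * alphas  # per-sample contribution to the pixel
    color = (weights[:, None] * rgbs).sum(axis=0)
    return color, weights

# Three samples along one ray: empty space, then a dense red surface.
color, w = render_ray(
    sigmas=[0.0, 50.0, 50.0],
    rgbs=[[0, 0, 0], [1, 0, 0], [1, 0, 0]],
    deltas=[0.1, 0.1, 0.1],
)
```

Because every sample can contribute fractionally, thin occluders and soft boundaries fall out of the representation for free; a mesh has to commit to a hard surface.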
I just gave you one. You can now cheaply and rapidly capture dense lightfields of highly specular objects for VR display. Get yourself a camera array (100 cameras is not infeasible!) and you can capture them instantaneously. That's totally game changing compared to the current state of the art of scanning camera gantries (slow) or photogrammetry (fails on complex or highly specular geometry).
If you're asking me for an example of it being publicly used in production, well I think you're asking a lot considering the technique is only a few months old.
> If you're asking me for an example of it being publicly used in production, well I think you're asking a lot considering the technique is only a few months old
That is what I explicitly asked for.
Your failure to provide an example is not because it's new; it's because it's actually not practically useful at the moment.
NeRF has been around since March 2020 (https://arxiv.org/abs/2003.08934); you are simply wrong. Traditional techniques are better right now, have better implementations, and are widely used.
NeRF is a promising technology that is categorically worse in its current implementation and maturity.
The big fallacy with AI research is that people treat it as "completely new", so in their mind it doesn't make sense to compare the AI to traditional methods. But most traditional methods have also been created by highly advanced intelligences ... us humans.
FYI we once got 1st place on the Sintel AI benchmark in "Clean & EPE matched" with a 2004 paper... By now it's down to 10th place, but AI is by no means far ahead of traditional methods.
As for "novel view synthesis benchmarks", photogrammetry is used in many hollywood productions for virtual actors and/or for virtual environment destruction. In my opinion, having Hollywood use your technique for billion-dollar blockbuster movies is probably a hint that it works well in practice ;)
I don't think "remixing source images" is a fair characterization of this research. Finding pixel-accurate synthetic images is actually super hard.
That said, yes, it's odd they're positioning this as a way to extend game engines, when this entire line of research is interesting (to me) for its potential to be a generalized holographic codec. I'm not qualified to evaluate whether this is generalizable, but I would hope these results also indicate a possible way forward for camera-derived footage, where there are no acceptable approaches. But to your point:
>As for the scenes in this demo, I'm pretty sure that structure from motion + mesh reduction + normal mapping can reconstruct the original images almost pixel-perfect, and at 200+ fps on modern GPUs.
I don't agree; we've seen attempts to do generalized photogrammetry as a holographic codec, and the results are poor. Microsoft's HoloCapture is probably the best commercial example, and, well... it ain't pretty. (https://holocap.com/) I think the target should be visual fidelity at parity with good 2D codecs. So there's a long way to go.
The paucity of data sets may be underselling how hard this is and how advanced NeRF (and other approaches like MPI) are relative to advanced photogrammetry. In particular, approaches that extend to video and work well with the human face are very desirable, but there aren't a lot of datasets to test against. And yet these approaches _are_ significant improvements over photogrammetry; I'm not at liberty to share, but I've seen a demo comparing pure photogrammetry to a neural renderer, and the difference is night and day.
But let me agree with you: yes, we should throw these new representations at difficult subject matter, including reflections, translucency, refraction, subsurface scattering, etc, and see how they do. The original NeRF paper had some wild results for refraction, for example, which hadn't been demonstrated before. And when possible, I'd love to see them with video images of people.
I think we're only a few years away from a functional holographic codec, which is pretty exciting.
https://www.russian3dscanner.com/wrap4d/
is popular with Hollywood and it used to cost $10k+ before Epic Games purchased it. And it just happens to be amazing with animated faces ;)
That said, I agree with you in general that NeRF could become amazing if we can get it to work with difficult data sets where photogrammetry breaks down. That's why I find it sad to see them demo it on datasets where photogrammetry is known to work well.
Do you have any good references on SFM reconstruction? I spent a fair while playing around with various commercial and open source offerings (Alicevision Meshlab, RealityCapture, a few others I don't remember) a year or two ago and came away quite disillusioned. They look great with the carefully selected and optimized demo datasets but if you go out yourself and take a few hundred photos of anything but a nice convex boulder, the end result is almost always a complete mess.