
SMERF: Streamable Memory Efficient Radiance Fields

barrkel
16 replies
22h26m

The mirror on the wall of the bathroom in the Berlin location looks through to the kitchen in the next room. I guess the depth-gauging algorithm uses parallax, and mirrors confuse it by seeming like windows. The kitchen has a blob of blurriness where the rear of the mirror intrudes into the kitchen, but you can see through the blurriness to either room.

The effect is a bit spooky. I felt like a ghost going through walls.

nightpool
12 replies
22h18m

The refrigerator in the NYC scene has a very slick specular lighting effect based on the angle you're viewing it from. If you go "into" the fridge, you can see it's actually generating a whole 3D scene of blurry grey and white colors that turn out to precisely mimic the light from the windows bouncing off the metal, and you can look "out" from the fridge into the rest of the room. Same with the full-length mirror in the bedroom in the same scene—there's a whole virtual "mirror room" that's been built out behind the mirror to give the illusion of depth as you look through it. Very cool and unique consequence of the technology.

pavlov
3 replies
22h3m

Wow, thanks for the tip. Fridge reflection world is so cool. Feels like something David Lynch might dream up.

A girl is eating her morning cereal. Suddenly she looks apprehensively at the fridge. Camera dollies towards the appliance and seamlessly penetrates the reflective surface, revealing a deep hidden space that exactly matches the reflection. At the dark end of the tunnel, something stirs... A wildly grinning man takes a step forward and screams.

throwaway17_17
2 replies
16h27m

Would you be offended if I animated that scene? It's really well described!

pavlov
0 replies
9h22m

Please feel free!

npace12
0 replies
15h14m

Please share if you do, that sounded spooky af

pjc50
1 replies
4h47m

Funnily enough, this is how reflections are usually emulated in game engines that do not support raytracing: another copy of the world behind the mirror. Also used in films in a few places (e.g. Terminator)
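For anyone curious what that trick looks like in practice, here is a minimal sketch in plain NumPy; the mirror plane and camera position are made-up values. Reflecting points (or, equivalently, the camera) through the mirror plane is what produces the "second copy of the world" behind the glass:

```python
import numpy as np

def reflect_across_plane(points, plane_point, plane_normal):
    """Reflect 3D points across a mirror plane given by a point and a unit normal.

    Engines without raytracing often draw reflections by rendering a second copy
    of the scene (or a mirrored camera) obtained with exactly this transform.
    """
    n = plane_normal / np.linalg.norm(plane_normal)
    signed_dist = (points - plane_point) @ n           # signed distance of each point to the plane
    return points - 2.0 * signed_dist[:, None] * n     # move each point to its mirror image

# Hypothetical mirror: the x = 0 plane. A camera at x = 2 maps to x = -2,
# i.e. into the "virtual room" behind the mirror that these scenes reconstruct.
camera = np.array([[2.0, 1.5, 0.5]])
print(reflect_across_plane(camera, plane_point=np.zeros(3), plane_normal=np.array([1.0, 0.0, 0.0])))
```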

nightpool
0 replies
2h56m

Please look at the refrigerator I mentioned—it's definitely not the classic "mirror world" reflection that you'd normally see in video games. I'm talking about the specular / metallic highlights on the fridge being simulated entirely with depth features.

deltaburnt
0 replies
21h40m

Mirror worlds are a pretty common effect you'll see in NeRFs. Otherwise you would need a significantly more complex view-dependent feature rendered onto a flat surface.

daemonologist
0 replies
22h1m

Neat! Here are some screenshots of the same phenomenon with the TV in Berlin: https://imgur.com/a/3zAA5K8

chpatrick
0 replies
21h3m

This happens with any 3D reconstruction. It's because a mirror is indistinguishable from a window into a mirrored room. The tricky thing is if there's actually something behind the mirror as well.

alkonaut
0 replies
8h9m

What does the reconstructed space look like when there are opposing mirrors? Will it just be a long corridor of ever-blurrier rooms?

TaylorAlexander
0 replies
21h49m

Oh wow, yeah. It's interesting because when I look at the fridge my eye maps it to "this is a reflective surface", which makes sense because that's true in the source images, but it's actually rendered as a cavity with the appropriate features rendered in 3D space. What's a strange feeling is to enter the fridge and then turn around! I just watched Hbomberguy's Patreon-only video on the video game Myst. In Myst the characters are trapped in books, and if you choose the wrong path at the end of the game you get trapped in one yourself; the view from inside the book looks very similar to the view from inside the NYC fridge!

Nevermark
0 replies
15h34m

Yes!

The barely-there reflection on the Berlin TV is also a trip to enter, and observe the room from.

rzzzt
0 replies
19h56m

You can also get inside the bookcase for the ultimate Matthew McConaughey experience.

rpastuszak
0 replies
8h59m

Try noclipping through the TV in the Berlin living room. It gets pleasantly creepy.

Zetobal
0 replies
19h59m

It has exactly the same drawbacks as photogrammetry with regard to highly reflective surfaces.

VikingCoder
7 replies
22h16m

Wow. Some questions:

Take, for instance, the fulllivingroom demo. (I prefer fps mode.)

1) How many images are input?

2) How long does it take to compute these models?

3) How long does it take to prepare these models for this browser, with all levels, etc?

4) Have you tried this in VR yet?

duckworthd
5 replies
20h23m

Glad you liked our work!

1) Around 100-150 if memory serves. This scene is part of the mip-NeRF 360 benchmark, which you can download from the corresponding project website: https://jonbarron.info/mipnerf360/

2) Between 12 and 48 hours, depending on the scene. We train on 8x V100s or 16x A100s.

3) The time for preparing assets is included in 2). I don't have a breakdown for you, but it's something like 50/50.

4) Nope! A keen hacker might be able to do this themselves by editing the JavaScript code. Open your browser's DevTools and have a look -- the code is all there!

duckworthd
2 replies
18h20m
nh2
1 replies
16h57m

What is the license? The repo doesn't say.

duckworthd
0 replies
11h21m

Oops, I need to update the license files.

Our code is released under the Apache 2.0 license, as in this repo: https://github.com/google-research/google-research/blob/mast...

dougmwne
1 replies
20h3m

Do you need position data to go along with the photos or just the photos?

For VR, there’s going to be some very weird depth data from those reflections, but maybe they would not be so bad when you are in headset.

duckworthd
0 replies
18h18m

Do you need position data to go along with the photos or just the photos?

Short answer: Yes.

Long answer: Yes, but it can typically be derived from the images themselves. Structure-from-motion methods are typically used to derive lens and position information for each photo in the training set. These are then used to train Zip-NeRF (our teacher) and SMERF (our model).
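For a concrete picture of that structure-from-motion step, here is a rough sketch of the standard COLMAP command-line pipeline driven from Python. The paths are placeholders, and the exact settings used for these captures may well differ:

```python
import os
import subprocess

# Placeholder paths; swap in your own capture directory. Requires COLMAP on PATH.
IMAGES = "capture/images"
DB = "capture/database.db"
SPARSE = "capture/sparse"
os.makedirs(SPARSE, exist_ok=True)

# Standard COLMAP structure-from-motion: detect features, match them across images,
# then jointly solve for camera intrinsics, poses, and a sparse point cloud.
# The recovered poses are what NeRF-style methods are trained against.
subprocess.run(["colmap", "feature_extractor", "--database_path", DB, "--image_path", IMAGES], check=True)
subprocess.run(["colmap", "exhaustive_matcher", "--database_path", DB], check=True)
subprocess.run(["colmap", "mapper", "--database_path", DB, "--image_path", IMAGES, "--output_path", SPARSE], check=True)
```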

vyrotek
0 replies
20h57m

Not exactly what you asked for. But I recently came across this VR example using Gaussian Splatting instead. Exciting times.

https://twitter.com/gracia_vr/status/1731731549886787634

https://www.gracia.ai

modeless
6 replies
20h43m

How long until you can stitch Street View into a seamless streaming NeRF of every street in the world? I hope that's the goal you're working towards!

deelowe
2 replies
19h30m

I read another article talking about what Waymo was working on, and this looks oddly similar... My understanding is that the goal is to use this to reconstruct 3D models from Street View images in real time.

duckworthd
1 replies
18h6m

Block-NeRF is a predecessor work that helped inspire SMERF, in fact!

https://waymo.com/research/block-nerf/

deelowe
0 replies
17h17m

Very cool. Thanks!

duckworthd
1 replies
20h21m

;)

modeless
0 replies
19h51m

Haha, too bad the Earth VR team was disbanded because that would be the Holy Grail. If someone can get the budget to work on that I'd be tempted to come back to Google just to help get it done! It's what I always wanted when I was building the first Earth VR demo...

xnx
0 replies
18h50m
edrxty
6 replies
18h32m

Is there any relation between this class of rendering techniques and the way the BD scenes in Cyberpunk 2077 were created? The behavior of the volume and the "voxels" seems eerily similar.

duckworthd
4 replies
18h29m

I can't say. I'm not familiar with BD in Cyberpunk.

edrxty
3 replies
18h25m

https://youtu.be/KXXGS3MGCro?t=118

It's a sort of replayable cutscene that happens a couple of times in the game, where you can wander through it. The noteworthy bit is that it's rendered out of voxels that look very similar to the demos, but at a much lower resolution, and if you push the frustum into any objects you get the same kind of effect where the surface breaks into blocks.

duckworthd
2 replies
18h10m

Interesting effect. It does look very voxel-y. I'm not a video game developer at heart, so I can only guess how it was implemented. I doubt NeRF models were involved, but I wouldn't be surprised if some sort of voxel discretization was.

promiseofbeans
1 replies
18h6m

It seems like it might even just be some kind of shader

vanderZwan
0 replies
9h41m

If you think about how they created this from the POV of the game-creation pipeline, then that probably is the way. If this is done by creating a shader on top of "plain old" 3D assets, then aside from the programmers/artists involved in creating that shader, everyone else can go about their business with minimal retraining. There probably was a lot of content to create, so that optimization likely took priority over other methods of implementing this effect.

9dev
0 replies
10h59m

I doubt Cyberpunk uses more than a special shader for the BD sequences, but what’s a lot more remarkable to me is how similar the idea is at heart. Maybe we’re actually going to see this (maybe sans the brain-implant to record them, but hey) after all. Amazing technology, that’s for sure.

nojvek
5 replies
14h20m

Holy mother of god. Wow!

Either Matterport takes this and runs with it, or this is a startup waiting to disrupt real estate.

I can’t believe how smooth this ran on my smartphone.

Feedback: if there were a mode to use the phone's compass and gyro for navigation, it'd feel natural. It felt weird to navigate with fingers and figure out how to move in the xyz dimensions.

As others have said, VR mode would be epic.

tobr
2 replies
5h45m

Is this really something the real estate market wants though? The point of using styled and meticulously chosen images is to entice people to visit the property in person. I think it’s hard to fall for a home because you saw it through virtual reality.

iandanforth
0 replies
4h51m

It is, yes. If you browse zillow you'll find many homes have 3D views attached. These are often image-sphere captures that you can painfully move through by clicking. While I agree that full res photos can be more appealing, the user experience with SMERF is so much better it might leave end users with a more positive feeling about a property and thus increase the chances of a sale.

hobofan
0 replies
4h1m

I think it’s hard to fall for a home because you saw it through virtual reality.

I think if you take this 1-2 steps further and combine it with hallucinating already-owned furniture, or furniture that matches the prospective buyer's taste, into the property, this will make it a lot easier to fall for a home.

duckworthd
1 replies
11h24m

Thanks for the feedback!

I agree, we could do better with the movement UX. A challenge for another day.

nojvek
0 replies
6h44m

Since the viewer is on GitHub, I’ll take it for a spin.

Are you accepting pull requests?

xnx
4 replies
18h53m

Does an open-source toolchain exist for capturing, processing, and hosting navigable 3D walkthroughs like this (e.g. something like an open-source Matterport)?

duckworthd
2 replies
18h39m

Not yet, as far as I'm aware. The current flow involves a DSLR for capture, COLMAP for camera parameter estimation, one codebase for training a teacher model, our codebase for training SMERF, and our web viewer for rendering models.

Sounds like an opportunity!

strofocles
1 replies
9h32m

Is there a significant advantage for capturing using DSLRs vs using the phone camera of a decent phone?

duckworthd
0 replies
6h26m

The big differences are access to fisheye lenses, a burst mode that can run for minutes at a time, and the ability to minimize the amount of camera post-processing. In principle, the capture could be done with a smartphone, but the experience of doing so is pretty time-consuming right now.

gorkish
0 replies
18h29m

You don't need a toolchain for capturing; you just need the data. Get it now; process it when better tools become available. There are guides for shooting for Photogrammetry and NeRF that are generally applicable to what you need to do.

guywithabowtie
4 replies
22h51m

Any plans to release the models ?

duckworthd
3 replies
22h36m

The pretrained models are already available online! Check out the "demo" section of the website. Your browser is fetching the model when you run the demo.

ilaksh
2 replies
19h41m

Will the code be released, or an API endpoint? Otherwise it will be impossible for us to use it for anything... Since it's Google, I assume it will just end up in a black hole like most of the research, or five years later some AI researchers will leave and finally create a startup.

duckworthd
1 replies
18h35m

I hope to release code in the new year, but it'll take a while. The codebase is heavily wired into other not-yet-open-sourced libraries, and it'll take a while to disentangle them.

ilaksh
0 replies
13m

That sounds terrific! I really appreciate your effort. It's amazing work and so great of you to share it.

catskul2
4 replies
21h14m

When might we see this in consumer VR? I'm surprised we don't already, but I suspected it was a computation constraint.

Does this relieve the computation constraint enough to run on Quest 2/3?

Is there something else that would prevent binocular use?

duckworthd
2 replies
20h21m

I can't predict the future, but I imagine soon: all of the tools are there. The reason we didn't develop for VR is actually simpler than you'd think: we just don't have the developer time! At the end of the day, only a handful of people actively wrote code for this project.

nojvek
1 replies
14h16m

Any plans to open source the code?

duckworthd
0 replies
11h14m

Yes, I hope so! But it'll take at least a few months of work. We have some tight dependencies to not-yet-open-sourced code, and until that's released, any code we put out will be dead on arrival.

In the meantime, feel free to explore the live viewer code!

https://github.com/smerf-3d/smerf-3d.github.io/blob/main/vie...

doctoboggan
0 replies
20h29m

I recently got a new quest and I am wondering the same thing. The fact that this is currently running in a browser (and can run on a mobile device) gives me hope that we will see something like this in VR sooner rather than later.

yarg
3 replies
20h23m

What I'm seeing from all of these things is very accurate single navigable 3D images.

What I haven't seen anything of is feature and object detection, blocking and extraction.

Hopefully a more efficient and streamable codec necessitates the sort of structure that lends itself more easily to analysis.

duckworthd
1 replies
18h9m

3D understanding as a field is very much in its infancy. Good work is being done in this area, but we've got a long ways to go yet. SMERF is all about "view synthesis" -- rendering realistic images -- with no attempt at semantic understanding or segmentation.

cooper_ganglia
0 replies
16h4m

"It's my VR-deployed SMERF CLIP model with LLM integration, and I want it now!"

It is funny how quickly goalposts move! I love to see progress though, and wow, is progress happening fast!

yorwba
0 replies
7h14m

You mean something like this? https://jumpat.github.io/SA3D/

Found by putting "nerf sam segment 3d" into DuckDuckGo.

twelfthnight
3 replies
19h40m

Hope this doesn't come across as snarky, but does Google pressure researchers to do PR in their papers? This really is cool, but there is a lot of self-promotion in this paper and very little discussion of limitations (and the discussion of them is bookended by qualifications as to why they really aren't limitations).

It makes it harder for me to trust the paper if I feel like the paper is trying to persuade me of something rather than describe the complete findings.

tomatotomato31
1 replies
19h35m

People are not allowed to be proud of their work anymore?

twelfthnight
0 replies
19h26m

Oh absolutely. I guess I just got the feeling reading this that there was more than the standard pride here and that there was professional PR going on. If no one else is getting that vibe I'm okay to accept it's just me.

duckworthd
0 replies
18h30m

I won't say too much about this, but the amount of buzz around articles these days is more of a "research today" sort of thing. Top conferences like CVPR receive thousands of submissions each year, and there's a lot of upside to getting your work in front of as many eyeballs as possible.

By no means do I claim that SMERF is the be-all-end-all of real-time rendering, but I do believe it's a solid step in the right direction. There are all kinds of ways to improve this work and others in the field: smaller representation sizes, faster training, higher quality, and fewer input images would all make this technology more accessible.

slalomskiing
3 replies
13h31m

I wonder, since this runs at a real-time framerate, whether it would be possible for someone to composite a regular rasterized frame on top of something like this (with correct depth testing) to make a game.

For example, a third-person game where the character you control and the NPCs/enemies are raster but the environment is all radiance fields.

duckworthd
1 replies
11h16m

This should absolutely be possible! The hard part is making it look natural: NeRF models (including SMERF) have no explicit materials or lighting. That means that any character inserted into the game will look out of place.
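As a minimal sketch of the depth-compositing idea under discussion, assuming both renderers can produce an RGB image plus a per-pixel depth map from the same camera (the array names here are hypothetical):

```python
import numpy as np

def composite(nerf_rgb, nerf_depth, raster_rgb, raster_depth):
    """Per-pixel depth test: keep whichever source is closer to the camera.

    nerf_* and raster_* are (H, W, 3) color and (H, W) depth arrays rendered
    from the same camera pose. A real integration would also need to reconcile
    lighting, since the radiance field bakes its illumination in.
    """
    raster_wins = raster_depth < nerf_depth             # rasterized character is in front
    return np.where(raster_wins[..., None], raster_rgb, nerf_rgb)
```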

vanderZwan
0 replies
9h40m

Why bother to make it look natural when you can have a really awkward greenscreen-like effect for nostalgic and "artistic" purposes?

jasonwatkinspdx
0 replies
13h25m

If you go to one of the demos the space bar will cycle through some debug modes. One shows a surface reconstruction. It comes from the usual structure from motion techniques I presume, so it's coarse and noisy, but I think the fundamental idea is viable.

promiseofbeans
3 replies
22h18m

It runs impressively well on my 2yo s21fe. It was super impressive how it streamed in more images as I explored the space. The tv reflections in the Berlin demo were super impressive.

My one note is that it took a really long time to load all the images - the scene wouldn't render until all ~40 initial images loaded. Would it be possible to start partially rendering as the images arrive, or do you need to wait for all of them before you can do the first big render?

duckworthd
2 replies
20h29m

Pardon our dust: "images" is a bad name for what's being loaded. Past versions of this approach (MERF) stored feature vectors in PNG images. We replace them with binary arrays. Unfortunately, all such arrays need to be loaded before the first frame can be rendered.

You do however point out one weakness of SMERF: large payload sizes. If we can figure out how to compress them by 10x, it'll be a very different experience!
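A toy illustration of the PNG-versus-binary-array distinction; the file names, shapes, and quantization below are made up, and the real asset format is defined by the viewer code linked elsewhere in this thread:

```python
import numpy as np
from PIL import Image

# MERF-style: feature planes packed into the channels of 8-bit PNGs.
# (Hypothetical file name and shape, for illustration only.)
png_plane = np.asarray(Image.open("plane_features_00.png"))   # e.g. (H, W, 4), uint8
png_features = png_plane.astype(np.float32) / 255.0           # dequantize to [0, 1]

# SMERF-style, as described above: the same data shipped as a raw binary array,
# skipping the PNG decode entirely. dtype/shape would come from scene metadata.
raw = np.fromfile("plane_features_00.bin", dtype=np.uint8)
bin_features = raw.reshape(png_plane.shape).astype(np.float32) / 255.0
```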

promiseofbeans
1 replies
18h8m

Or even just breaking them down into smaller chunks (prioritise loading the ones closer to where the user is looking) could help

duckworthd
0 replies
11h22m

The viewer biases towards assets closer to the user's camera (otherwise you'd have to load the whole scene!). We tried training SMERF with a larger number of smaller submodels, but at some point it becomes too onerous to train and quality begins to suffer.
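A simplified sketch of the "bias towards nearby assets" idea, with a made-up submodel layout; the real viewer's partitioning and streaming logic live in the linked JavaScript:

```python
import numpy as np

# Hypothetical submodel centers on a regular grid. SMERF partitions the scene
# into submodels; the exact layout here is invented for illustration.
submodel_centers = np.array(
    [[x, 0.0, z] for x in range(0, 10, 2) for z in range(0, 10, 2)], dtype=np.float32)

def active_submodel(camera_position):
    """Return the index of the submodel whose center is nearest the camera."""
    distances = np.linalg.norm(submodel_centers - camera_position, axis=1)
    return int(np.argmin(distances))

# As the camera moves, the viewer would prioritize fetching this submodel's assets.
print(active_submodel(np.array([3.0, 1.5, 7.0], dtype=np.float32)))
```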

cubefox
3 replies
18h56m

Very impressive! Any information on how this compares to 3D Gaussian splatting in terms of performance, quality or data size?

duckworthd
2 replies
18h38m

All these details and more in our technical paper! In short: SMERF training takes much longer, SMERF rendering is nearly as fast as 3DGS when a CUDA GPU is available, and quality is visibly higher than 3DGS on large scenes and slightly higher on smaller scenes.

https://arxiv.org/abs/2312.07541

zyang
1 replies
14h34m

Is it possible to use Zip-NeRF to train GS to eliminate the floaters?

duckworthd
0 replies
6h26m

Maybe! That's the seed of a completely different research paper :)

annoyingnoob
3 replies
21h47m

There is a market here for Realtors to upload pictures and produce walk-throughs of homes for sale.

ibrarmalik
1 replies
20h11m
duckworthd
0 replies
18h36m

Be careful with this one! Luma's offering requires that the camera follow the recorded video path. Our method lets the camera go wherever you desire!

esafak
0 replies
20h49m
SubiculumCode
3 replies
21h43m

I'm not sure why this demo runs so horribly in Firefox but not other browsers... anyone else having this?

daemonologist
1 replies
20h27m

Runs pretty well (20-100 fps depending on the scene) for me on both Firefox 120.1.1 on Android 14 (Pixel 7; smartphone preset) and Firefox 120.0.1 on Fedora 39 (R7 5800, 64 GB memory, RX 6600 XT; 1440p; desktop preset).

SubiculumCode
0 replies
19h36m

It seems that for some reason, my firefox is stuck in software compositor. I am getting:

WebRender initialization failed: Blocklisted; failure code RcANGLE(no compositor device for EGLDisplay)(Create)_FIRST

D3D11_COMPOSITING runtime failed: Failed to acquire a D3D11 device; Blocklisted; failure code FEATURE_FAILURE_D3D11_DEVICE2

I'm running a 3060

duckworthd
0 replies
18h33m

We unfortunately haven't tested our web viewer in Firefox. Let us know which platform you're running and we'll do our best to take a look in the new year (holiday vacation!).

In the meantime, give it a shot in a WebKit- or Chromium-based browser. I've had good results with Safari on iPhone and Chrome on Android/MacBook/Windows.

zeusk
2 replies
22h42m

Are radiance fields related to Gaussian splatting?

duckworthd
0 replies
22h37m

Gaussian Splatting is heavily inspired by work in radiance fields (or NeRF) models. They use much of the same technology!

corysama
0 replies
20h38m

Similar inputs, similar outputs, different representation.

sim7c00
2 replies
23h15m

This looks really amazing. I have a relatively old smartphone (2019) and it's really surprisingly smooth and high fidelity. Amazing job!

duckworthd
1 replies
22h37m

Thank you :). I'm glad to hear it! Which model are you using?

sim7c00
0 replies
21h46m

Samsung Galaxy S10e

nox100
2 replies
20h57m

Memory efficient? It downloaded 500 MB!

bongodongobob
1 replies
20h53m

A. Storage isn't memory

B. That's hardly anything in 2023.

duckworthd
0 replies
20h14m

Right-o. The web viewer is swapping assets in and out of memory as the user explores the scene. The network and disk requirements are high, but memory usage is low.

jacoblambda
2 replies
22h27m

Is there a relatively easy way to apply these kinds of techniques (either NeRFs or Gaussian splats) to larger environments, even if it's at lower precision? Like, say, a small town or a few blocks' worth of environment.

ibrarmalik
0 replies
19h50m

You’re under the right paper for doing this. Instead of one big model, they have several smaller ones for regions in the scene. This way rendering is fast for large scenes.

This is similar to Block-NeRF [0]; on their project page they show some videos of what you're asking about.

As for an easy way of doing this, nothing out-of-the-box. You can keep an eye on nerfstudio [1], and if you feel brave you could implement this paper and make a PR!

[0] https://waymo.com/intl/es/research/block-nerf/

[1] https://github.com/nerfstudio-project/nerfstudio

duckworthd
0 replies
20h15m

In principle, there's no reason you can't fit multiple city blocks at the same time with Instant NGP on a regular desktop. The challenge is in estimating the camera and lens parameters over such a large space. I expect such a reconstruction would be quite fuzzy given the low spatial resolution.

zyang
1 replies
15h58m

Why is there a 300 m^2 footprint limit if the sub-models are dynamically loaded? Is this constrained by training, rasterizing, or both?

duckworthd
0 replies
11h20m

In terms of the live viewer, there's actually no limit on footprint size. 300 m^2 is simply the biggest indoor capture we had!

yieldcrv
1 replies
19h35m

I had read about a competing technology that was suggesting NeRFs were a dead end,

but perhaps that was biased?

duckworthd
0 replies
18h3m

You're probably thinking of 3D Gaussian Splatting (3DGS), another fantastic approach to real-time novel view synthesis. There's tons of fantastic work being built on 3DGS right now, and the dust has yet to settle with respect to which method is "better". Right now, I can say that SMERF has slightly higher quality than 3DGS on small scenes, visibly higher quality on big scenes, and runs on a wider variety of devices, but takes much longer than 3DGS to train.

https://repo-sam.inria.fr/fungraph/3d-gaussian-splatting/

tomatotomato31
1 replies
20h14m

I'm following this through two minutes paper and I'm looking forward to using it.

My grandpa died 2 years ago and, in hindsight, I took pictures so I could use them in something like your demo.

Awesome thanks:)

duckworthd
0 replies
20h5m

It would be my dream to make capturing 3D memories as easy and natural as taking a 2D photo with your smartphone today. Someday!

smusamashah
1 replies
19h8m

This is very impressive, but given it's by Google, will some code ever be released?

duckworthd
0 replies
18h8m

I hope to release the code in the new year, but we have some big dependencies that need to be released first. In the meantime, you can already begin hacking on the live viewer: https://github.com/smerf-3d/smerf-3d.github.io/blob/main/vie...

rzzzt
1 replies
19h34m

What kind of modes does the viewer cycle through when I press the space key?

duckworthd
0 replies
18h37m

Nice discovery :). Check the developer console: it'll tell you.

refulgentis
1 replies
22h27m

This is __really__ stunning work, a huge, huge deal that I'm seeing this in a web browser on my phone. Congratulations!

When I look at the NYC scene at the highest quality on desktop, I'm surprised by how low-quality e.g. the stuff on the counter and shelves is. So then I load the Lego model and see that it's _very_ detailed, so it doesn't seem inherent to the method.

Is it a consequence of input photo quality, or something else?

duckworthd
0 replies
20h7m

This is __really__ stunning work

Thank you :)

Is it a consequence of input photo quality, or something else?

It's more a consequence of spatial resolution: the bigger the space, the more voxels you need to maintain a fixed resolution (e.g. 1 mm^3). At some point, we have to give up spatial resolution to represent larger scenes.

A second limitation is the teacher model we're distilling. Zip-NeRF (https://jonbarron.info/zipnerf/) is good, but it's not _perfect_. SMERF reconstruction quality is upper-bounded by its Zip-NeRF teacher.
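Some back-of-the-envelope arithmetic to make the spatial-resolution point concrete; the scene and voxel sizes below are illustrative, not the actual SMERF configuration:

```python
# Voxel counts needed to cover a cubic region at a fixed grid resolution.
# Numbers are illustrative only.
def voxel_count(extent_m, voxel_size_m):
    """Voxels needed to cover a cube of side `extent_m` at `voxel_size_m` resolution."""
    per_axis = extent_m / voxel_size_m
    return per_axis ** 3

small_object = voxel_count(1.0, 0.001)    # a 1 m object at 1 mm voxels: ~1e9 voxels
whole_room = voxel_count(10.0, 0.001)     # a 10 m room at the same 1 mm: ~1e12 voxels

# Keeping memory fixed instead forces the voxel size to grow with the scene,
# which is why fine detail on counters and shelves drops in larger captures.
print(f"{small_object:.1e} vs {whole_room:.1e}")
```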

mdrzn
1 replies
5h24m

Impressive is not a big enough statement! This is incredibly smooth on my phone and crazy good on a desktop pc. Keep it up!

duckworthd
0 replies
4h2m

Thank you :)

jerpint
1 replies
21h38m

Just ran this on my phone through a browser, this is very impressive

duckworthd
0 replies
20h15m

Thank you :)

heliophobicdude
1 replies
21h55m

Great work!!

Question for the authors, are there opportunities, where they exist, to not use optimization or tuning methods for reconstructing a model of a scene?

We are refining efficient ways of rendering a view of a scene from these models but the scenes remain static. The scenes also take a while to reconstruct too.

Can we still achieve the great look and details of RF and GS without paying for an expensive reconstruction per instance of the scene?

Are there ways of greedily reconstructing a scene with traditional CG methods into these new representations now that they are fast to render?

Please forgive any misconceptions that I may have in advance! We really appreciate the work y'all are advancing!

duckworthd
0 replies
20h17m

Are there opportunities, where they exist, to not use optimization or tuning methods for reconstructing a model of a scene?

If you know a way, let me know! Every system I'm aware of involves optimization in one way or another, from COLMAP to 3D Gaussian Splatting to Instant NGP and more. Optimization is a powerful workhorse that gives us a far wider range of models than a direct solver ever could.

Can we still achieve the great look and details of RF and GS without paying for an expensive reconstruction per instance of the scene?

In the future I hope so. We don't have a convincing way to generate 3D scenes yet, but given the progress in 2D, I think it's only a matter of time.

Are there ways of greedily reconstructing a scene with traditional CG methods into these new representations now that they are fast to render?

Not that I'm aware of! If there were, I think these works should be on the front page instead of SMERF.

germandiago
1 replies
18h47m

Amazing, impressive, almost unbelievable :O

duckworthd
0 replies
18h38m

Thank you!

fngjdflmdflg
1 replies
20h21m

Google DeepMind, Google Research, Google Inc.

What a variety of groups! How did this come about?

duckworthd
0 replies
18h36m

Collaboration is a thing at the Big G :)

durag
1 replies
22h13m

Any plans to do this in VR? I would love to try this.

duckworthd
0 replies
20h13m

Not at the moment but an intrepid hacker could surely extend our JavaScript code and put something together.

UPDATE: The code for our web viewer is here: https://github.com/smerf-3d/smerf-3d.github.io/blob/main/vie...

digdugdirk
1 replies
14h13m

Can you recommend a good entry point into the theory/math behind these? This is one of those true "wtf, we can do this now?" moments, I'm super curious about how these are generated/created.

duckworthd
0 replies
11h17m

Oof, there's a lot of machinery here. It depends a lot on your academic background.

I'd recommend starting with a tutorial on neural radiance fields, aka NeRF, (https://sites.google.com/berkeley.edu/nerf-tutorial/home) and an applied overview of Deep Learning with tools like PyTorch or JAX. This line of work is still "cutting edge" research, so a lot of knowledge hasn't been rolled up into textbook or article form yet.

blovescoffee
1 replies
21h57m

Since you're here @author :) Do you mind giving a quick rundown on how this competes with the quality of zip-nerf?

duckworthd
0 replies
19h45m

Check out our explainer video for answers to this question and more! https://www.youtube.com/watch?v=zhO8iUBpnCc

azinman2
1 replies
10h26m

Will there be any notebooks or other code released to train our own models?

duckworthd
0 replies
6h24m

I hope so, but it'll be a good while before we can release anything. We have tight dependencies on other not-yet-OSS libraries, and until they're released, ours won't work either.

asgerhb
1 replies
8h30m

Wow! What am I even looking at here? Polygons, voxels, or something else entirely? How were the benchmarks recorded?

duckworthd
0 replies
6h28m

You're looking at something called a "neural radiance field" backed by a sparse, low resolution voxel grid and a dense high resolution triplane grid. That's a bit of a word soup, but you can think of it like a glowing fog rendered with ray marching.

The benchmark details are a bit complicated. Check out the technical paper's experiment section for the nitty gritty details.
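To unpack "a glowing fog rendered with ray marching", here is a minimal, purely illustrative version of the standard NeRF volume-rendering quadrature along a single ray, with toy density and color functions standing in for the model's grid and triplane lookups:

```python
import numpy as np

def render_ray(origin, direction, density_fn, color_fn, near=0.1, far=6.0, n_samples=128):
    """Numerically integrate the volume-rendering equation along one ray.

    density_fn(x) -> sigma >= 0 and color_fn(x) -> RGB stand in for the model's
    feature lookups; here they are arbitrary toy functions.
    """
    t = np.linspace(near, far, n_samples)
    delta = np.diff(t, append=t[-1] + (t[1] - t[0]))               # spacing between samples
    xs = origin + t[:, None] * direction                           # sample positions along the ray
    sigma = np.array([density_fn(x) for x in xs])
    rgb = np.array([color_fn(x) for x in xs])
    alpha = 1.0 - np.exp(-sigma * delta)                           # per-sample opacity
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))  # light surviving to each sample
    weights = trans * alpha
    return (weights[:, None] * rgb).sum(axis=0)                    # accumulated color

# Toy scene: a fuzzy glowing ball of radius 1 at the origin.
density = lambda x: 5.0 if np.linalg.norm(x) < 1.0 else 0.0
color = lambda x: np.array([1.0, 0.6, 0.2])
print(render_ray(np.array([0.0, 0.0, -3.0]), np.array([0.0, 0.0, 1.0]), density, color))
```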

aappleby
1 replies
22h30m

Very impressive demo.

duckworthd
0 replies
19h45m

Thank you!

westurner
0 replies
17h45m

"Researchers create open-source platform for Neural Radiance Field development" (2023) https://news.ycombinator.com/item?id=36966076

NeRF Studio > Included Methods, Third-party Methods: https://docs.nerf.studio/#supported-methods

Neural Radiance Field: https://en.wikipedia.org/wiki/Neural_radiance_field

monlockandkey
0 replies
20h44m

Get this on a VR headset and you have a game changer literally.

RagnarD
0 replies
6h24m

I'm curious how the creators would compare this to the capabilities of Unreal Engine 5 (as far as the display technology goes.)