A PyTorch implementation of the text-to-3D model Dreamfusion, powered by the Stable Diffusion text-to-2D model.

Demo videos: text-to-3d.mp4, Image-to-3d-0123.mp4

Colab notebooks:

Update Logs

Added support for Perp-Neg to alleviate the multi-head problem in Text-to-3D. Perp-Neg is supported for both Stable Diffusion and DeepFloyd-IF. Added support for DeepFloyd-IF as the guidance model. Improved Image-to-3D quality and added support for the Image + Text condition of Make-it-3D.

Notable differences from the paper

This project is a work-in-progress and differs from the paper in many ways. The current generation quality cannot match the results from the original paper, and many prompts still fail badly!

Since the Imagen model is not publicly available, we use Stable Diffusion to replace it (implementation from diffusers). Unlike Imagen, Stable Diffusion is a latent diffusion model: it diffuses in a latent space instead of the original image space. Therefore, the loss must also propagate back through the VAE's encoder, which adds extra time cost to training.

We use the multi-resolution grid encoder to implement the NeRF backbone (implementation from torch-ngp), which enables much faster rendering (~10 FPS at 800x800).
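To make the idea of a multi-resolution grid encoder more concrete, here is a minimal sketch in pure Python. It is not the torch-ngp implementation: it assumes a 2D input instead of 3D, uses small dense grids instead of hash tables, and rounds the per-level resolutions to the nearest integer. The names `level_resolutions`, `DenseGridLevel`, and `MultiResGridEncoder` are hypothetical and chosen only for illustration.

```python
import math
import random


def level_resolutions(num_levels, base_res, max_res):
    """Grid resolutions growing geometrically from base_res to max_res."""
    if num_levels == 1:
        return [base_res]
    growth = math.exp((math.log(max_res) - math.log(base_res)) / (num_levels - 1))
    # Rounding is an assumption of this sketch; it also absorbs float error.
    return [round(base_res * growth ** level) for level in range(num_levels)]


class DenseGridLevel:
    """One grid level: a feature vector stored at every vertex of a 2D grid."""

    def __init__(self, resolution, feature_dim, rng):
        self.resolution = resolution
        n = resolution + 1  # number of vertices per axis
        self.features = [
            [[rng.uniform(-1e-4, 1e-4) for _ in range(feature_dim)] for _ in range(n)]
            for _ in range(n)
        ]

    def lookup(self, x, y):
        """Bilinearly interpolate vertex features at (x, y) in [0, 1]^2."""
        gx = min(max(x, 0.0), 1.0) * self.resolution
        gy = min(max(y, 0.0), 1.0) * self.resolution
        # Clamp the cell index so the upper neighbour always exists.
        x0 = min(int(gx), self.resolution - 1)
        y0 = min(int(gy), self.resolution - 1)
        tx, ty = gx - x0, gy - y0
        f00 = self.features[y0][x0]
        f10 = self.features[y0][x0 + 1]
        f01 = self.features[y0 + 1][x0]
        f11 = self.features[y0 + 1][x0 + 1]
        return [
            (1 - tx) * (1 - ty) * a + tx * (1 - ty) * b
            + (1 - tx) * ty * c + tx * ty * d
            for a, b, c, d in zip(f00, f10, f01, f11)
        ]


class MultiResGridEncoder:
    """Concatenates interpolated features from every resolution level."""

    def __init__(self, num_levels=4, feature_dim=2, base_res=16, max_res=128, seed=0):
        rng = random.Random(seed)
        self.levels = [
            DenseGridLevel(res, feature_dim, rng)
            for res in level_resolutions(num_levels, base_res, max_res)
        ]

    def encode(self, x, y):
        encoding = []
        for level in self.levels:
            encoding.extend(level.lookup(x, y))
        return encoding


encoder = MultiResGridEncoder()
code = encoder.encode(0.5, 0.25)
print(len(code))  # num_levels * feature_dim values
```

In a NeRF backbone, an encoding like this would be fed to a small MLP in place of a frequency-based positional encoding; coarse levels capture smooth structure while fine levels capture detail, and the stored features are trained by gradient descent along with the MLP.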