NeRF Editing and Inpainting Techniques: Text-Guided Visual Content Generation

This paper proposes Inpaint4DNeRF, which capitalizes on state-of-the-art stable diffusion models to directly generate the completed background content underlying the masked region.


This content originally appeared on HackerNoon and was authored by Writings, Papers and Blogs on Text Models

:::info Authors:

(1) Han Jiang, HKUST, equal contribution (hjiangav@connect.ust.hk);

(2) Haosen Sun, HKUST, equal contribution (hsunas@connect.ust.hk);

(3) Ruoxuan Li, HKUST, equal contribution (rliba@connect.ust.hk);

(4) Chi-Keung Tang, HKUST (cktang@cs.ust.hk);

(5) Yu-Wing Tai, Dartmouth College (yu-wing.tai@dartmouth.edu).

:::

Abstract and 1. Introduction

2. Related Work

2.1. NeRF Editing and 2.2. Inpainting Techniques

2.3. Text-Guided Visual Content Generation

3. Method

3.1. Training View Pre-processing

3.2. Progressive Training

3.3. 4D Extension

4. Experiments and 4.1. Experimental Setups

4.2. Ablation and comparison

5. Conclusion and 6. References

2.3. Text-Guided Visual Content Generation

The advent of generative models has led to extensive research on guiding generation with natural language. For example, the latent diffusion model, as exemplified by [23], has made significant strides in text-guided image generation. Building on these improvements, a variety of image modification techniques have emerged [7, 9, 10, 33].
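As a concrete instance of such text-guided image editing, the sketch below shows latent-diffusion inpainting with the Hugging Face diffusers library. This is a minimal illustration, not part of the paper; the checkpoint id and file names are assumptions for demonstration purposes.

```python
# Minimal sketch: text-guided inpainting with a latent diffusion model.
# The checkpoint id and image/mask files below are illustrative assumptions.
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting",   # assumed public checkpoint
    torch_dtype=torch.float16,
).to("cuda")

image = Image.open("scene.png").convert("RGB")  # view to edit (hypothetical file)
mask = Image.open("mask.png").convert("RGB")    # white pixels = region to regenerate

result = pipe(
    prompt="a wooden bench in a park",          # natural-language guidance
    image=image,
    mask_image=mask,
).images[0]
result.save("inpainted.png")
```

A single call like this yields one plausible 2D completion; the challenge addressed in the text-to-3D literature below is making many such per-view completions agree across viewpoints.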

Building on these text-to-image achievements, text-to-3D generation has been introduced [2, 12, 21, 32]. These approaches aim to bridge the gap in 3D content generation, leveraging the Score Distillation Sampling (SDS) technique and its variants for multiview convergence. Attempts have also been made to generate 4D dynamic content from text [26], for example by adding a temporal consistency regularizer that extends DreamFusion [21] to dynamic NeRFs. Although SDS sampling is computationally heavy, these methods achieve impressive 3D consistency. Our proposed method, in contrast, conditions on a seed view to control the generation of the other views and thereby force multiview convergence. This restricts the ill-posed text-guided generation problem to a well-posed one with strong priors, making it easier to tackle. 3D generation conditioned on a single generated view has been presented in recent works such as Zero123 [13] and SyncDreamer [16]: given an image of an object and multiple camera poses, they can infer plausible observations of the object from other views.
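For context on the SDS technique referenced above, here is a simplified PyTorch sketch of one DreamFusion-style SDS update. It is a sketch under stated assumptions, not any specific implementation: `render_nerf` and `unet` are hypothetical stand-ins for a differentiable NeRF renderer and a frozen text-conditioned diffusion U-Net, and the timestep range and weighting are common choices rather than prescribed values.

```python
# Sketch of one Score Distillation Sampling (SDS) update.
# `render_nerf` and `unet` are hypothetical stand-ins; see lead-in above.
import torch

def sds_loss(nerf_params, render_nerf, unet, text_emb, alphas_cumprod):
    x = render_nerf(nerf_params)                    # rendered view, (B, C, H, W)
    t = torch.randint(20, 980, (x.shape[0],), device=x.device)
    eps = torch.randn_like(x)
    a_t = alphas_cumprod[t].view(-1, 1, 1, 1)
    x_t = a_t.sqrt() * x + (1 - a_t).sqrt() * eps   # forward diffusion q(x_t | x)
    with torch.no_grad():                           # diffusion model stays frozen
        eps_pred = unet(x_t, t, text_emb)
    w = 1 - a_t                                     # a common weighting choice
    grad = w * (eps_pred - eps)                     # SDS gradient on the image
    # Reparameterize as a loss whose gradient w.r.t. x equals `grad`,
    # so backprop only flows through the NeRF renderer, not the U-Net.
    return (grad.detach() * x).sum()
```

Because the gradient skips backpropagation through the U-Net, each update is relatively cheap, but many noisy updates per scene are required, which is the computational cost noted above.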

However, existing implementations are all limited to a single object without conditioning on the background, and they struggle to manipulate objects within large scenes. In other words, they address the pure generation task, which differs from our inpainting task. Our method distinguishes itself by enabling the removal, addition, and manipulation of specific objects within a given background NeRF, while maintaining consistency with the unmasked background and partially masked foreground objects. In addition, the 3D inpainted results can be extended to 4D while maintaining temporal consistency.
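To make the seed-view-conditioned inpainting setting concrete, the outline below shows one highly simplified fitting loop: unmasked pixels are supervised by the original views everywhere, while the masked pixels of a single 2D-inpainted seed view anchor the new content. This is an illustration of the idea only, not the paper's actual pipeline (which uses training-view pre-processing and progressive training, Sections 3.1 and 3.2); `nerf.render`, `nerf.optimizer_step`, and `seed_rgb` are hypothetical.

```python
# Illustrative outline of seed-view-conditioned NeRF inpainting.
# All methods on `nerf` are hypothetical stand-ins; see lead-in above.
def inpaint_nerf(nerf, views, masks, seed_rgb, seed_idx=0, num_iters=2000):
    """Fit `nerf` so unmasked pixels match the originals while the seed
    view's inpainted pixels anchor new content inside the mask."""
    for _ in range(num_iters):
        for i, (img, mask) in enumerate(zip(views, masks)):
            render = nerf.render(view_index=i)
            # Background term: unmasked pixels stay faithful in every view.
            loss = (((render - img) * (1 - mask)) ** 2).mean()
            if i == seed_idx:
                # Seed term: masked pixels follow the 2D-inpainted seed view;
                # the shared 3D representation propagates them to other views.
                loss = loss + (((render - seed_rgb) * mask) ** 2).mean()
            nerf.optimizer_step(loss)
    return nerf
```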


:::info This paper is available on arXiv under CC 4.0 license.

:::
