3D Building Generation in Minecraft via Large Language Models
Authors: Shiying Hu, Zengrong Huang, Chengpeng Hu, Jialin Liu
Year: 2024
Source: https://arxiv.org/abs/2406.08751
TLDR:
This paper presents an approach to generating 3D buildings in the game Minecraft using large language models (LLMs). The authors introduce the Text to Building in Minecraft (T2BM) model, which refines user prompts, decodes a JSON-based interlayer representation, and repairs discrepancies to create complete, functional structures within the game. The model supports the generation of facades, indoor scenes, and functional blocks such as doors and beds. Experiments with GPT-3.5 and GPT-4 demonstrate that LLMs can produce buildings that align with human instructions, with GPT-4 performing better. The study highlights the importance of prompt refinement and suggests that, as LLMs evolve, they could be applied to more complex generation tasks in game environments.
Abstract
Recently, procedural content generation has exhibited considerable advancements in the domain of 2D game level generation such as Super Mario Bros. and Sokoban through large language models (LLMs). To further validate the capabilities of LLMs, this paper explores how LLMs contribute to the generation of 3D buildings in a sandbox game, Minecraft. We propose a Text to Building in Minecraft (T2BM) model, which involves refining prompts, decoding interlayer representation and repairing. Facade, indoor scene and functional blocks like doors are supported in the generation. Experiments are conducted to evaluate the completeness and satisfaction of buildings generated via LLMs. It shows that LLMs hold significant potential for 3D building generation. Given appropriate prompts, LLMs can generate correct buildings in Minecraft with complete structures and incorporate specific building blocks such as windows and beds, meeting the specified requirements of human users.
Method
The authors built the Text to Building in Minecraft (T2BM) model around three core modules: input refining, interlayer representation, and repairing. The input-refining module improves the quality of the user's description before generation. The interlayer is a JSON-based intermediate representation that bridges the textual description and the building's digital representation in Minecraft. The repairing module corrects errors in the interlayer representation before the building is placed in the game. The authors evaluated the model with GPT-3.5 and GPT-4, measuring whether the generated buildings were complete and satisfied the user prompts.
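The decoding step of such a pipeline can be sketched as follows. The paper does not spell out the exact JSON schema of its interlayer, so the field names below (`building`, `block`, `from`, `to`) are illustrative assumptions, not the authors' actual format: each entry names a Minecraft block type and a coordinate box to fill.

```python
import json

# Hypothetical interlayer output from the LLM (schema is an assumption):
# a floor of oak planks plus a two-block-tall door opening.
interlayer_json = """
{
  "building": [
    {"block": "oak_planks", "from": [0, 0, 0], "to": [4, 0, 4]},
    {"block": "oak_door",   "from": [2, 1, 0], "to": [2, 2, 0]}
  ]
}
"""

def decode_interlayer(text):
    """Decode the JSON interlayer into a dict mapping (x, y, z) -> block id."""
    data = json.loads(text)
    world = {}
    for entry in data["building"]:
        (x0, y0, z0) = entry["from"]
        (x1, y1, z1) = entry["to"]
        # Fill the inclusive box between the two corners with this block.
        for x in range(x0, x1 + 1):
            for y in range(y0, y1 + 1):
                for z in range(z0, z1 + 1):
                    world[(x, y, z)] = entry["block"]
    return world

world = decode_interlayer(interlayer_json)
```

In a real pipeline, the repairing module would run before this step, catching malformed JSON or out-of-range coordinates; the resulting block map could then be written into the game world with a Minecraft interface library.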
Main Finding
The authors found that large language models (LLMs) have significant potential for generating 3D buildings in Minecraft. Given appropriate prompts, LLMs could create complete building structures that incorporate specific functional blocks such as doors and beds, meeting the requirements set by human users. The experiments showed that refining the prompts improved both the completeness of the generated buildings and how well they satisfied the prompt. GPT-4 outperformed GPT-3.5 in generating buildings that met both the completeness and satisfaction constraints. The study also highlighted the importance of prompt quality and the need to repair errors in the interlayer representation to ensure accurate building generation.
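The paper's keywords mention a flood-fill algorithm, which is a natural tool for the completeness check described above. The sketch below is one plausible version of such a check, not necessarily the authors' exact procedure: flood-fill the empty space from an interior cell, and flag the building as incomplete if the fill escapes the bounding box (i.e., the walls have a hole).

```python
from collections import deque

def is_enclosed(solid, interior_start, bounds):
    """Return True if flood-filling air from `interior_start` stays inside
    the bounding box. `solid` is a set of (x, y, z) cells occupied by
    blocks; `bounds` is the inclusive max corner (xmax, ymax, zmax)."""
    xmax, ymax, zmax = bounds
    seen = {interior_start}
    queue = deque([interior_start])
    while queue:
        x, y, z = queue.popleft()
        for dx, dy, dz in ((1,0,0), (-1,0,0), (0,1,0),
                           (0,-1,0), (0,0,1), (0,0,-1)):
            nx, ny, nz = x + dx, y + dy, z + dz
            if not (0 <= nx <= xmax and 0 <= ny <= ymax and 0 <= nz <= zmax):
                return False  # fill leaked outside: the shell has a hole
            cell = (nx, ny, nz)
            if cell not in solid and cell not in seen:
                seen.add(cell)
                queue.append(cell)
    return True  # fill was fully contained by solid blocks
```

For example, a hollow 3x3x3 cube is enclosed, but removing any single wall block lets the fill escape and the check reports the structure as incomplete.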
Conclusion
The paper concludes that applying large language models (LLMs) to 3D building generation in Minecraft is a promising approach. The proposed Text to Building in Minecraft (T2BM) model, which refines user prompts, encodes buildings through an interlayer, and corrects errors via repairing methods, was validated through experiments showing that it generates buildings that are both complete and satisfy user-specified requirements. Prompt refinement proved crucial to the quality of the generated buildings, and GPT-4 outperformed GPT-3.5 in this regard. The authors suggest that future work could integrate repairing into the prompt guidelines and extend T2BM to other game environments.
Keywords
Procedural content generation, LLMs, building generation, 3D generation, Minecraft, Text to Building in Minecraft (T2BM), interlayer representation, repairing module, GPT-3.5, GPT-4, completeness, satisfaction, functional blocks, doors, beds, prompt refinement, JSON, generative design, Minecraft version, error handling, flood-fill algorithm, Minecraft Wiki, game environments.