Digital Sets
How AI, Procedural Tech, and KitBashing will reduce the cost, time, and resources of digital set making by a factor of 100 and lead to a new genre of filmmaking.
Sets play a pivotal role in storytelling, acting as the canvas upon which narratives unfold. Think of the haunting Overlook Hotel in "The Shining" or the vibrant streets of Montmartre in "Amélie". These environments don't just serve as backdrops; they influence the plot, emphasize the theme, and often become unseen characters, pivotal to the story's ambience.
Yet, while set making has advanced from on-location shoots, to meticulously crafted sets in Hollywood's Studio City, to the magic of green screens, and even to the latest virtual productions, a pressing issue arises: skyrocketing costs. For instance, the production budget of films like "Avengers: Endgame" soared to an estimated $356 million, and epic television shows like "Game of Thrones" had episodes costing upwards of $15 million each. The extensive personnel and time necessary for such productions further amplify these expenses, highlighting the financial strain of contemporary set creation.
In this edition of The Brief, I delve into an intriguing parallel between the history of car manufacturing and the progression of set making. By aligning these two separate evolutionary paths, distinct patterns begin to surface, providing us with valuable evolutionary guideposts. It's apparent we've reached a crossroads: our creative abilities have expanded to let us craft anything imaginable, but the bespoke nature of current methods comes with a hefty price tag.
Over the next two decades, I anticipate significant strides in the efficiencies of digital set creation. These evolutionary markers furnish a lens through which we can better understand our standing in the fast-paced realm of instant environment and digital set innovation.
A THEME TO REMEMBER. Every technological epoch undergoes a dual-phase progression: the pioneering phase, where innovators focus on making the technology operational (i.e., it does the thing we want it to do, but usually at a tremendous cost), followed by an optimization phase that emphasizes efficiency and scale.
But first, a bit of history to keep in mind while we contemplate digital set creation.
In the early days of automobile production, cars were primarily bespoke creations, handcrafted to meet individual specifications. Notable examples include luxury brands like Rolls-Royce and Bugatti, where each vehicle was tailored to its owner's preferences. As demand grew, the approach shifted to more standardized production methods. Shops began assembling cars using kits, where the frame, engine, and exterior components such as doors, hood, and trunk came as separate pieces, streamlining the process. Henry Ford, recognizing the potential for further efficiency, revolutionized this method by introducing the assembly line. This not only amplified production rates but also significantly reduced costs. This transformation in car manufacturing distinctly mirrors the dual-phase progression: initially, the emphasis was on achieving functionality with bespoke cars, which then transitioned to the pursuit of efficiency through assembly line production.
In the realm of digital sets, I propose a similar two-phase progression. The inaugural phase is centered around pioneering technology capable of crafting authentic, convincing scenes, with each component designed bespoke. We need to make grass believable as grass, brick as brick, clouds as clouds. Only when we attain the ability to render any conceivable scene do we transition to the second phase: optimization, or doing the same thing with less cost, time, and resources. I contend that we are on the cusp of this second phase. Beginning last year and accelerating into this year, there has been a noticeable shift in product innovations. A growing proportion of new tools and solutions are no longer about crafting previously unachievable visuals; they are, instead, about refining and optimizing the creation of visuals we've already mastered.
In relation to digital set creation, this evolution towards optimization is centered around four key areas enhanced by integrated AI:
Creating Assets - creating the World Library of every object on earth
Organizing Assets - finding, cataloguing, and mapping the World Library
Composing Assets - creating digital sets using the World Library
Interacting with Assets - enabling objects in scenes to interact naturally
Creating Assets
We find ourselves in a period where it is conceivable to catalog every conceivable object on Earth as a 3D asset - a World Library. Initially, such an aspiration seemed implausible. The early days of 3D modeling presented challenges: constructing a believable coffee cup digitally could take hours or even days. Technologies like object scanning revolutionized this process, yet, scanning alone may not suffice to archive every single object. We're still in the nascent stages of curating this universal 3D catalog.
The key to unlocking this expansive World Library could lie in the domain of artificial intelligence, particularly in four areas: 1) text to 3D assets, 2) 2D photos to 3D assets, 3) 2D photos to 3D composited scenes, and 4) text to 3D scene creation.
Text to 3D assets. In the evolving landscape of digital technology, it's conceivable that we might one day develop a system where natural language processing meets cutting-edge computer graphics. Imagine a world where, instead of laboriously designing from scratch, one could simply instruct, "craft me a dragon reminiscent of the one in Game of Thrones." This software would then delve into myriad visual references, shaping a detailed 3D representation based on the directive. While still a concept, this idea highlights the aspirational crossroads of linguistic command and digital visualization, suggesting a future where our spoken imaginations could directly mold digital realities.
2D photo to 3D assets. Imagine an AI that can transform 2D photographs into precise 3D objects instantly. By analyzing shadows, texture, and light, this AI could extrapolate the object's full 3D structure. Such a tool could revolutionize the creation of a world library of 3D assets. Instead of manual modeling, users could supply photos to swiftly populate this library, democratizing 3D modeling and expanding the digital world's richness and variety.
2D photos to 3D scenes. "Dimensional Composite Imaging" (DCI): When discussing design, "destructive" techniques alter the original data permanently, making it challenging to revert or adjust later. In contrast, "non-destructive" methods ensure all modifications are layered atop the original, preserving its integrity. Now, envision a pioneering technology termed "Dimensional Composite Imaging" (DCI). Imagine an AI, akin to Midjourney, conceptualizing a digital set in 2D. This 2D rendition serves as the blueprint for a 3D generator. The generator's dual purpose? First, to craft photorealistic representations of each object in the visual, and second, to assemble those objects, constructing a comprehensive 3D environment. This seamless bridge between 2D visualization and 3D realization could well redefine the realms of digital design and set creation.
Text to World Library Compositing. As the World Library matures, consider the possibilities of AI drawing from its vast reserves to craft scenes. There might emerge a technological breakthrough where AI, leveraging the exhaustive World Library of 3D objects, transforms textual cues like "a 1940s living room with a fireplace and gramophone" directly into intricate 3D environments. Such an innovation would take a simple description and, referencing the World Library, render the envisioned 3D scene. This hints at a future where a storyteller's vision and set design converge flawlessly, streamlining the creative process and amplifying accuracy.
So if that’s where we are going, where are we now?
Drawing a parallel between the development of the World Library of assets and the early stages of the internet, we're essentially at its 1993 juncture. Despite the extensive offerings from major 3D asset libraries like TurboSquid, Sketchfab, and Unity's Asset Store, the total number of movie-ready assets may barely surpass 100,000. Given this, the EnviroGen model (generating 3D assets on demand) appears more viable for now than approaches that composite scenes from pre-existing assets. Until we witness a monumental leap in mass-producing 3D assets, for example through DCI, the World Library's potential remains untapped, holding back substantial cost reductions in digital set creation.
That said, these concepts are intriguing startup ideas I'd be keen on funding or even launching myself (ehemmm…).
Organizing Assets
Now imagine that the World Library exists, or is well on the way to existing. The monumental task of organizing assets becomes increasingly complex as we transition from millions to billions, and even potentially trillions of assets. To comprehend the magnitude of this challenge, one can look to the evolution of the Internet. In the 1990s, the web consisted of a few thousand sites, navigable through rudimentary tools like Yahoo! Directories. Fast forward to today, and we're talking about billions of websites, necessitating the rise of sophisticated search engines like Google and Bing. More recently, advancements in AI, exemplified by models like ChatGPT, have further transformed how we access and interact with the vast expanse of information.
In the context of digital set creation, how do we measure our progress in asset organization? Drawing parallels with the early internet, it may be shocking to hear, but we're currently just moving beyond the equivalent of the Yahoo! Directory stage. The days of hand-tagging and manually adding metadata to assets are just now waning.
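To make the jump from hand-tagged directories to search a little more concrete, here is a minimal sketch of finding assets by free-text description rather than by manually assigned tags. The catalog entries, asset IDs, and the function names tokenize and search are all hypothetical, and simple word overlap stands in for whatever ranking a real asset search tool would use.

```python
import re

# Hypothetical mini-catalog: asset id -> free-text description.
CATALOG = {
    "chair_017": "worn leather armchair, 1940s living room",
    "gram_002": "brass gramophone with wooden base, 1940s",
    "tree_113": "tall pine tree for alpine forest scenes",
    "fire_009": "stone fireplace with iron grate, living room",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def search(query, catalog, top_k=3):
    """Rank assets by how many query words appear in their description."""
    query_tokens = set(tokenize(query))
    scored = []
    for asset_id, description in catalog.items():
        overlap = len(query_tokens & set(tokenize(description)))
        if overlap:
            scored.append((overlap, asset_id))
    # Highest overlap first; ties are broken alphabetically by asset id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [asset_id for _, asset_id in scored[:top_k]]


results = search("1940s living room fireplace", CATALOG)
```

Word overlap will not scale to billions of assets, but it shows the shape of the problem: the artist describes what they need and the catalog does the matching.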
Two companies leading the charge in this realm are Polygonflow and Promethean AI.
ONES TO WATCH. Polygonflow has developed "Dash," a game-changing tool marketed as a "Copilot for World Building in Unreal Engine." Dash simplifies the traditionally intricate 3D design process, turning every step into either a prompt or a drag-and-drop action, allowing for more intuitive terrain creation, material application, and object placement.
ONES TO WATCH. Promethean AI, established by Andrew Maximov in 2018, offers an autonomous asset generation tool that enables artists and developers to create game assets just by describing them, streamlining the asset creation process immensely.
Both companies are heralding a new era in digital set creation by leveraging AI to simplify, optimize, and enhance the workflow, paving the way for more efficient and realistic digital worlds.
Composing Assets into Scenes
For years, digital set creation has leaned heavily on the manual efforts of artists, who would individually place each 3D object in a scene to achieve the desired realism. This exacting method demands not only careful positioning of each element, but also fine-tuning their scale, rotation, and orientation to fit seamlessly within the larger tableau. Beyond mere placement, an artist must grapple with shadow accuracy, light refraction, texture cohesion, and the nuanced interaction of all objects. Anecdotally, imagine the effort required in a dense forest scene, where every leaf, twig, and water droplet must be painstakingly placed and adjusted, consuming countless hours and significant monetary resources in the process.
Now, we're witnessing a profound shift in this domain, thanks to the advent of procedural and painting tools. With procedural environment building, vast terrains can be generated automatically based on predefined algorithms—think of the automated generation of a sprawling desert or intricate mountain range. Environment sculpting, on the other hand, lets artists mold scenes with the fluidity of shaping clay. Then there's the innovation of Unreal Engine blueprints, which has opened doors to rule-based environment creation, streamlining asset placement in alignment with specific conditions. The time, cost, and resources saved by these tools are substantial. A scene that might have once taken weeks of meticulous hand-placement could now be completed in days or even hours, leading to drastic reductions in both time and budget.
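As a rough illustration of rule-based placement, the sketch below scatters trees over a toy terrain and keeps only positions that satisfy an elevation rule. It is not how Unreal Engine or any particular plugin works; the height function, the scatter_trees name, and the elevation band are assumptions made for the example.

```python
import math
import random


def height(x, y):
    """Toy terrain: a smooth, deterministic height between 0 and 1."""
    return 0.5 + 0.25 * math.sin(x * 0.3) + 0.25 * math.cos(y * 0.2)


def scatter_trees(size, count, min_h=0.3, max_h=0.7, seed=42):
    """Place up to `count` trees on a size x size map.

    Rule (hypothetical): trees only grow where the terrain height lies
    between min_h and max_h, i.e. not in valleys or on peaks.
    """
    rng = random.Random(seed)  # seeded so the layout is reproducible
    trees = []
    attempts = 0
    while len(trees) < count and attempts < count * 100:
        attempts += 1
        x, y = rng.uniform(0, size), rng.uniform(0, size)
        h = height(x, y)
        if min_h <= h <= max_h:
            trees.append({
                "x": x,
                "y": y,
                "z": h,
                "yaw": rng.uniform(0, 360),      # random rotation in degrees
                "scale": rng.uniform(0.8, 1.2),  # slight size variation
            })
    return trees


forest = scatter_trees(size=100, count=50)
```

The artist's job shifts from placing fifty trees by hand to choosing the rule and the seed; changing either regenerates the whole forest.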
ONES TO WATCH. Errant Worlds. Errant Worlds offers cutting-edge Unreal Engine plugins, reshaping map creation. Seamlessly integrating with kit bash libraries like Quixel, it streamlines the design process, making expansive and intricate terrains more accessible. A VFX artist highlighted the tool's efficiency, noting, "Generating trees on a landscape, which once took 8 hours, now takes just five minutes with your plugin." At an affordable $268, Errant Worlds embodies the SMURF stack ethos, saving time, cost, and resources while revolutionizing digital creation.
ONES TO WATCH. KitBash3D: Since its inception in 2017, KitBash3D has firmly established itself in the digital design and filmmaking landscapes. Its transformative impact can be discerned through myriad success stories; a budding filmmaker, for instance, harnessed the company's assets to craft an award-winning short film in mere weeks as opposed to the customary months. Another independent artist, who once invested extensive time and resources into detailed modeling, found that KitBash3D not only reduced his environment creation time by a staggering 70% but also led to significant cost savings. Creating a detailed 3D asset from scratch can be an expensive endeavor, often costing upwards of thousands of dollars depending on complexity. However, by kit bashing pre-existing assets from KitBash3D, creators can achieve savings of up to 80% or more compared to traditional asset creation methods.
Notable Mentions
Big Medium Small - Similar to KitBash3D, Big Medium Small offers stunning kits and characters available to filmmakers.
World Creator - A short film showing an absolutely cinematic mountain landscape. Another powerful procedural tool that can create absolutely photorealistic landscapes.
Brushify - A procedural painter integrated directly into Unreal Engine that competes with Errant Worlds and World Creator.
Brushify Floating Islands - Create Avatar's world in minutes.
CityBLD - A procedural city builder that enables a single person to build an entire city in less than a day.
Creating Interactive Scenes
A digital scene, no matter how meticulously crafted with 3D objects, remains hollow and lifeless without interactivity. It's the dynamic elements—the rustling of grass in the wind, the glistening sheen on streets after a rain, or the sound of footsteps echoing on a wooden floor—that truly breathe life into a setting. Without these immersive touches, the environment feels sterile, like a pristine diorama untouched by the passage of time or the influence of external forces. True immersion arises when objects respond naturally to interactions, like a vase that topples when brushed against or the shifting textures of a pathway as it dries post-rain. Without this vital layer of dynamism, the setting remains a "dead" tableau, a mere backdrop rather than a living, breathing world.
A scene only truly awakens and resonates with viewers when it mirrors the unpredictable and interactive nature of the real world.
Crafting an interactive digital scene is among the most daunting challenges in the realm of digital arts. Beyond just the internal coherence of a virtual environment, artists grapple with the monumental task of integrating these digital domains with tangible real-world spaces. Take, for instance, actors performing on virtual sets. Ensuring harmony between elements like light, wind, and rain across both digital and physical realms demands intricate attention to detail. Today, this integration is achieved manually, and while it's possible to attain high degrees of accuracy with ample time, resources, and funding, this approach isn't scalable. What the industry is on the cusp of embracing is the power of AI-driven tools that can automate these interactions. Such tools, trained on real-world scenarios, promise to revolutionize set design by bridging the gap between digital artistry and tangible realities seamlessly. Yet, the development of these cutting-edge tools remains in its infancy, and their evolution to full maturity is anticipated to span the next five to ten years.
A final thought…
History often provides a lens through which we can glimpse the trajectory of emerging technologies. Drawing parallels from past technological epochs gives us a vantage point to predict the evolution of digital set creation. Yet, for any significant transformation to materialize in terms of cost, time, and resource efficiency, the innovations in this domain must not operate in silos. They must be assimilated into the SMURF stack, ensuring seamless integration and user-friendly interfaces. It's only when these conditions are met, akin to Henry Ford's revolutionary approach to automobile manufacturing, that we'll witness a profound shift—ushering in the era of the mass production of digital sets, democratizing their creation and accessibility.
If you're interested in diving deeper into this topic, I invite you to read my "GOT SMURF" post for a comprehensive overview.
Projecting into the future of filmmaking, it's conceivable to imagine a digital counterpart of Hollywood's renowned Studio City: an abstraction that would sit one level above the World Library of individual 3D assets, a Digital Studio holding an effectively infinite library of pre-built sets. Instead of sprawling physical lots and constructed sets, this digital Studio City would house an expansive library of pre-built digital sets. Just as physical sets in traditional studios are dressed up uniquely for every film, these digital landscapes can be tailored, modified, and dressed to cater to the specific needs of individual films or television episodes. A generic "medieval town," for instance, could be transformed into a bustling market for one film, a desolate post-apocalyptic ghost town for another, or even a snow-covered festive wonderland for a holiday special. With the power of digital tools and advanced VFX, filmmakers will have the agility to rapidly adapt these foundational sets to fit their narrative's unique requirements. This evolution promises not only unparalleled flexibility and customization but also potentially significant cost savings, reshaping the financial and creative dynamics of the entertainment industry.
A THEME TO REMEMBER. Throughout cinematic history, there's a discernible pattern: transitioning from fully customized solutions to reusable platforms tailored for individual narratives. This evolution was evident when the industry shifted from real-world locations to the versatile sets of Studio City in the physical realm. Similarly, in the digital sphere, we may witness a shift from bespoke digital environments to modular, adaptable digital sets. Yet, as always, it's not the technology but the meticulous craftsmanship and attention to detail that will remain the hallmark of quality in storytelling.
When a process becomes 100 times less cost-intensive, time-intensive, or resource-intensive, three notable shifts tend to occur. First, existing films in the production pipeline see enhanced profitability due to reduced overheads. Second, projects previously shelved due to prohibitive costs become feasible and can be greenlit. Lastly, this drastic reduction often catalyzes unforeseen innovations, leading to entirely new methods or genres of filmmaking.
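The second shift is easy to see with a little arithmetic. The sketch below uses made-up project names, set costs, and a made-up budget cap purely for illustration; only the factor of 100 comes from the argument above.

```python
# Hypothetical digital set costs in dollars; none of these are real figures.
PROJECTS = {
    "indie drama": 200_000,
    "period epic": 5_000_000,
    "space opera": 40_000_000,
}


def greenlit(projects, budget_cap, reduction_factor=1.0):
    """Return the projects whose set cost fits under the budget cap
    after dividing the cost by the reduction factor."""
    return [
        name
        for name, set_cost in projects.items()
        if set_cost / reduction_factor <= budget_cap
    ]


before = greenlit(PROJECTS, budget_cap=500_000)
after = greenlit(PROJECTS, budget_cap=500_000, reduction_factor=100)
```

With these assumed numbers, only the indie drama fits the cap at today's costs, while all three projects fit once set costs fall by a factor of 100.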
Some food for thought.
Something new might happen - an inversion of the workflow process. Instead of penning a script and subsequently crafting a set to encapsulate the narrative, budget-conscious directors might explore libraries of pre-made digital environments, weaving stories tailored to those existing landscapes. Drawing a parallel to orthogonal software, which reimagines existing content pipelines and spawns entirely new genres, these novel, orthogonal workflows in filmmaking have the potential to redefine storytelling, placing new tools and techniques at the forefront of narrative creation.
Go Down the Rabbit Hole
The SMURF Stack - The SMURF stack, an acronym for Screenplays, Modeling, Unreal Engine, Rendering, and Final Cut, is a framework that amalgamates several free, open-source, and user-friendly software tools, offering a transformative approach to digital filmmaking.
Game of Thrones Sets Part I and Part II - Watch this fascinating background showing how VFX artists transformed a small town into Game of Thrones' Westeros.
KitBash3D - Imagine if the atomic elements of nearly every type of imaginable set were pre-crafted, enabling artists to quickly assemble unique, compelling digital sets for movies.
Big Medium Small - Similar to KitBash3D, Big Medium Small offers stunning kits and characters available to filmmakers.
Errant Worlds - These new procedural and parametric environment tools enable artists to build huge worlds of landscapes in minutes on a single computer.
World Creator - A short film showing an absolutely cinematic mountain landscape. Another powerful procedural tool that can create absolutely photorealistic landscapes.
Brushify - A procedural painter integrated directly into Unreal Engine that competes with Errant Worlds and World Creator.
Brushify Floating Islands - Create Avatar's world in minutes.
CityBLD - A procedural city builder that enables a single person to build an entire city in less than a day.
Dash - Removes all of the complexity of learning Unreal Engine, and offers a single assistant prompt to help anyone build AAA environments.
Promethean AI - AI-assisted everything for managing kitbash kits and 3D assets, and for placing those assets inside photorealistic scenes.