In November 2020, I worked on my first fully green-screen CG video. The project, a no-budget venture to expand my portfolio, required a 3D model of the performer to achieve certain shots and ensure proper data collection. With limited resources and a tight deadline for a New Year’s release, I turned to an emerging AI tool that could generate rough 3D models from a single photograph.
While the AI tool offered promise, it was far from perfect. It produced low-detail models that were usable only in specific conditions, such as distant or out-of-focus shots. Because the tool was still in its early stages of development, it also had to be compiled manually and had no user-friendly interface. My task was not just to use the tool but to understand how it worked, so that I could troubleshoot and optimize its outputs.
I began by photographing the performer in a T-pose against a white cyclorama background, aiming for a straightforward setup to aid rigging. However, even after I experimented with various color corrections, the AI consistently failed to produce a usable 3D model.
Recognizing that the AI’s training data likely comprised images of people in natural poses and environments, I hypothesized that the T-pose might be too unnatural for the model to process effectively. To test this theory, I used Photoshop to reposition the performer’s arms into a natural resting pose and replaced her open hands with images of fists, so that the photo aligned more closely with the kind of data the AI had likely been trained on.
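I made this call by eye, but the same reasoning could be sketched as a quick pre-flight check on a reference photo. The snippet below is only an illustration, not the tool's method or anything I ran on the project: it assumes 2D keypoints for the shoulders and wrists are already available (annotated by hand or taken from any pose estimator), and the function names, keypoint labels, and 60-degree threshold are all hypothetical choices.

```python
import math


def arm_angle_from_vertical(shoulder, wrist):
    """Angle in degrees between the shoulder-to-wrist line and straight down.

    Points are (x, y) pixel coordinates with y increasing downward.
    0 means the arm hangs straight down, 90 means it is horizontal
    (as in a T-pose), and 180 means it points straight up.
    """
    dx = wrist[0] - shoulder[0]
    dy = wrist[1] - shoulder[1]
    if dx == 0 and dy == 0:
        raise ValueError("shoulder and wrist coincide")
    return math.degrees(math.atan2(abs(dx), dy))


def looks_like_t_pose(keypoints, threshold_deg=60.0):
    """Return True if both arms are raised more than threshold_deg from vertical.

    The threshold is an assumed value for illustration, not a measured one.
    """
    left = arm_angle_from_vertical(
        keypoints["left_shoulder"], keypoints["left_wrist"]
    )
    right = arm_angle_from_vertical(
        keypoints["right_shoulder"], keypoints["right_wrist"]
    )
    return left > threshold_deg and right > threshold_deg


# Made-up pixel coordinates for two versions of the same photo.
t_pose = {
    "left_shoulder": (400, 300), "left_wrist": (650, 305),
    "right_shoulder": (300, 300), "right_wrist": (50, 305),
}
resting_pose = {
    "left_shoulder": (400, 300), "left_wrist": (420, 560),
    "right_shoulder": (300, 300), "right_wrist": (280, 560),
}

for name, pose in (("T-pose", t_pose), ("resting", resting_pose)):
    flag = "repose before use" if looks_like_t_pose(pose) else "likely fine"
    print(f"{name}: {flag}")
```

A check like this only flags one kind of mismatch with the presumed training data. It says nothing about hands, clothing, or background, which still need a human eye.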
This adjustment worked. The AI successfully generated a 3D model that met the project’s needs. This outcome demonstrated the importance of not just applying AI tools but also understanding their underlying mechanics. By reverse-engineering the model’s limitations and adapting my input accordingly, I was able to overcome challenges and maximize the tool’s potential.
This experience reinforced the value of critical thinking and problem-solving when working with AI in VFX. It’s not enough to take these tools at face value; understanding how they function can unlock their full capabilities, even in resource-limited scenarios.