verkaDerkaDerk/face-mesh-workflow · Apply for community grant: Personal project (gpu)

The FaceMeshWorkflow project lets users upload an image and use Google's MediaPipe to detect the face and its landmarks. It then adds depth information from ZoeDepth and MiDaS (DPT), and the sources are combined to create a final mesh.

The main goal of the project is to give users control over the mapping process. Each depth estimation source has its own characteristics: MediaPipe can sometimes produce generic results, ZoeDepth may occasionally exaggerate features, and MiDaS sometimes struggles.

To aid users in understanding the depth information, grayscale images are used for visualization. Simple sliders allow users to adjust the mapping and view the results in dedicated 3D viewers for each source.
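As a rough illustration of the grayscale visualization, the sketch below maps raw depth values to 8-bit gray levels. It is not the project's code: the function name `depth_to_gray` and the `low`/`high` slider parameters are assumptions made for this example, and plain Python lists are used only to keep it dependency-free.

```python
def depth_to_gray(depth, low=0.0, high=1.0):
    """Map raw depth values to 8-bit grayscale levels (0-255).

    `low` and `high` are hypothetical slider positions in [0, 1] that
    select which part of the normalised depth range is stretched over
    the full black-to-white scale.
    """
    d_min, d_max = min(depth), max(depth)
    span = d_max - d_min
    if span == 0 or high <= low:
        # Flat depth map or empty slider window: nothing to show.
        return [0] * len(depth)

    gray = []
    for value in depth:
        t = (value - d_min) / span        # normalise to [0, 1]
        t = (t - low) / (high - low)      # stretch the slider window
        t = min(1.0, max(0.0, t))         # clip values outside the window
        gray.append(round(t * 255))
    return gray


# Example: a tiny "depth map" flattened to a list.
full_range = depth_to_gray([0.0, 10.0])
upper_half = depth_to_gray([0.0, 5.0, 10.0], low=0.5, high=1.0)
```

Because the function only rescales values that already exist, moving a slider would mean calling it again on the stored depth values rather than rerunning a model.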

Once users are satisfied with the mappings from the different sources, they can control how much each mapping contributes to the final mesh. Importantly, all mapping adjustments happen after inference, so trying a different configuration does not rerun the models and adds minimal overhead.
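One way to picture the contribution control is a normalised weighted average of the per-source depth values. This is a minimal sketch under that assumption, not necessarily how the project combines its sources; the `blend` function and the source names used as dictionary keys are hypothetical.

```python
def blend(sources, weights):
    """Combine per-source depth values with a normalised weighted average.

    `sources` maps a source name to a flat list of depth values, and
    `weights` maps the same names to non-negative slider values.
    Sources without a weight contribute nothing.
    """
    lengths = {len(depth) for depth in sources.values()}
    if len(lengths) != 1:
        raise ValueError("all depth maps must have the same length")

    total = sum(weights.get(name, 0.0) for name in sources)
    if total <= 0:
        raise ValueError("at least one weight must be positive")

    blended = [0.0] * lengths.pop()
    for name, depth in sources.items():
        w = weights.get(name, 0.0) / total   # normalise so weights sum to 1
        for i, value in enumerate(depth):
            blended[i] += w * value
    return blended


# Example: equal parts MediaPipe and ZoeDepth, MiDaS switched off.
depths = {
    "mediapipe": [0.0, 1.0],
    "zoe": [2.0, 3.0],
    "midas": [4.0, 5.0],
}
result = blend(depths, {"mediapipe": 1.0, "zoe": 1.0, "midas": 0.0})
```

The blend works purely on cached depth values, which is why changing a weight would cost only a pass over the data and no extra inference.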

The generated 3D meshes retain their texture mapping information and can be downloaded directly. A provided script handles the final tweaks, so the mesh can be imported into software like Blender as a "clean quad mesh."

The FaceMeshWorkflow has diverse applications, but CPU execution is much slower: around 60 seconds, compared to around 2 seconds on a GPU, roughly a 30x difference.

Considering the need for optimal speed and responsiveness, I kindly request the provision of a small GPU instance to ensure a smooth and efficient workflow.