Deepfake Webcam in Real-Time with AI - 2021

Background

So basically, Zoom classes.

Hundreds of days into quarantine, and I am so tired. As expected of every high school student locked in their room all day.

Anyways, I passed the time with memes, and for months I was obsessed with the ones deepfaking famous internet people into singing the song Baka Mitai from the game Yakuza 0. This meme was insanely popular; you can read more about it on Know Your Meme.

I desperately wanted to make this with my friends, so I eventually stumbled onto a guide on how to make it, using the First Order Motion Model AI to do deepfakes.

My Reign of Terror

Yeah, so my friends weren't happy when I sent them videos of their deepfakes singing "dame da ne".

However, I realized that while I was generating the deepfakes, I was getting 20-25 FPS. I then thought, man, that's not bad, my GTX 1060 6GB is tearing through this pretty well. The quality is low, but it's as good as Zoom calls. Could I make this run in real time with webcam input and create a virtual webcam output?

Planning this out

I needed to do the following:

  • Somehow get webcam input. This shouldn't be too hard, I hear FFmpeg can help me
  • Transform this into a numpy tensor or whatever the AI reads as input
  • Transform the tensor output to something FFmpeg-friendly, like a u8 stream
  • Somehow turn that into a virtual webcam output???

numpy's Terrible Tensors

Okay, so I was able to use an FFmpeg command to stream webcam input and turn it into a u8 stream (raw unsigned 8-bit bytes), which I could parse with whatever Python magic. Thankfully, there's a Python library for FFmpeg and streams, so I was able to literally pipe it into big arrays.
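
For the curious, here's a rough sketch of what that kind of piping can look like. This is not the code from my repo: it uses plain subprocess instead of the FFmpeg library, and the 256x256 frame size and the helper names (webcam_command, open_webcam, read_frames) are made up for illustration. The idea is just that rgb24 output is 3 bytes per pixel, so one frame is always width * height * 3 bytes.

```python
import io
import subprocess

WIDTH, HEIGHT = 256, 256          # assumed frame size, not necessarily what the model uses
FRAME_BYTES = WIDTH * HEIGHT * 3  # rgb24 = 3 unsigned bytes (u8) per pixel


def webcam_command(device="/dev/video3", width=WIDTH, height=HEIGHT):
    """Build an FFmpeg command that dumps raw RGB frames from a V4L2 webcam to stdout."""
    return [
        "ffmpeg",
        "-f", "v4l2",                      # Linux webcam input
        "-i", device,
        "-vf", f"scale={width}:{height}",  # resize to the assumed frame size
        "-f", "rawvideo",                  # no container, just pixel bytes
        "-pix_fmt", "rgb24",
        "pipe:1",                          # write to stdout
    ]


def open_webcam(device="/dev/video3"):
    """Start FFmpeg and return the process; frames arrive on proc.stdout."""
    return subprocess.Popen(webcam_command(device), stdout=subprocess.PIPE)


def read_frames(stream, frame_bytes=FRAME_BYTES):
    """Yield one bytes object per complete frame from a binary stream."""
    while True:
        chunk = stream.read(frame_bytes)
        if len(chunk) < frame_bytes:  # end of stream or a truncated frame
            return
        yield chunk
```

Usage would be something like looping over read_frames(open_webcam().stdout) and handing each chunk to the next step.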

But there's one problem.

The AI wants a numpy tensor, and I have zero clue how to do that.

After much pain and struggling...

I figured it out! I don't have photos of this, unfortunately, but I was outputting to some window for testing. I am now able to pass webcam data into the AI and get a result, hooray.
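
Roughly, the conversion goes something like the sketch below. I'm assuming here that the model wants a float32 batch shaped (1, 3, height, width) with values between 0 and 1, and that frames are 256x256; treat both as assumptions, and the function names are made up for this example rather than taken from the repo.

```python
import numpy as np

HEIGHT, WIDTH = 256, 256  # assumed frame size


def bytes_to_model_input(raw, height=HEIGHT, width=WIDTH):
    """Turn one raw rgb24 frame into a float32 array shaped (1, 3, H, W) in [0, 1]."""
    # FFmpeg hands us rows of pixels, each pixel being R, G, B: height x width x channel
    frame = np.frombuffer(raw, dtype=np.uint8).reshape(height, width, 3)
    scaled = frame.astype(np.float32) / 255.0
    # Move channels first and add a batch dimension (assumed model layout)
    return scaled.transpose(2, 0, 1)[np.newaxis]


def model_output_to_bytes(output):
    """Turn a (1, 3, H, W) float array in [0, 1] back into raw rgb24 bytes."""
    frame = output[0].transpose(1, 2, 0)  # back to height x width x channel
    frame = np.clip(frame * 255.0, 0, 255).round().astype(np.uint8)
    return frame.tobytes()
```

The second function is the reverse trip, which is what gets the AI's output back into something FFmpeg-friendly.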

Webcam Output

Alright, I successfully deepfaked myself into my teacher, what next? Oh yeah, putting it in Zoom.

So, as it turns out, I was already using Droidcam to turn an old phone into a webcam, and Droidcam uses the v4l2loopback kernel module to emulate a webcam. How wonderful!

After finagling with it and realizing that /dev/video3 was the input for my script and /dev/video5 was the output to Zoom, I just had to set the permissions and I was done. And man, this was super cool: showing up to Zoom class with my teacher's face. They knew me as the one with the funny Zoom photos, so they didn't mind; it was a little something to pass the time.
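
Here's a sketch of how the output side can be wired up, assuming FFmpeg is doing the writing to the loopback device. The /dev/video5 path is from my setup; the frame size, the 25 FPS figure, and the helper names are assumptions for illustration. The conversion to yuv420p is there because video apps tend to be picky about pixel formats, not because I've confirmed it's what Zoom requires.

```python
import os
import subprocess


def virtual_cam_command(device="/dev/video5", width=256, height=256, fps=25):
    """Build an FFmpeg command that reads raw RGB frames on stdin and feeds a V4L2 device."""
    return [
        "ffmpeg",
        "-f", "rawvideo",           # stdin is bare pixel bytes
        "-pix_fmt", "rgb24",
        "-s", f"{width}x{height}",  # raw video has no header, so the size must be given
        "-r", str(fps),
        "-i", "pipe:0",
        "-pix_fmt", "yuv420p",      # assumed to be a format the video app accepts
        "-f", "v4l2",
        device,
    ]


def can_write(device):
    """Check the permission step: can this user write to the loopback device?"""
    return os.access(device, os.W_OK)


def open_virtual_cam(device="/dev/video5"):
    """Start FFmpeg; write one frame's worth of bytes at a time to proc.stdin."""
    if not can_write(device):
        raise PermissionError(f"cannot write to {device}, check its permissions")
    return subprocess.Popen(virtual_cam_command(device), stdin=subprocess.PIPE)
```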

Source Code

https://github.com/Noorquacker/deepfake-webcam/
