According to the ByteDance researchers, OmniHuman-1 needs only a single reference image and an audio clip, such as speech or vocals, to ...