PixelSmash is a vulnerability in the FFmpeg framework that can be exploited via crafted media files for remote code execution ...
Abstract: Dense video captioning aims to localize and caption all events in arbitrary untrimmed videos. Although previous methods have achieved appealing results, they still face the issue of temporal ...
We propose MultiTalk, a novel framework for audio-driven multi-person conversational video generation. Given a multi-stream audio input, a reference image and a prompt, MultiTalk generates a video ...
Customized video generation aims to produce videos featuring specific subjects under flexible user-defined conditions, yet existing methods often struggle with identity consistency and limited input ...