Abstract: Unlike traditional single-modal visual tracking, vision-language tracking leverages textual descriptions to provide semantic understanding, aiding in handling challenges such as appearance ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results