A Large-Scale Multimodal Plot-Oriented Story Understanding Dataset
What does the data look like?
For features based on the new visual encoder, please contact the author via email or a GitHub issue.
What can PTVD do?
Assigning multiple genre tags to a complete plot clip based on parallel multimodal data.
Matching cross-modal data (e.g., text-image, text-video) based on a complete plot.
Generating a high-quality textual plot description for a given plot video clip.
Many other tasks can be built on PTVD, e.g., plot reordering and plot distinguishing.
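As an illustration of the cross-modal matching task, here is a minimal sketch of how text-to-video retrieval could be scored with Recall@K over paired embeddings. It is not the evaluation code of PTVD: the function names `cosine` and `recall_at_k`, the toy vectors, and the assumption that text `i` is paired with video `i` are all hypothetical and used only for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recall_at_k(text_embs, video_embs, k=1):
    """Fraction of text queries whose paired video (same index) is in the top k.

    Assumption for this sketch: text_embs[i] and video_embs[i] describe
    the same plot, so index i is the ground-truth match for query i.
    """
    if not text_embs:
        return 0.0
    hits = 0
    for i, text in enumerate(text_embs):
        scores = [cosine(text, video) for video in video_embs]
        # Rank video indices from most to least similar to the query.
        ranked = sorted(range(len(video_embs)), key=lambda j: scores[j], reverse=True)
        if i in ranked[:k]:
            hits += 1
    return hits / len(text_embs)


# Toy 3-dimensional embeddings (made up for illustration only).
text_embs = [[1.0, 0.0, 0.1], [0.0, 1.0, 0.0], [0.6, 0.8, 0.0]]
video_embs = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.1], [0.0, 0.1, 1.0]]

print(recall_at_k(text_embs, video_embs, k=1))
print(recall_at_k(text_embs, video_embs, k=3))
```

In practice the embeddings would come from text and visual encoders applied to the plot descriptions and clips; the ranking and scoring logic stays the same.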
How to use PTVD?
Reference
@article{li2023ptvd,
  title={PTVD: A Large-Scale Plot-Oriented Multimodal Dataset Based on Television Dramas},
  author={Chen Li and Xutan Peng and Teng Wang and Yixiao Ge and Mengyang Liu and Xuyuan Xu and Yexin Wang and Ying Shan},
  eprint={2306.14644},
  year={2023}
}
How to get in touch?