Questions about data preprocessing #52

@asm3242

Looking at run_detect_segment, it appears to require an annotation file for each video, where each entry consists of a start time, an end time, and a text prompt. (The text prompt is not used in the code.)
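
To make the question concrete, here is a minimal sketch of the layout I am assuming. The comma-separated format, the use of seconds, and the names `Segment` and `parse_annotations` are my own guesses for illustration and are not taken from the repository.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """One annotated clip (hypothetical structure, not the repo's)."""
    start: float  # start time, assumed to be in seconds
    end: float    # end time, assumed to be in seconds
    prompt: str   # text prompt attached to the clip


def parse_annotations(text: str) -> List[Segment]:
    """Parse lines of the assumed form 'start,end,prompt'."""
    segments = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first two commas only, so the prompt may contain commas.
        start, end, prompt = line.split(",", 2)
        segments.append(Segment(float(start), float(end), prompt.strip()))
    return segments


# Made-up example entries, only to show the assumed shape of the file.
example = "0.0,4.5,a person opens a door\n4.5,9.0,a person sits down"
segments = parse_annotations(example)
```

If the actual format differs (for example JSON, or frame indices instead of seconds), a short example in the README would help a lot.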

Are these annotations created manually, or can they be generated automatically?

Also, when extracting features with the CLIP encoder in run_clip_filtering, what text input is required?

Finally, when will the pre-training dataset be released?

Thank you
