QVHighlights Evaluation and Codalab Submission
==================

### Task Definition
Given a video and a natural language query, the task requires a system to retrieve the most relevant moments in the video and to detect the highlightness (saliency) of each clip in the video.

### Evaluation
From the project root, run:
```
bash standalone_eval/eval_sample.sh 
```
This command uses [eval.py](eval.py) to evaluate the provided prediction file [sample_val_preds.jsonl](sample_val_preds.jsonl)
and writes the output to `sample_val_preds_metrics.json`.
The content of the generated file should be similar, if not identical, to [sample_val_preds_metrics_raw.json](sample_val_preds_metrics_raw.json).
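
To check how close the generated metrics are to the reference file, a small comparison helper can be useful. The sketch below is not part of this repository: the function names `diff_metrics` and `compare_metric_files` and the tolerance values are assumptions, and it makes no assumption about the layout of the metrics file beyond it being valid JSON.

```python
import json
import math


def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def diff_metrics(expected, actual, rel_tol=1e-6, path=""):
    """Recursively compare two parsed JSON values; return the paths that differ."""
    diffs = []
    if isinstance(expected, dict) and isinstance(actual, dict):
        for key in sorted(set(expected) | set(actual)):
            child = f"{path}/{key}"
            if key not in expected or key not in actual:
                diffs.append(child)  # key present in only one file
            else:
                diffs.extend(diff_metrics(expected[key], actual[key], rel_tol, child))
    elif isinstance(expected, list) and isinstance(actual, list):
        if len(expected) != len(actual):
            diffs.append(path)
        else:
            for i, (e, a) in enumerate(zip(expected, actual)):
                diffs.extend(diff_metrics(e, a, rel_tol, f"{path}/{i}"))
    elif _is_number(expected) and _is_number(actual):
        # Numbers are compared with a tolerance (hypothetical default values).
        if not math.isclose(expected, actual, rel_tol=rel_tol, abs_tol=1e-9):
            diffs.append(path)
    elif expected != actual:
        diffs.append(path)
    return diffs


def compare_metric_files(expected_path, actual_path, rel_tol=1e-6):
    """Load two JSON files and return the paths at which they differ."""
    with open(expected_path, "r", encoding="utf-8") as f:
        expected = json.load(f)
    with open(actual_path, "r", encoding="utf-8") as f:
        actual = json.load(f)
    return diff_metrics(expected, actual, rel_tol)
```

Calling `compare_metric_files` with the paths of `sample_val_preds_metrics_raw.json` and `sample_val_preds_metrics.json` returns an empty list when the two files agree within the tolerance.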

### Format

The prediction file [sample_val_preds.jsonl](sample_val_preds.jsonl) is in [JSON Lines](https://jsonlines.org/) format: each line of the file can be loaded as a single `dict` in Python. Below is an example of a single line in the prediction file (pretty-printed here for readability; in the file, each prediction occupies exactly one line):
```
{
  "qid": 2579,
  "query": "A girl and her mother cooked while talking with each other on facetime.",
  "vid": "NUsG9BgSes0_210.0_360.0",
  "pred_relevant_windows": [
    [0, 70, 0.9986],
    [78, 146, 0.4138],
    [0, 146, 0.0444],
    ...
  ],  
  "pred_saliency_scores": [-0.2452, -0.3779, -0.4746, ...]
}
```


| entry | description |
| --- | ----|
| `qid` | `int`, unique query id |
| `query` | `str`, natural language query, not used by the evaluation script | 
| `vid` | `str`, unique video id | 
| `pred_relevant_windows` | `list(list)`, moment retrieval predictions. Each sublist contains 3 elements, `[start (seconds), end (seconds), score]`| 
| `pred_saliency_scores` | `list(float)`, highlight prediction scores. The higher the better. This list should contain one score for each 2-second clip in the video, in clip order. | 
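
The format described in the table can be checked and written with a few lines of Python. This is a minimal sketch, not the validation performed by `eval.py`: the helper names `check_prediction`, `write_jsonl`, and `load_jsonl` are hypothetical, and treating the four non-`query` keys as required is an assumption drawn from the table.

```python
import json

# Keys listed in the table; "query" is left out because the evaluation script
# does not use it. Treating these four as required is an assumption.
EXPECTED_KEYS = {"qid", "vid", "pred_relevant_windows", "pred_saliency_scores"}


def check_prediction(pred):
    """Return a list of human-readable problems for one prediction dict."""
    missing = EXPECTED_KEYS - set(pred)
    if missing:
        return [f"missing keys: {sorted(missing)}"]
    problems = []
    if not isinstance(pred["qid"], int):
        problems.append("qid should be an int")
    if not isinstance(pred["vid"], str):
        problems.append("vid should be a str")
    for i, window in enumerate(pred["pred_relevant_windows"]):
        # Each window is [start (seconds), end (seconds), score].
        if len(window) != 3:
            problems.append(f"window {i} should have 3 elements")
        elif window[0] > window[1]:
            problems.append(f"window {i} starts after it ends")
    if not all(isinstance(s, (int, float)) for s in pred["pred_saliency_scores"]):
        problems.append("saliency scores should be numbers")
    return problems


def write_jsonl(predictions, path):
    """Write one JSON object per line, as the JSON Lines format requires."""
    with open(path, "w", encoding="utf-8") as f:
        for pred in predictions:
            f.write(json.dumps(pred) + "\n")


def load_jsonl(path):
    """Load a JSON Lines file into a list of dicts, skipping blank lines."""
    with open(path, "r", encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

A typical use would be to run `check_prediction` on every dict produced by a model and call `write_jsonl` only when no problems are reported.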


### Codalab Submission
To test your model's performance on the `test` split,
please submit both `val` and `test` predictions to our
[Codalab evaluation server](https://codalab.lisn.upsaclay.fr/competitions/6937).
The submission file should be a single `.zip` file (no enclosing folder)
that contains the two prediction files
`hl_val_submission.jsonl` and `hl_test_submission.jsonl`. Each `*submission.jsonl` file
should be formatted as instructed above.
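
One way to build such an archive is with Python's standard `zipfile` module. The sketch below is an illustration only: the function name `make_submission_zip` and the default output name `submission.zip` are assumptions, not names required by the server. Setting `arcname` stores each file at the top level of the archive under the expected file name, which avoids an enclosing folder.

```python
import zipfile


def make_submission_zip(val_path, test_path, zip_path="submission.zip"):
    """Pack the val and test prediction files at the root of a .zip archive."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        # arcname fixes the name inside the archive and drops any directories
        # from the source path, so there is no enclosing folder.
        zf.write(val_path, arcname="hl_val_submission.jsonl")
        zf.write(test_path, arcname="hl_test_submission.jsonl")
    return zip_path
```

The archive contents can be checked before uploading with `zipfile.ZipFile(zip_path).namelist()`, which should list exactly the two `.jsonl` file names.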