A comparison of the test-set performance (averaged across tasks) between:
For each task, a comparison of the test-set performance between:
A comparison of the test-set performance for each task of a multi-task profile model fine-tuned on the best-performing fold