Final Results for the TDSC-ABUS 2023 Challenge

Winners:

Task Segmentation: POA

Task Classification: Shiontao

Task Detection: Shiontao

Overall: Deadluck

Task Segmentation with Fixed Penalization: Nvauto

Overall with Fixed Penalization: Shiontao

1. Results For Each Metric

We ran each submission's Docker image using the following commands:

```bash
docker load < teamname.tar.gz

docker run --gpus "device=0" --name teamname --rm \
  -v /home/xxx/tdsc/Test/DATA:/input:ro \
  -v $(pwd)/predict:/predict \
  --shm-size 8g teamname:latest
```

All results were then computed with the evaluation code available on GitHub:
https://github.com/PerceptionComputingLab/TDSC-ABUS2023/tree/main/Final_Evaluation

All results are shown in the table below. A '-' indicates that no result file was found in the corresponding output folder.

| No. | Team | Status | DICE | HD | ACC | AUC | FROC | Short Paper | seg.csv | cls.csv | det.csv |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | Blackbean | Succeed | - | - | - | - | - | No | - | - | - |
| 2 | Deadluck | Succeed | 0.5616 | 162.9371 | 0.7286 | 0.7733 | 0.7704 | Yes | seg.csv | cls.csv | det.csv |
| 3 | Discerning Tumor | Failed | - | - | - | - | - | Yes | - | - | - |
| 4 | Dolphins | Succeed | 0.4665 | 266.1207 | - | - | - | Yes | seg.csv | - | - |
| 5 | Eureka | Succeed | 0.4981 | 153.0743 | 0.6000 | 0.6425 | 0.6441 | Yes | seg.csv | cls.csv | det.csv |
| 6 | FathomX | Succeed | 0.5400 | 121.1640 | 0.5429 | 0.5675 | 0.6153 | Yes | seg.csv | cls.csv | det.csv |
| 7 | Infertdsc | Succeed | 0.3057 | 203.4005 | - | - | - | Yes | seg.csv | - | - |
| 8 | Mispl | Succeed | 0.5342 | inf | 0.7143 | 0.7642 | 0.0000 | Yes | seg.csv | cls.csv | det.csv |
| 9 | Nvauto | Succeed | 0.6020 | inf | - | - | - | Yes | seg.csv | - | - |
| 10 | POA | Succeed | 0.6147 | 90.5339 | 0.6429 | 0.6558 | 0.7303 | Yes | seg.csv | cls.csv | det.csv |
| 11 | Sante2024 | Succeed | 0.5377 | 96.5050 | 0.5429 | 0.5775 | 0.6383 | Yes | seg.csv | cls.csv | det.csv |
| 12 | Shiontao | Succeed | 0.5861 | inf | 0.7571 | 0.8892 | 0.8468 | Yes | seg.csv | cls.csv | det.csv |
| 13 | SMART | Failed | - | - | - | - | - | Yes | - | - | - |
| 14 | Smcnscp | Succeed | - | - | - | - | 0.5327 | Yes | - | - | det.csv |
| 15 | Strollers | Succeed | 0.4412 | 101.4036 | - | - | 0.3913 | Yes | seg.csv | - | det.csv |
| 16 | Sunggukyung | Succeed | - | - | 0.6857 | 0.6842 | - | No | - | cls.csv | - |
| 17 | UCLA CDX | Succeed | 0.0000 | inf | - | - | - | Yes | seg.csv | - | - |
| 18 | Vicorob | Succeed | 0.5853 | 80.1817 | - | - | 0.6459 | Yes | seg.csv | - | det.csv |
| 19 | Flamingo | Succeed | 0.5890 | inf | 0.7429 | 0.7708 | 0.6067 | Yes | seg.csv | cls.csv | det.csv |
| 20 | Zhaoqiaochu | Succeed | 0.4890 | 81.7367 | - | - | - | Yes | seg.csv | - | - |
| 21 | walltall | Failed | - | - | - | - | - | No | - | - | - |

2. Segmentation Rank

The segmentation task uses two metrics: the DICE coefficient and the Hausdorff distance (HD).
We first eliminated teams that did not provide valid results, then normalized the remaining
teams' scores with min-max normalization, (x - min(x)) / (max(x) - min(x)). Since a lower HD
is better, the final score is Seg_Score = (1 + Norm_DICE - Norm_HD) / 2, which lies in [0, 1].
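For concreteness, here is a minimal Python sketch of this scoring, using the per-team DICE and HD values from the table below. It reproduces Seg_Score up to small last-digit differences, since the published DICE/HD values are themselves rounded:

```python
# Segmentation scoring sketch; DICE/HD values are taken from the table below.
teams = ["Deadluck", "Dolphins", "Eureka", "FathomX", "Infertdsc",
         "POA", "Sante2024", "Strollers", "Vicorob", "Zhaoqiaochu"]
dice = [0.5616, 0.4665, 0.4981, 0.5400, 0.3057,
        0.6147, 0.5377, 0.4412, 0.5853, 0.4890]
hd = [162.9371, 266.1207, 153.0743, 121.1640, 203.4005,
      90.5339, 96.5050, 101.4036, 80.1817, 81.7367]

def min_max(values):
    """Min-max normalization: (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

norm_dice, norm_hd = min_max(dice), min_max(hd)
# Lower HD is better, hence Norm_HD enters with a minus sign.
seg_score = [(1 + d - h) / 2 for d, h in zip(norm_dice, norm_hd)]

# Best team first; e.g. "POA: 0.9722".
for team, score in sorted(zip(teams, seg_score), key=lambda x: -x[1]):
    print(f"{team}: {score:.4f}")
```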

| Rank | Team | DICE | Norm_DICE | HD | Norm_HD | Seg_Score |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | POA | 0.6147 | 1.0000 | 90.5339 | 0.0557 | 0.9722 |
| 2 | Vicorob | 0.5853 | 0.9050 | 80.1817 | 0.0000 | 0.9525 |
| 3 | Sante2024 | 0.5377 | 0.7509 | 96.5050 | 0.0878 | 0.8316 |
| 4 | Zhaoqiaochu | 0.4890 | 0.5933 | 81.7367 | 0.0084 | 0.7925 |
| 5 | FathomX | 0.5400 | 0.7583 | 121.1640 | 0.2204 | 0.7689 |
| 6 | Deadluck | 0.5616 | 0.8283 | 162.9371 | 0.4451 | 0.6916 |
| 7 | Strollers | 0.4412 | 0.4386 | 101.4036 | 0.1141 | 0.6622 |
| 8 | Eureka | 0.4981 | 0.6227 | 153.0743 | 0.3920 | 0.6154 |
| 9 | Dolphins | 0.4665 | 0.5204 | 266.1207 | 1.0000 | 0.2602 |
| 10 | Infertdsc | 0.3057 | 0.0000 | 203.4005 | 0.6627 | 0.1687 |

3. Segmentation Rank with Fixed Penalization for Inf HD

Some teams received an 'inf' result for the HD metric, which presents a real difficulty, and how to handle 'inf' values has been the subject of much debate. A common solution is to replace them with a fixed penalty, but choosing that value is subjective and can be unfair to teams that produce robust results. Conversely, simply excluding teams with 'inf' results does not reflect the performance of all teams. After careful consideration, we decided to maintain two separate leaderboards. In this one, each 'inf' result is replaced, per case, with 105% of the worst HD among all valid results for that case; the normalization process and the scoring formula are unchanged. Because different penalization choices can shift the final rankings significantly, we have designated the leaderboard above (which eliminates the 'inf' results) as the primary board. Certificates will be provided based on the primary leaderboard, with no associated cash rewards.
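A minimal sketch of this per-case substitution (the values and variable names below are illustrative, not taken from the actual evaluation code):

```python
import math

# Hypothetical HD values for one team, one entry per test case;
# inf marks cases where the HD was undefined (e.g., an empty prediction).
team_hd = [95.2, math.inf, 110.7, math.inf]

# Hypothetical worst (largest) finite HD among all teams' valid results,
# computed separately for each test case.
worst_valid_hd = [180.4, 210.9, 175.3, 195.0]

# Replace each inf with 105% of the worst valid HD for that case.
penalized = [1.05 * worst if math.isinf(h) else h
             for h, worst in zip(team_hd, worst_valid_hd)]
# -> [95.2, 221.445, 110.7, 204.75]
```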

| Rank | Team | DICE | Norm_DICE | HD | Norm_HD | Seg_Score |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Nvauto | 0.6020 | 0.9590 | 82.8654 | 0.0144 | 0.9723 |
| 2 | POA | 0.6147 | 1.0000 | 90.5339 | 0.0557 | 0.9722 |
| 3 | Vicorob | 0.5853 | 0.9050 | 80.1817 | 0.0000 | 0.9525 |
| 4 | Shiontao | 0.5861 | 0.9075 | 117.1939 | 0.1991 | 0.8542 |
| 5 | Sante2024 | 0.5377 | 0.7509 | 96.5050 | 0.0878 | 0.8316 |
| 6 | Mispl | 0.5342 | 0.7395 | 105.0751 | 0.1339 | 0.8028 |
| 7 | Zhaoqiaochu | 0.4890 | 0.5933 | 81.7367 | 0.0084 | 0.7925 |
| 8 | FathomX | 0.5400 | 0.7583 | 121.1640 | 0.2204 | 0.7689 |
| 9 | Flamingo | 0.5890 | 0.9169 | 159.0311 | 0.4241 | 0.7464 |
| 10 | Deadluck | 0.5616 | 0.8283 | 162.9371 | 0.4451 | 0.6916 |
| 11 | Strollers | 0.4412 | 0.4386 | 101.4036 | 0.1141 | 0.6622 |
| 12 | Eureka | 0.4981 | 0.6227 | 153.0743 | 0.3920 | 0.6154 |
| 13 | Dolphins | 0.4665 | 0.5204 | 266.1207 | 1.0000 | 0.2602 |
| 14 | Infertdsc | 0.3057 | 0.0000 | 203.4005 | 0.6627 | 0.1687 |

4. Classification Rank

For the classification task, we eliminated teams that did not provide valid results,
then min-max normalized the remaining teams' ACC and AUC scores.
The final score is Cls_Score = (Norm_ACC + Norm_AUC) / 2.
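The same min-max recipe applies; here is a short sketch using the ACC and AUC values from the table below. Tiny last-digit differences vs. the table come from rounding of the published values:

```python
# ACC and AUC per team, in the row order of the table below.
acc = [0.7571, 0.7429, 0.7286, 0.7143, 0.6857, 0.6429, 0.6000, 0.5429, 0.5429]
auc = [0.8892, 0.7708, 0.7733, 0.7642, 0.6842, 0.6558, 0.6425, 0.5775, 0.5675]

def min_max(values):
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

cls_score = [(a + u) / 2 for a, u in zip(min_max(acc), min_max(auc))]
# -> approximately [1.0000, 0.7828, 0.7533, 0.7058, 0.5147,
#                   0.3707, 0.2499, 0.0155, 0.0000]
```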

| Rank | Team | ACC | Norm_ACC | AUC | Norm_AUC | Cls_Score |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Shiontao | 0.7571 | 1.0000 | 0.8892 | 1.0000 | 1.0000 |
| 2 | Flamingo | 0.7429 | 0.9333 | 0.7708 | 0.6321 | 0.7827 |
| 3 | Deadluck | 0.7286 | 0.8667 | 0.7733 | 0.6399 | 0.7533 |
| 4 | Mispl | 0.7143 | 0.8000 | 0.7642 | 0.6114 | 0.7057 |
| 5 | Sunggukyung | 0.6857 | 0.6667 | 0.6842 | 0.3627 | 0.5147 |
| 6 | POA | 0.6429 | 0.4667 | 0.6558 | 0.2746 | 0.3706 |
| 7 | Eureka | 0.6000 | 0.2667 | 0.6425 | 0.2332 | 0.2499 |
| 8 | Sante2024 | 0.5429 | 0.0000 | 0.5775 | 0.0311 | 0.0155 |
| 9 | FathomX | 0.5429 | 0.0000 | 0.5675 | 0.0000 | 0.0000 |

5. Detection Rank

Detection performance is evaluated with the Free-Response Receiver Operating Characteristic (FROC) metric,
which reports sensitivity at different false positive (FP) rates. Specifically, the average sensitivity
at FP rates of 0.125, 0.25, 0.5, 1, 2, 4, and 8 per scan serves as the primary detection metric.
These FROC values are then min-max normalized to obtain Det_Score.
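A minimal sketch of the FROC summary, assuming per-scan detections have already been matched to ground truth (the sensitivity values are hypothetical):

```python
# The seven FP-per-scan operating points used by the challenge.
fp_rates = [0.125, 0.25, 0.5, 1, 2, 4, 8]

# Hypothetical sensitivity (recall) achieved at each FP rate.
sensitivity_at_fp = [0.52, 0.60, 0.68, 0.75, 0.81, 0.86, 0.89]

# FROC score: the average sensitivity over the seven FP levels.
froc = sum(sensitivity_at_fp) / len(fp_rates)
print(f"FROC = {froc:.4f}")  # FROC = 0.7300
```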

| Rank | Team | FROC | Det_Score |
| --- | --- | --- | --- |
| 1 | Shiontao | 0.8468 | 1.0000 |
| 2 | Deadluck | 0.7704 | 0.8323 |
| 3 | POA | 0.7303 | 0.7442 |
| 4 | Vicorob | 0.6459 | 0.5589 |
| 5 | Eureka | 0.6441 | 0.5550 |
| 6 | Sante2024 | 0.6383 | 0.5423 |
| 7 | FathomX | 0.6153 | 0.4918 |
| 8 | Flamingo | 0.6067 | 0.4729 |
| 9 | Smcnscp | 0.5327 | 0.3104 |
| 10 | Strollers | 0.3913 | 0.0000 |

6. Overall Rank

The overall performance is determined by considering only the teams that have submitted valid results for all metrics.
We apply similar normalization methods as before, but exclusively for teams with valid results.
The final overall result is obtained by:
(1 + Norm_DICE - Norm_HD) / 2 + (Norm_ACC + Norm_AUC) / 2 + Norm_FROC
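As a worked check, plugging Deadluck's normalized values from the table below into this formula reproduces its Overall score:

```python
# Deadluck's normalized values, taken from the overall table.
norm_dice, norm_hd = 0.5449, 1.0000
norm_acc, norm_auc = 1.0000, 1.0000
norm_froc = 1.0000

seg = (1 + norm_dice - norm_hd) / 2  # 0.27245
cls = (norm_acc + norm_auc) / 2      # 1.0
overall = seg + cls + norm_froc
print(f"{overall:.4f}")              # 2.2724, matching the table
```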

| Rank | Team | DICE | Norm_DICE | HD | Norm_HD | ACC | Norm_ACC | AUC | Norm_AUC | FROC | Norm_FROC | Overall |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | Deadluck | 0.5616 | 0.5449 | 162.9371 | 1.0000 | 0.7286 | 1.0000 | 0.7733 | 1.0000 | 0.7704 | 1.0000 | 2.2724 |
| 2 | POA | 0.6147 | 1.0000 | 90.5339 | 0.0000 | 0.6429 | 0.5385 | 0.6558 | 0.4291 | 0.7303 | 0.7415 | 2.2253 |
| 3 | Sante2024 | 0.5377 | 0.3397 | 96.5050 | 0.0825 | 0.5429 | 0.0000 | 0.5775 | 0.0486 | 0.6383 | 0.1483 | 0.8012 |
| 4 | Eureka | 0.4981 | 0.0000 | 153.0743 | 0.8638 | 0.6000 | 0.3077 | 0.6425 | 0.3644 | 0.6441 | 0.1857 | 0.5898 |
| 5 | FathomX | 0.5400 | 0.3593 | 121.1640 | 0.4230 | 0.5429 | 0.0000 | 0.5675 | 0.0000 | 0.6153 | 0.0000 | 0.4681 |

7. Overall Rank with Fixed Penalization for Inf HD

Similar to the segmentation task, each 'inf' HD result was replaced with 105% of the worst HD among all valid results for that case, which makes teams with 'inf' results rankable. Please note that this board, too, offers certificate rewards only.

| Rank | Team | DICE | Norm_DICE | HD | Norm_HD | ACC | Norm_ACC | AUC | Norm_AUC | FROC | Norm_FROC | Overall |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | Shiontao | 0.5861 | 0.7547 | 117.1939 | 0.3682 | 0.7571 | 1.0000 | 0.8892 | 1.0000 | 0.8468 | 1.0000 | 2.6932 |
| 2 | POA | 0.6147 | 1.0000 | 90.5339 | 0.0000 | 0.6429 | 0.4667 | 0.6558 | 0.2746 | 0.7303 | 0.5148 | 1.8854 |
| 3 | Deadluck | 0.5616 | 0.5449 | 162.9371 | 1.0000 | 0.7286 | 0.8667 | 0.7733 | 0.6399 | 0.7704 | 0.6818 | 1.7075 |
| 4 | Flamingo | 0.5890 | 0.7796 | 159.0311 | 0.9461 | 0.7429 | 0.9333 | 0.7708 | 0.6321 | 0.6067 | 0.0000 | 1.1995 |
| 5 | Sante2024 | 0.5377 | 0.3397 | 96.5050 | 0.0825 | 0.5429 | 0.0000 | 0.5775 | 0.0311 | 0.6383 | 0.1316 | 0.7758 |
| 6 | FathomX | 0.5400 | 0.3593 | 121.1640 | 0.4230 | 0.5429 | 0.0000 | 0.5675 | 0.0000 | 0.6153 | 0.0358 | 0.5039 |
| 7 | Eureka | 0.4981 | 0.0000 | 153.0743 | 0.8638 | 0.6000 | 0.2667 | 0.6425 | 0.2332 | 0.6441 | 0.1558 | 0.4738 |