WebNLG2020 challenge final results - Human Evaluation

WebNLG RDF2Text

Results are ordered alphabetically by system name.

English

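All

| System | Data Coverage Rank | Data Coverage Avg. Z | Data Coverage Avg. Raw | Relevance Rank | Relevance Avg. Z | Relevance Avg. Raw | Correctness Rank | Correctness Avg. Z | Correctness Avg. Raw | Text Structure Rank | Text Structure Avg. Z | Text Structure Avg. Raw | Fluency Rank | Fluency Avg. Z | Fluency Avg. Raw |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AmazonAI / id18 | 1 | 0.222 | 94.393 | 1 | 0.214 | 95.196 | 1 | 0.248 | 93.531 | 1 | 0.305 | 92.951 | 1 | 0.326 | 90.286 |
| BASELINE | 1 | 0.17 | 92.892 | 1 | 0.161 | 93.784 | 1 | 0.19 | 91.794 | 2 | 0.039 | 87.4 | 3 | 0.011 | 82.43 |
| BASELINE2017 | 2 | 0.127 | 92.066 | 2 | 0.113 | 92.588 | 2 | 0.13 | 90.138 | 2 | -0.064 | 85.737 | 4 | -0.143 | 80.941 |
| bt5 / id5 | 2 | 0.161 | 93.836 | 1 | 0.184 | 95.22 | 1 | 0.224 | 93.583 | 1 | 0.236 | 91.914 | 2 | 0.218 | 88.688 |
| cuni / id2 | 2 | 0.155 | 93.291 | 1 | 0.164 | 94.555 | 1 | 0.161 | 91.587 | 1 | 0.208 | 90.752 | 2 | 0.185 | 87.642 |
| CycleGT / id28 | 3 | 0.023 | 91.231 | 1 | 0.125 | 93.37 | 2 | 0.071 | 89.846 | 2 | 0.045 | 87.879 | 3 | 0.072 | 84.82 |
| DANGNT-SGU / id15 | 1 | 0.259 | 95.315 | 1 | 0.185 | 94.856 | 1 | 0.179 | 92.489 | 3 | -0.203 | 83.501 | 4 | -0.161 | 78.594 |
| FBConvAI / id34* | 2 | 0.151 | 93.169 | 2 | 0.117 | 93.898 | 1 | 0.206 | 92.7 | 1 | 0.319 | 93.089 | 1 | 0.327 | 90.837 |
| Huawei / id17 | 4 | -0.31 | 84.743 | 3 | -0.425 | 85.265 | 3 | -0.389 | 80.76 | 3 | -0.373 | 80.219 | 5 | -0.369 | 75.205 |
| NILC / id21 | 4 | -0.477 | 81.605 | 3 | -0.499 | 83.522 | 3 | -0.589 | 76.702 | 3 | -0.402 | 80.463 | 5 | -0.408 | 74.851 |
| NUIG-DSI / id23 | 2 | 0.116 | 92.063 | 1 | 0.161 | 94.061 | 1 | 0.189 | 92.053 | 1 | 0.258 | 91.588 | 2 | 0.233 | 88.898 |
| ORANGE-NLG / id13 | 5 | -0.554 | 79.959 | 4 | -0.71 | 79.887 | 4 | -0.668 | 74.977 | 3 | -0.338 | 80.462 | 5 | -0.332 | 75.675 |
| OSU_Neural_NLG / id30 | 1 | 0.235 | 95.123 | 1 | 0.163 | 94.615 | 1 | 0.224 | 93.409 | 1 | 0.289 | 92.438 | 1 | 0.323 | 90.066 |
| RALI / id12 | 1 | 0.272 | 95.204 | 1 | 0.171 | 94.81 | 1 | 0.163 | 92.128 | 3 | -0.285 | 81.835 | 4 | -0.241 | 77.759 |
| REF | 1 | 0.251 | 95.442 | 1 | 0.139 | 94.392 | 1 | 0.256 | 94.149 | 1 | 0.254 | 92.105 | 1 | 0.279 | 89.846 |
| TGen / id4 | 3 | -0.075 | 88.176 | 1 | 0.132 | 92.64 | 2 | 0.074 | 88.626 | 1 | 0.168 | 89.041 | 2 | 0.182 | 86.163 |
| UPC-POE / id14 | 6 | -0.782 | 75.845 | 4 | -0.531 | 82.051 | 4 | -0.701 | 74.374 | 4 | -0.456 | 78.503 | 5 | -0.508 | 72.28 |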
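
Seen Domains

| System | Data Coverage Rank | Data Coverage Avg. Z | Data Coverage Avg. Raw | Relevance Rank | Relevance Avg. Z | Relevance Avg. Raw | Correctness Rank | Correctness Avg. Z | Correctness Avg. Raw | Text Structure Rank | Text Structure Avg. Z | Text Structure Avg. Raw | Fluency Rank | Fluency Avg. Z | Fluency Avg. Raw |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AmazonAI / id18 | 1 | 0.258 | 94.09 | 1 | 0.17 | 93.586 | 1 | 0.295 | 93.691 | 1 | 0.293 | 91.154 | 1 | 0.308 | 87.75 |
| BASELINE | 1 | 0.28 | 95.296 | 1 | 0.153 | 94.568 | 1 | 0.226 | 93.593 | 2 | 0.074 | 87.04 | 2 | 0.03 | 82.664 |
| BASELINE2017 | 2 | 0.065 | 90.253 | 2 | -0.043 | 89.568 | 2 | 0.042 | 87.608 | 3 | -0.16 | 82.892 | 3 | -0.406 | 75.037 |
| bt5 / id5 | 1 | 0.196 | 94.46 | 1 | 0.222 | 95.167 | 1 | 0.312 | 94.843 | 1 | 0.264 | 91.846 | 1 | 0.28 | 89.892 |
| cuni / id2 | 1 | 0.257 | 94.941 | 1 | 0.203 | 94.87 | 1 | 0.273 | 93.886 | 1 | 0.263 | 91.429 | 1 | 0.281 | 89.454 |
| CycleGT / id28 | 3 | -0.137 | 88.386 | 1 | 0.125 | 92.12 | 2 | 0.062 | 88.633 | 3 | -0.121 | 84.262 | 2 | -0.036 | 83.287 |
| DANGNT-SGU / id15 | 1 | 0.239 | 94.367 | 1 | 0.164 | 93.596 | 2 | 0.14 | 90.772 | 3 | -0.132 | 84.691 | 3 | -0.159 | 79.559 |
| FBConvAI / id34* | 1 | 0.178 | 93.543 | 2 | 0.112 | 93.111 | 1 | 0.261 | 93.472 | 1 | 0.326 | 92.966 | 1 | 0.358 | 91.654 |
| Huawei / id17 | 2 | 0.101 | 92.173 | 2 | 0.011 | 92.222 | 2 | 0.08 | 90.269 | 2 | 0.067 | 88.38 | 2 | 0.064 | 85.111 |
| NILC / id21 | 1 | 0.225 | 94.448 | 1 | 0.266 | 96.269 | 1 | 0.212 | 93.071 | 1 | 0.212 | 91.225 | 2 | 0.155 | 87.306 |
| NUIG-DSI / id23 | 2 | 0.059 | 91.253 | 1 | 0.178 | 94.512 | 2 | 0.162 | 92.494 | 1 | 0.234 | 90.744 | 1 | 0.18 | 88.611 |
| ORANGE-NLG / id13 | 2 | 0.109 | 92.593 | 2 | 0.112 | 93.673 | 2 | 0.145 | 91.478 | 2 | 0.074 | 88.034 | 2 | 0.112 | 85.302 |
| OSU_Neural_NLG / id30 | 1 | 0.176 | 94.287 | 2 | 0.084 | 93.373 | 1 | 0.233 | 94.015 | 1 | 0.239 | 91.599 | 1 | 0.253 | 88.651 |
| RALI / id12 | 1 | 0.274 | 93.846 | 1 | 0.148 | 93.049 | 2 | 0.198 | 91.423 | 3 | -0.17 | 82.614 | 3 | -0.157 | 79.238 |
| REF | 1 | 0.264 | 95.491 | 1 | 0.135 | 94.142 | 1 | 0.236 | 93.355 | 1 | 0.198 | 91.225 | 1 | 0.225 | 88.136 |
| TGen / id4 | 3 | -0.394 | 81.67 | 2 | 0.074 | 91.099 | 2 | -0.028 | 86.793 | 2 | 0.034 | 86.886 | 2 | 0.005 | 83.037 |
| UPC-POE / id14 | 4 | -0.404 | 82.173 | 2 | -0.019 | 90.503 | 3 | -0.115 | 84.858 | 2 | 0.096 | 87.309 | 2 | -0.077 | 80.577 |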
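
Unseen Entities

| System | Data Coverage Rank | Data Coverage Avg. Z | Data Coverage Avg. Raw | Relevance Rank | Relevance Avg. Z | Relevance Avg. Raw | Correctness Rank | Correctness Avg. Z | Correctness Avg. Raw | Text Structure Rank | Text Structure Avg. Z | Text Structure Avg. Raw | Fluency Rank | Fluency Avg. Z | Fluency Avg. Raw |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AmazonAI / id18 | 1 | 0.291 | 95.532 | 1 | 0.272 | 96.329 | 1 | 0.293 | 94.703 | 1 | 0.348 | 94.288 | 1 | 0.452 | 93.365 |
| BASELINE | 1 | 0.161 | 93.36 | 1 | 0.271 | 96.099 | 1 | 0.199 | 92.635 | 2 | -0.011 | 88.243 | 2 | 0.025 | 82.126 |
| BASELINE2017 | 1 | 0.196 | 92.207 | 1 | 0.266 | 93.797 | 1 | 0.317 | 91.302 | 2 | 0.003 | 85.644 | 2 | 0.039 | 83.604 |
| bt5 / id5 | 1 | 0.158 | 93.734 | 2 | 0.146 | 95.351 | 1 | 0.154 | 93.239 | 1 | 0.263 | 91.766 | 1 | 0.318 | 89.595 |
| cuni / id2 | 1 | 0.206 | 93.937 | 1 | 0.213 | 94.995 | 1 | 0.149 | 91.086 | 1 | 0.314 | 93.243 | 2 | 0.098 | 86.559 |
| CycleGT / id28 | 2 | 0.094 | 92.703 | 1 | 0.181 | 95.198 | 1 | 0.171 | 92.541 | 1 | 0.233 | 92.036 | 1 | 0.286 | 89.189 |
| DANGNT-SGU / id15 | 1 | 0.23 | 95.329 | 1 | 0.249 | 96.658 | 1 | 0.16 | 93.459 | 2 | -0.245 | 81.977 | 2 | -0.116 | 78.599 |
| FBConvAI / id34* | 2 | 0.139 | 93.536 | 2 | 0.169 | 95.644 | 1 | 0.201 | 94 | 1 | 0.334 | 94.405 | 1 | 0.365 | 91.599 |
| Huawei / id17 | 2 | -0.259 | 85.041 | 3 | -0.366 | 85.559 | 2 | -0.242 | 84.126 | 2 | -0.204 | 83.383 | 2 | -0.221 | 79.315 |
| NILC / id21 | 3 | -0.343 | 84.896 | 3 | -0.299 | 88.23 | 2 | -0.563 | 78.315 | 3 | -0.629 | 78.55 | 3 | -0.492 | 74.36 |
| NUIG-DSI / id23 | 1 | 0.165 | 91.752 | 1 | 0.181 | 93.694 | 1 | 0.26 | 92.446 | 1 | 0.358 | 93.041 | 1 | 0.303 | 89.577 |
| ORANGE-NLG / id13 | 3 | -0.624 | 78.149 | 3 | -0.697 | 78.95 | 3 | -0.738 | 71.342 | 2 | -0.285 | 80.505 | 3 | -0.263 | 74.586 |
| OSU_Neural_NLG / id30 | 1 | 0.158 | 94.203 | 1 | 0.253 | 95.662 | 1 | 0.178 | 92.338 | 1 | 0.292 | 92.482 | 1 | 0.385 | 91.293 |
| RALI / id12 | 1 | 0.317 | 95.775 | 1 | 0.194 | 95.338 | 1 | 0.215 | 93.333 | 2 | -0.088 | 86.559 | 2 | -0.019 | 83.14 |
| REF | 1 | 0.283 | 95.991 | 1 | 0.315 | 97.117 | 1 | 0.268 | 95.171 | 1 | 0.281 | 93.189 | 1 | 0.285 | 90.788 |
| TGen / id4 | 1 | 0.209 | 93.649 | 1 | 0.234 | 95.356 | 1 | 0.23 | 91.883 | 1 | 0.347 | 92.347 | 1 | 0.448 | 91.869 |
| UPC-POE / id14 | 4 | -0.982 | 72.928 | 3 | -0.497 | 82.95 | 3 | -0.678 | 75.032 | 3 | -0.482 | 77.829 | 3 | -0.383 | 74.095 |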
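
Unseen Domains

| System | Data Coverage Rank | Data Coverage Avg. Z | Data Coverage Avg. Raw | Relevance Rank | Relevance Avg. Z | Relevance Avg. Raw | Correctness Rank | Correctness Avg. Z | Correctness Avg. Raw | Text Structure Rank | Text Structure Avg. Z | Text Structure Avg. Raw | Fluency Rank | Fluency Avg. Z | Fluency Avg. Raw |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AmazonAI / id18 | 1 | 0.17 | 94.098 | 1 | 0.215 | 95.713 | 1 | 0.201 | 92.933 | 1 | 0.295 | 93.498 | 1 | 0.284 | 90.55 |
| BASELINE | 2 | 0.106 | 91.201 | 2 | 0.12 | 92.312 | 1 | 0.163 | 90.32 | 2 | 0.039 | 87.264 | 2 | -0.007 | 82.414 |
| BASELINE2017 | 2 | 0.137 | 93.13 | 2 | 0.145 | 93.948 | 2 | 0.105 | 91.213 | 2 | -0.032 | 87.542 | 2 | -0.057 | 83.473 |
| bt5 / id5 | 2 | 0.14 | 93.492 | 1 | 0.177 | 95.197 | 1 | 0.2 | 92.948 | 1 | 0.207 | 92.019 | 2 | 0.137 | 87.556 |
| cuni / id2 | 2 | 0.069 | 91.992 | 2 | 0.119 | 94.172 | 2 | 0.096 | 90.374 | 2 | 0.129 | 89.272 | 1 | 0.163 | 86.979 |
| CycleGT / id28 | 2 | 0.092 | 92.372 | 2 | 0.102 | 93.368 | 2 | 0.035 | 89.452 | 2 | 0.069 | 88.356 | 2 | 0.048 | 83.914 |
| DANGNT-SGU / id15 | 1 | 0.284 | 95.897 | 1 | 0.172 | 94.872 | 1 | 0.21 | 93.142 | 3 | -0.229 | 83.41 | 3 | -0.182 | 77.992 |
| FBConvAI / id34* | 2 | 0.14 | 92.78 | 2 | 0.098 | 93.644 | 1 | 0.173 | 91.669 | 1 | 0.308 | 92.605 | 1 | 0.293 | 90.006 |
| Huawei / id17 | 4 | -0.586 | 80.004 | 3 | -0.721 | 80.822 | 3 | -0.743 | 73.427 | 4 | -0.718 | 73.808 | 4 | -0.7 | 67.308 |
| NILC / id21 | 5 | -0.97 | 72.234 | 4 | -1.06 | 73.609 | 4 | -1.098 | 65.856 | 4 | -0.685 | 74.598 | 4 | -0.721 | 67.33 |
| NUIG-DSI / id23 | 2 | 0.13 | 92.697 | 2 | 0.142 | 93.937 | 1 | 0.175 | 91.613 | 1 | 0.23 | 91.494 | 1 | 0.237 | 88.787 |
| ORANGE-NLG / id13 | 5 | -0.935 | 72.887 | 4 | -1.225 | 71.728 | 4 | -1.143 | 66.28 | 4 | -0.617 | 75.743 | 4 | -0.637 | 70.163 |
| OSU_Neural_NLG / id30 | 1 | 0.303 | 96.033 | 1 | 0.173 | 94.941 | 1 | 0.237 | 93.489 | 1 | 0.319 | 92.941 | 1 | 0.34 | 90.423 |
| RALI / id12 | 1 | 0.253 | 95.805 | 1 | 0.176 | 95.678 | 1 | 0.119 | 92.054 | 3 | -0.441 | 79.343 | 3 | -0.387 | 74.554 |
| REF | 1 | 0.23 | 95.178 | 2 | 0.066 | 93.389 | 1 | 0.263 | 94.207 | 1 | 0.277 | 92.19 | 1 | 0.31 | 90.508 |
| TGen / id4 | 3 | 0.002 | 89.887 | 2 | 0.125 | 92.443 | 2 | 0.071 | 88.379 | 1 | 0.175 | 88.973 | 1 | 0.177 | 85.676 |
| UPC-POE / id14 | 5 | -0.932 | 73.157 | 3 | -0.863 | 76.423 | 4 | -1.075 | 67.586 | 4 | -0.786 | 73.324 | 4 | -0.828 | 66.358 |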

Russian

All

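| System | Data Coverage Rank | Data Coverage Avg. Z | Data Coverage Avg. Raw | Relevance Rank | Relevance Avg. Z | Relevance Avg. Raw | Correctness Rank | Correctness Avg. Z | Correctness Avg. Raw | Text Structure Rank | Text Structure Avg. Z | Text Structure Avg. Raw | Fluency Rank | Fluency Avg. Z | Fluency Avg. Raw |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| BASELINE | 1 | 0.2 | 93.191 | 2 | -0.079 | 91.294 | 4 | -0.387 | 80.83 | 3 | -0.27 | 87.645 | 3 | -0.247 | 84.691 |
| bt5 / id8 | 1 | 0.312 | 95.63 | 1 | 0.174 | 95.385 | 1 | 0.34 | 95.594 | 1 | 0.219 | 95.745 | 1 | 0.232 | 93.088 |
| cuni_ufal / id3 | 1 | 0.203 | 93.155 | 1 | 0.077 | 93.306 | 2 | 0.101 | 90.382 | 1 | 0.218 | 96.073 | 1 | 0.213 | 92.921 |
| FBConvAI / id37 | 1 | 0.133 | 92.339 | 2 | 0.027 | 93.491 | 2 | 0.08 | 90.779 | 2 | 0.079 | 93.764 | 2 | 0.063 | 90.248 |
| Huawei / id25 | 2 | -0.189 | 86.448 | 2 | -0.06 | 91.761 | 2 | -0.084 | 87.033 | 3 | -0.183 | 89.515 | 3 | -0.174 | 85.679 |
| OSU_Neural_NLG / id27 | 2 | -0.422 | 82.836 | 2 | -0.182 | 90.433 | 3 | -0.181 | 84.83 | 2 | 0.019 | 92.958 | 2 | -0.05 | 88.558 |
| med / id9 | 3 | -0.467 | 82.23 | 2 | -0.022 | 92.224 | 2 | 0.021 | 88.585 | 2 | -0.077 | 91.309 | 2 | -0.06 | 88.252 |
| rus_REF | 1 | 0.23 | 94 | 2 | 0.065 | 93.636 | 2 | 0.109 | 90.63 | 2 | -0.005 | 92.082 | 2 | 0.022 | 89.021 |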
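
Because the systems are listed alphabetically rather than by score, it can help to reorder them by a single criterion. The following is a minimal Python sketch of one way to do that, using the Fluency Avg. Z values from the Russian table above. The names `russian_fluency_z` and `sort_by_score` are illustrative and not part of the challenge materials. Sorting by Avg. Z alone does not reproduce the Rank column, which groups several systems under the same rank.

```python
# Fluency Avg. Z scores copied from the Russian "All" table.
# The dictionary name is illustrative (an assumption, not from the source).
russian_fluency_z = {
    "BASELINE": -0.247,
    "bt5 / id8": 0.232,
    "cuni_ufal / id3": 0.213,
    "FBConvAI / id37": 0.063,
    "Huawei / id25": -0.174,
    "OSU_Neural_NLG / id27": -0.05,
    "med / id9": -0.06,
    "rus_REF": 0.022,
}


def sort_by_score(scores):
    """Return (system, score) pairs ordered from highest to lowest score."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for system, z in sort_by_score(russian_fluency_z):
        # Left-align the system name and always show the sign of the score.
        print(f"{system:<22} {z:+.3f}")
```

The same helper works for any other column once its values are collected into a dictionary keyed by system name.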