In the Vietnamese language, VietnameseMDS is the only publicly available dataset for this task. The dataset has 199 clusters, but each cluster contains only three documents, which is small compared to common English datasets. To address this, we hired 29 annotators and enhanced MDSWriter, an open-source annotation tool, to support them in creating gold-standard summaries. We have validated the reliability of our dataset using a range of metrics: traditional Cohen's κ, relaxed Cohen's κ (a new metric that we propose to make agreement measurement more suitable for abstractive summarization), and ROUGE scores. The ROUGE scores are 0.729 for ROUGE-1, 0.507 for ROUGE-2 and 0.524 for ROUGE-SU4. Unlike previous work that published only the final summarization dataset, we also publish the intermediate annotation results, which can be used in other NLP problems such as sentence classification.
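As a point of reference, the traditional Cohen's κ mentioned above compares the observed agreement between two annotators with the agreement expected by chance, κ = (p_o − p_e) / (1 − p_e). The sketch below is a minimal illustration of that standard formula only. The function name `cohens_kappa` and the example labels are hypothetical and not taken from the paper, and the relaxed variant is not shown because its definition is not given in this summary.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Standard Cohen's kappa for two annotators labelling the same items.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")

    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: for each label, the product of the two annotators'
    # marginal probabilities of using it, summed over all labels.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both annotators used one identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Made-up example: 1 = sentence selected for the summary, 0 = not selected.
annotator_1 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
annotator_2 = [1, 0, 0, 0, 1, 0, 1, 1, 0, 0]
kappa = cohens_kappa(annotator_1, annotator_2)
```

In this made-up example the annotators agree on 8 of 10 items (p_o = 0.8) while chance agreement is p_e = 0.52, so κ is well below the raw agreement rate.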
Source link: https://doi.org/10.1007/s10579-020-09495-4