---
base_model: google/mt5-xxl
license: apache-2.0
language:
- it
---

# modafact-ita

`modafact-ita` is a sequence-to-sequence model fine-tuned for joint **event Factuality and Modality** detection in **Italian**.

The model was fine-tuned on [ModaFact](https://huggingface.co/datasets/dhfbk/modafact-ita), a dataset manually annotated with Factuality and Modality values, using [mT5-xxl](https://huggingface.co/google/mt5-xxl) as the base model.

## Model Details

### Model Description

- **Developed by:** DH Group @ FBK
- **Model type:** Sequence-to-sequence
- **Language(s) (NLP):** Italian
- **License:** apache-2.0
- **Finetuned from model:** [google/mt5-xxl](https://huggingface.co/google/mt5-xxl)

### Model Sources

- **Inference script:** to run inference with the model, please refer to our [GitHub repo](https://github.com/dhfbk/ModaFact).
- **Paper:** [ModaFact: Multi-paradigm Evaluation for Joint Event Modality and Factuality Detection](https://aclanthology.org/2025.coling-main.425/)

## Uses

The model can be used to detect event Factuality and Modality values. To tag your own text, please refer to the [inference script in our GitHub repo](https://github.com/dhfbk/ModaFact).

The model takes one sentence at a time as input, for example:

```
Per chiarire la questione la Santa Sede autorizzò il prelievo di campioni del legno che vennero datati attraverso l'utilizzo del metodo del carbonio-14.
```

and outputs a sequence of `span=label` pairs separated by `|`, in this format:

```
chiarire=POSSIBLE-POS-FUTURE-FINAL | autorizzò=CERTAIN-POS-PRESENT/PAST | prelievo=UNDERSPECIFIED-POS-FUTURE-CONCESSIVE | datati=CERTAIN-POS-PRESENT/PAST | utilizzo=CERTAIN-POS-PRESENT/PAST
```

## Training Details

### Training Data

[https://huggingface.co/datasets/dhfbk/modafact-ita](https://huggingface.co/datasets/dhfbk/modafact-ita)

## Citation

If you use or refer to ModaFact, please consider citing this paper:

```
@inproceedings{rovera-etal-2025-modafact,
    title = "{M}oda{F}act: Multi-paradigm Evaluation for Joint Event Modality and Factuality Detection",
    author = "Rovera, Marco and Cristoforetti, Serena and Tonelli, Sara",
    editor = "Rambow, Owen and Wanner, Leo and Apidianaki, Marianna and Al-Khalifa, Hend and Eugenio, Barbara Di and Schockaert, Steven",
    booktitle = "Proceedings of the 31st International Conference on Computational Linguistics",
    month = jan,
    year = "2025",
    address = "Abu Dhabi, UAE",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.coling-main.425/",
    pages = "6378--6396",
    abstract = "Factuality and modality are two crucial aspects concerning events, since they convey the speaker's commitment to a situation in discourse as well as how this event is supposed to occur in terms of norms, wishes, necessity, duty and so on. Capturing them both is necessary to truly understand an utterance meaning and the speaker's perspective with respect to a mentioned event. Yet, NLP studies have mostly dealt with these two aspects separately, mainly devoting past efforts to the development of English datasets. In this work, we propose ModaFact, a novel resource with joint factuality and modality information for event-denoting expressions in Italian. We propose a novel annotation scheme, which however is consistent with existing ones, and compare different classification systems trained on ModaFact, as a preliminary step to the use of factuality and modality information in downstream tasks. The dataset and the best-performing model are publicly released and available under an open license."
}
```
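
## Parsing the Output

The `span=label` output format shown in the Uses section is plain text, so downstream code will typically need to turn it back into structured data. The following is a minimal sketch of one way to do that, not part of the official inference script. The names `EventTag` and `parse_output` are hypothetical, and splitting a label on `-` into components is an assumption based only on the example output above; please refer to the paper for the meaning of each component.

```python
from typing import List, NamedTuple, Tuple


class EventTag(NamedTuple):
    """One tagged event (hypothetical helper type, not from the ModaFact repo)."""
    span: str               # event-denoting text span, e.g. "autorizzò"
    label: str              # full label string as generated by the model
    parts: Tuple[str, ...]  # label split on "-" (assumed component separator)


def parse_output(output: str) -> List[EventTag]:
    """Parse 'span=LABEL | span=LABEL | ...' into a list of EventTag."""
    tags = []
    for chunk in output.split("|"):
        chunk = chunk.strip()
        if not chunk or "=" not in chunk:
            continue  # skip empty or malformed chunks
        # Split on the last "=" so that a span containing "=" stays intact.
        span, label = chunk.rsplit("=", 1)
        span, label = span.strip(), label.strip()
        tags.append(EventTag(span, label, tuple(label.split("-"))))
    return tags


example = (
    "chiarire=POSSIBLE-POS-FUTURE-FINAL | autorizzò=CERTAIN-POS-PRESENT/PAST | "
    "prelievo=UNDERSPECIFIED-POS-FUTURE-CONCESSIVE | "
    "datati=CERTAIN-POS-PRESENT/PAST | utilizzo=CERTAIN-POS-PRESENT/PAST"
)
tags = parse_output(example)
```

For the example output above, `tags` contains five entries; the first has span `chiarire` and parts `POSSIBLE`, `POS`, `FUTURE`, `FINAL`. Malformed chunks are skipped silently here, which may or may not be the right choice depending on how strictly you want to validate generated text.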