LiSSS is a corpus for evaluating emotions in Spanish literary texts. It consists of clusters of related literary sentences, phrases and paragraphs, labelled with five reference emotions selected by human annotators.
We collected literary Spanish sentences and phrases (poetry, narrative, stories, etc.) from books and the Internet. Each cluster is composed of related sentences expressing one specific emotion: Love, Fear, Happiness, Anger or Sadness/Pain (see our paper for more details). The corpus is annotated with a one-letter emotion code: A (Anger), L (Love), F (Fear), H (Happiness), S (Sadness/Pain). A sentence may belong to several clusters, in which case its code combines several letters (for example, LS).
The text format is: unique ID, emotion code, sentence and author, separated by tabs. For example:
1A El odio es la venganza de un cobarde intimidado. # George Bernard Shaw
2A El desprecio debe ser el más misterioso de nuestros sentimientos. # Antoine de Rivarol
3LS Lo único que me duele de morir, es que no sea de amor. # Gabriel García Márquez
4F Cuando se viaja en avión solamente existen dos clases de emociones: el aburrimiento y el terror. # Orson Welles
5A La ira es una locura corta. # Horacio
6A El odio es el invierno del corazón. # Victor Hugo
7AF Intenta no ocupar tu vida en odiar y tener miedo. # Stendhal
8HS No somos felices: nuestra felicidad es el silencio de la desgracia. # Jules Renard
9S He cometido el peor pecado que un hombre puede cometer. No he sido feliz # Jorge Luis Borges
10H Felicidad no es hacer lo que uno quiere, sino querer lo que uno hace. # Jean Paul Sartre
11A Si los hombres se odian, nada se puede hacer. # José Saramago
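The line format shown above can be read with a few lines of Python. The following is only a minimal sketch, not part of the LiSSS distribution: the names `Entry`, `EMOTIONS` and `parse_line` are hypothetical, and the pattern assumes that each line looks like the examples above (ID, emotion code, sentence, then `#` followed by the author, with spaces or tabs between fields). If the distributed files use a different layout, the pattern will need to be adapted.

```python
import re
from typing import NamedTuple, Optional

# Letter-to-emotion mapping, as read from the emotion code (A, L, F, H, S).
EMOTIONS = {
    "A": "Anger",
    "L": "Love",
    "F": "Fear",
    "H": "Happiness",
    "S": "Sadness/Pain",
}

# Assumed layout: <ID><code> <sentence> # <author>
# Whitespace between fields may be spaces or tabs.
LINE_RE = re.compile(r"^(\d+)\s*([ALFHS]+)\s+(.*?)\s*#\s*(.*)$")


class Entry(NamedTuple):
    uid: int
    emotions: tuple
    sentence: str
    author: str


def parse_line(line: str) -> Optional[Entry]:
    """Parse one corpus line; return None if it does not match the assumed layout."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    uid, code, sentence, author = match.groups()
    # A multi-letter code such as "LS" means the sentence belongs to several clusters.
    emotions = tuple(EMOTIONS[letter] for letter in code)
    return Entry(int(uid), emotions, sentence, author)


if __name__ == "__main__":
    sample = "7AF Intenta no ocupar tu vida en odiar y tener miedo. # Stendhal"
    print(parse_line(sample))
```

Returning `None` for lines that do not match, rather than raising an error, makes it easy to skip blank or header lines when iterating over a whole file.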
In the XML format, each field is enclosed in suitable XML tags.
The corpus is available single-annotated (more than 42,000 tokens; the number of sentences increases with each new version) and multi-annotated (several annotators and two voting strategies; the number of annotators may increase). Versions tagged with Freeling 4.1 are also available.
The LiSSS corpus, in XML and text formats (UTF-8 encoding, GNU/Linux end-of-line), is distributed under the LGPL license. New versions, with more literary sentences, phrases and annotations, will be added periodically.
Multi-annotated corpora (202 authors, 500 sentences; output/tagged Freeling 4.1; XML/text format)
Single-annotated corpora (output/tagged Freeling 4.1; XML/text format)
How to cite this corpus? If you use the LiSSS corpus, please cite:
@article{lisss:2020:torres+moreno,
author = {Juan-Manuel Torres-Moreno and Luis-Gil Moreno-Jim{\'{e}}nez},
title = {LiSSS: A toy corpus of Spanish Literary Sentences for Emotions detection},
journal = {arXiv},
volume = {2005.08223v1},
year = {2020}
}
@conference{lisss:2020:moreno+torres,
author = {Luis-Gil Moreno-Jim{\'{e}}nez and Juan-Manuel Torres-Moreno},
title = {LISS: A Corpus of Literary Spanish Sentimental Sentences for Emotions Detection},
booktitle = {CILCC},
pages = {63--68},
year = {2020}
}