Finished Theses
Bachelor's Theses
Master's Theses
Visual Analysis of Humor Assessment in Edited News Headlines
Status: Finished 2023 (thesis report) (LiU DiVA)
Grading: Master's Thesis for Two Students
Area: Information Visualization / Visual Analytics / Text Mining
Supervisor: Dr. Kostiantyn Kucher
Examiner: Prof. Dr. Andreas Kerren
Active Students: Johanna Folde and Elin Akkurt
Content and Tasks:
Identifying, predicting, and generating text with highly subjective and context-dependent properties are important and difficult challenges in computational linguistics. Sarcasm and irony detection are examples of natural language processing (NLP) tasks that are considered challenging, not least because human annotators struggle to agree on consistent annotations/labels for particular sentences or documents with respect to such elusive categories.
Computational approaches for identifying and analyzing humor in texts have also been a focus of the NLP research community, with the shared task (contest) titled "Assessing Humor in Edited News Headlines" recognized as the best task at SemEval-2020 [1]. The task provides data on English news headlines with minimal edits (e.g., single-word replacements) made in order to make the headlines humorous [2-4]. The actual level of humor/funniness of such edited headlines was assessed by multiple annotators on an ordinal scale [5-6]. The aim of the contest was to discover better-performing computational methods (e.g., machine learning approaches) that would predict the funniness level of an edited headline or rank two edited versions of a headline. While a number of solutions focusing on such regression and classification tasks were proposed [3], there are further questions to be asked and insights to be discovered within the data, for instance, how consistent the funniness level annotations are across topics, or to what extent the funniness scores vary across several related headlines.
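As a rough illustration of the questions above, the following minimal sketch summarizes the annotations of one edited headline and ranks two edited versions against each other. It assumes that the grades of a headline are available as a string with one digit per annotator; this format, the function names, and the use of the population standard deviation as a disagreement measure are assumptions made for illustration, not part of the thesis description.

```python
from statistics import mean, pstdev


def parse_grades(grades: str) -> list[int]:
    """Convert a digit string such as "3210" into [3, 2, 1, 0] (assumed format)."""
    return [int(ch) for ch in grades]


def summarize(grades: str) -> dict[str, float]:
    """Return the mean grade and the population standard deviation of the grades."""
    values = parse_grades(grades)
    return {"mean": mean(values), "spread": pstdev(values)}


def funnier(grades_a: str, grades_b: str) -> str:
    """Rank two edited versions of a headline by mean grade: "a", "b", or "tie"."""
    mean_a = summarize(grades_a)["mean"]
    mean_b = summarize(grades_b)["mean"]
    if mean_a == mean_b:
        return "tie"
    return "a" if mean_a > mean_b else "b"


if __name__ == "__main__":
    # Placeholder grade strings, not taken from the dataset.
    print(summarize("3210"))
    print(summarize("2222"))
    print(funnier("3210", "2222"))
```

A large spread combined with a middling mean hints at annotator disagreement rather than a mildly funny headline, which is the kind of distinction a visual interface could expose.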
To support answering such questions (rather than focusing on text classification or regression itself), a visual text analytics approach is required, and this is precisely the topic of this thesis project. Prior work on text visualization and visual text analytics [7–8] can guide the design process for this project; however, computational analysis of humor is not widely covered by such prior work. The very recent DeHumor approach by Wang et al. [9], for instance, focuses on in-depth multimodal analyses of comedy performances. To support interactive visual analyses of humor in edited news headlines, the project will have to address the challenges of representing and interacting with data in this text genre, with multiple annotations, and with further possible data facets (such as named entities and topics identified in headlines).
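One way such data facets could feed a visualization is to aggregate funniness scores per facet value before plotting. The sketch below groups edited headlines by a single facet label and reports how far the mean grades range within each group. The record layout, the facet labels, and the placeholder values are hypothetical; in practice the facets would come from a topic model or a named entity recognizer chosen during the project.

```python
from collections import defaultdict
from typing import NamedTuple


class EditedHeadline(NamedTuple):
    text: str
    facet: str         # hypothetical topic or named-entity label
    mean_grade: float  # mean funniness grade over all annotators


def score_range_by_facet(items: list[EditedHeadline]) -> dict[str, dict[str, float]]:
    """Group headlines by facet and report count, min, max, and range of mean grades."""
    groups: dict[str, list[float]] = defaultdict(list)
    for item in items:
        groups[item.facet].append(item.mean_grade)
    return {
        facet: {
            "count": len(grades),
            "min": min(grades),
            "max": max(grades),
            "range": max(grades) - min(grades),
        }
        for facet, grades in groups.items()
    }


# Placeholder records for illustration only, not taken from the dataset.
PLACEHOLDER_ITEMS = [
    EditedHeadline("placeholder headline 1", "politics", 0.5),
    EditedHeadline("placeholder headline 2", "politics", 2.0),
    EditedHeadline("placeholder headline 3", "sports", 1.0),
]

if __name__ == "__main__":
    for facet, stats in score_range_by_facet(PLACEHOLDER_ITEMS).items():
        print(facet, stats)
```

The resulting per-facet summaries could be serialized to JSON and passed to a D3.js or Plotly.js front end, for example as one range bar per topic.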
Figure: Humorous headline annotation interface by Hossain et al. [6].
Prerequisites:
- Programming skills in Python and JavaScript
- Basic knowledge in D3.js and/or Plotly.js
- Basic knowledge in information visualization
- Basic knowledge in natural language processing / text mining
References:
[1] International Workshop on Semantic Evaluation 2020. https://alt.qcri.org/semeval2020/
[2] SemEval-2020 Task 7: Assessing Humor in Edited News Headlines. https://competitions.codalab.org/competitions/20970
[3] Nabil Hossain, John Krumm, Michael Gamon, and Henry Kautz. 2020. SemEval-2020 Task 7: Assessing Humor in Edited News Headlines. In Proceedings of the Fourteenth Workshop on Semantic Evaluation, pages 746–758, Barcelona (online). International Committee for Computational Linguistics. https://doi.org/10.18653/v1/2020.semeval-1.98
[4] Headline Humor Dataset. https://www.cs.rochester.edu/u/nhossain/humicroedit.html
[5] Nabil Hossain, John Krumm, and Michael Gamon. 2019. “President Vows to Cut Hair”: Dataset and Analysis of Creative Text Editing for Humorous Headlines. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 133–142, Minneapolis, Minnesota. Association for Computational Linguistics. https://doi.org/10.18653/v1/N19-1012
[6] Nabil Hossain, John Krumm, Tanvir Sajed, and Henry Kautz. 2020. Stimulating Creativity with FunLines: A Case Study of Humor Generation in Headlines. arXiv preprint 2002.02031. https://doi.org/10.48550/arXiv.2002.02031
[7] Kostiantyn Kucher and Andreas Kerren. 2015. Text Visualization Techniques: Taxonomy, Visual Survey, and Community Insights. In Proceedings of the 8th IEEE Pacific Visualization Symposium (PacificVis '15), pages 117–121. https://doi.org/10.1109/PACIFICVIS.2015.7156366
[8] Mohammad Alharbi and Robert S. Laramee. 2019. SoS TextVis: An Extended Survey of Surveys on Text Visualization. Computers 8, no. 1, article 17. https://doi.org/10.3390/computers8010017
[9] Xingbo Wang, Yao Ming, Tongshuang Wu, Haipeng Zeng, Yong Wang, and Huamin Qu. 2022. DeHumor: Visual Analytics for Decomposing Humor. IEEE Transactions on Visualization and Computer Graphics, vol. 28, no. 12, pages 4609–4623. https://doi.org/10.1109/TVCG.2021.3097709