
SemEval-2025

The 19th International Workshop on Semantic Evaluation

SemEval-2025 Schedule

SemEval-2025 will be colocated with ACL 2025 at the Austria Center Vienna.

Online poster sessions will be held on Gather.

All in-person activities will be held in Hall M2. The oral sessions and the invited talk will also be streamed on Zoom.

Thursday, July 31st

09:00-09:30 Welcome and Introduction to SemEval: by Workshop Chairs

09:30-10:30 Invited Talk, Emily Allaway: Lessons in generics: how language models grapple with human generalisation

Talk information

The ability to generalise is a crucial aspect of human cognition, allowing us to derive broader understandings from specific instances. In language, generalised knowledge over particular instantiations and exceptions can be flexibly expressed through generics — generalisations without quantifiers. However, the flexibility of generics also comes with puzzling properties that have been extensively studied in areas such as linguistics and philosophy of language. This talk will explore the specific challenges that this language of generalisation poses for language models (LMs). I will begin by examining whether language models recognise the quantificational variation inherent in generics. Specifically, I will discuss how accurately LMs process and recognise the quantification in generic expressions, with a particular focus on the phenomenon of overgeneralisation — unwarranted universal quantification. One critical area of overgeneralisation is with stereotypes and I will touch on the implications for LMs of stereotypes that are expressed as generics. Next, I will present evaluations on the capacity of LMs to reason about generics and related examples, probing LMs’ ability to both maintain and override their generalisations. In the final part of the talk, I will expand the discussion to visual-language models (VLMs) to determine whether their struggles with generics mirror those of traditional LMs and what the broader implications of these findings might be.

Bio: Emily Allaway is a Chancellor’s Fellow at the University of Edinburgh in the School of Informatics, where she is affiliated with both Edinburgh NLP and the Institute for Language, Cognition and Computation (ILCC). Her research is on reasoning about and understanding implicit meaning in language, with a recent focus on generics and their role in reasoning. Emily received her PhD from Columbia University under the supervision of Kathleen McKeown. Her doctoral work there was supported by an NSF Graduate Research Fellowship. Her previous work includes research positions at the University of Washington, the Allen Institute for Artificial Intelligence, and Amazon Science. She is currently a chair for the WiNLP workshop.

10:30-11:00 Coffee break

11:00-12:30 Oral Session I: Tasks 1-6
  • 11:00-11:15 SemEval-2025 Task 1: ADMIRE: Advancing Multimodal Idiomaticity Representation
  • 11:15-11:30 SemEval-2025 Task 2: EA-MT: Entity-Aware Machine Translation
  • 11:30-11:45 SemEval-2025 Task 3: Mu-SHROOM, the Multilingual Shared-task on Hallucinations and Related Observable Overgeneration Mistakes
  • 11:45-12:00 SemEval-2025 Task 4: Unlearning sensitive content from Large Language Models
  • 12:00-12:15 SemEval-2025 Task 5: LLMs4Subjects: LLM-based Automated Subject Tagging for a National Technical Library's Open-Access Catalog
  • 12:15-12:30 SemEval-2025 Task 6: PromiseEval: Multinational, Multilingual, Multi-Industry Promise Verification

12:30-14:00 Lunch break

14:00-15:15 Oral Session II: Tasks 7-11
  • 14:00-14:15 SemEval-2025 Task 7: Multilingual and Crosslingual Fact-Checked Claim Retrieval
  • 14:15-14:30 SemEval-2025 Task 8: Question-Answering over Tabular Data
  • 14:30-14:45 SemEval-2025 Task 9: The Food Hazard Detection Challenge
  • 14:45-15:00 SemEval-2025 Task 10: Multilingual Characterization and Extraction of Narratives from Online News
  • 15:00-15:15 SemEval-2025 Task 11: Bridging the Gap in Text-Based Emotion Detection

15:15-15:30 Oral Session III: Best Task Awards Announcement

15:30-16:00 Coffee break

16:00-17:30 Poster Session I (in person only)
  • NotMyNarrative at SemEval-2025 Task 10: Do Narrative Features Share Across Languages in Multilingual Encoder Models?
  • UWBa at SemEval-2025 Task 7: Multilingual and Crosslingual Fact-Checked Claim Retrieval
  • adithjrajeev at SemEval-2025 Task 10: Sequential Learning for Role Classification Using Entity-Centric News Summaries
  • UZH at SemEval-2025 Task 3: Token-Level Self-Consistency for Hallucination Detection
  • UniBuc at SemEval-2025 Task 9: Similarity Approaches to Classification
  • XLM-Muriel at SemEval-2025 Task 11: Hard Parameter Sharing for Multi-lingual Multi-label Emotion Detection
  • Chinchunmei at SemEval-2025 Task 11: Boosting the Large Language Model’s Capability of Emotion Perception using Contrastive Learning
  • UIMP-Aaman at SemEval-2025 Task 11: Detecting Intensity and Emotion in Social Media and News
  • BERTastic at SemEval-2025 Task 10: State-of-the-Art Accuracy in Coarse-Grained Entity Framing for Hindi News
  • LTG at SemEval-2025 Task 10: Optimizing Context for Classification of Narrative Roles
  • CCNU at SemEval-2025 Task 3: Leveraging Internal and External Knowledge of Large Language Models for Multilingual Hallucination Annotation
  • MRT at SemEval-2025 Task 8: Maximizing Recovery from Tables with Multiple Steps
  • RACAI at SemEval-2025 Task 7: Efficient adaptation of Large Language Models for Multilingual and Crosslingual Fact-Checked Claim Retrieval
  • ZJUKLAB at SemEval-2025 Task 4: Unlearning via Model Merging
  • GPLSICORTEX at SemEval-2025 Task 10: Leveraging Intentions for Generating Narrative Extractions
  • Hallucination Detectives at SemEval-2025 Task 3: Span-Level Hallucination Detection for LLM-Generated Answers
  • BitsAndBites at SemEval-2025 Task 9: Improving Food Hazard Detection with Sequential Multitask Learning and Large Language Models
  • G-MACT at SemEval-2025 Task 8: Exploring Planning and Tool Use in Question Answering over Tabular Data
  • MyMy at SemEval-2025 Task 9: A Robust Knowledge-Augmented Data Approach for Reliable Food Hazard Detection
  • CCNU at SemEval-2025 Task 8: Enhancing Question Answering on Tabular Data with Two-Stage Corrections
  • SALT 🧂 at SemEval-2025 Task 2: A SQL-based Approach for LLM-Free Entity-Aware-Translation
  • QMUL at SemEval-2025 Task 11: Explicit Emotion Detection with EmoLex, Feature Engineering, and Threshold-Optimized Multi-Label Classification
  • Team INSAntive at SemEval-2025 Task 10: Hierarchical Text Classification using BERT
  • SmurfCat at SemEval-2025 Task 3: Bridging External Knowledge and Model Uncertainty for Enhanced Hallucination Detection
  • ipezoTU at SemEval-2025 Task 7: Hybrid Ensemble Retrieval for Multilingual Fact-Checking
  • ATLANTIS at SemEval-2025 Task 3: Detecting Hallucinated Text Spans in Question Answering
  • Samsung Research Poland at SemEval-2025 Task 8: LLM ensemble methods for QA over tabular data
  • LyS at SemEval-2025 Task 8: Zero-Shot Code Generation for Tabular QA
  • Dataground at SemEval-2025 Task 8: Small LLMs and Preference Optimization for Tabular QA
  • AILS-NTUA at SemEval-2025 Task 8: Language-to-Code prompting and Error Fixing for Tabular Question Answering
  • ITUNLP at SemEval-2025 Task 8: Question-Answering over Tabular Data: A Zero-Shot Approach using LLM-Driven Code Generation

Friday, August 1st

09:00-10:30 Poster Session II: System Description Papers (Online: Gather)
  • VerbaNexAI at SemEval-2025 Task 9: Advances and Challenges in the Automatic Detection of Food Hazards
  • CTYUN-AI at SemEval-2025 Task 1: Learning to Rank for Idiomatic Expressions
  • Anastasia at SemEval-2025 Task 9: Subtask 1, Ensemble Learning with Data Augmentation and Focal Loss for Food Risk Classification
  • UNEDTeam at SemEval-2025 Task 10: Zero-Shot Narrative Classification
  • DUTtask10 at SemEval-2025 Task 10: ThoughtFlow: Hierarchical Narrative Classification via Stepwise Prompting
  • TechSSN3 at SemEval-2025 Task 9: Food Hazard and Product Detection - Category Identification and Vector Prediction
  • CYUT at SemEval-2025 Task 6: Prompting with Precision – ESG Analysis via Structured Prompts
  • Zuifeng at SemEval-2025 Task 9: Multitask Learning with Fine-Tuned RoBERTa for Food Hazard Detection
  • FII the Best at SemEval-2025 Task 2: Steering State-of-the-art Machine Translation Models with Strategically Engineered Pipelines for Enhanced Entity Translation
  • madhans476 at SemEval-2025 Task 9: Multi-Model Ensemble and Prompt-Based Learning for Food Hazard Prediction
  • Sakura at SemEval-2025 Task 2: Enhancing Named Entity Translation with Fine-Tuning and Preference Optimization
  • Anaselka at SemEval-2025 Task 9: Leveraging SVM and MNB for Detecting Food Hazard
  • QUST_NLP at SemEval-2025 Task 7: A Three-Stage Retrieval Framework for Monolingual and Crosslingual Fact-Checked Claim Retrieval
  • Trans-Sent at SemEval-2025 Task 11: Text-based Multi-label Emotion Detection using Pre-Trained BERT Transformer Models
  • Team INSALyon2 at SemEval-2025 Task 10: A Zero-shot Agentic Approach to Text Classification
  • SRCB at SemEval-2025 Task 9: LLM Finetuning Approach based on External Attention Mechanism in The Food Hazard Detection
  • Team QUST at SemEval-2025 Task 10: Evaluating Large Language Models in Multiclass Multi-label Classification of News Entity Framing
  • Advacheck at SemEval-2025 Task 3: Combining NER and RAG to Spot Hallucinations in LLM Answers
  • VerbaNexAI at SemEval-2025 Task 2: Enhancing Entity-Aware Translation with Wikidata-Enriched MarianMT
  • CSECU-Learners at SemEval-2025 Task 9: Enhancing Transformer Model for Explainable Food Hazard Detection in Text
  • AILS-NTUA at SemEval-2025 Task 3: Leveraging Large Language Models and Translation Strategies for Multilingual Hallucination Detection
  • UPC-HLE at SemEval-2025 Task 7: Multilingual Fact-Checked Claim Retrieval with Text Embedding Models and Cross-Encoder Re-Ranking
  • CSECU-Learners at SemEval-2025 Task 11: Multilingual Emotion Recognition and Intensity Prediction with Language-tuned Transformers and Multi-sample Dropout
  • Amado at SemEval-2025 Task 11: Multi-label Emotion Detection in Amharic and English Data
  • NarrativeNexus at SemEval-2025 Task 10: Entity Framing and Narrative Extraction using BART
  • DEMON at SemEval-2025 Task 10: Fine-tuning LLaMA-3 for Multilingual Entity Framing
  • JNLP at SemEval-2025 Task 1: Multimodal Idiomaticity Representation with Large Language Models
  • HU at SemEval-2025 Task 9: Leveraging LLM-Based Data Augmentation for Class Imbalance
  • NarrativeMiners at SemEval-2025 Task 10: Combating Manipulative Narratives in Online News
  • Habib University at SemEval-2025 Task 11: Bridging the Gap in Text-Based Emotion Detection
  • WC Team at SemEval-2025 Task 6: PromiseEval: Multinational, Multilingual, Multi-Industry Promise Verification leveraging monolingual and multilingual BERT models
  • CSIRO LT at SemEval-2025 Task 8: Answering Questions over Tabular Data using LLMs
  • Oath Breakers at SemEval-2025 Task 06: Leveraging DeBERTa and Contrastive Learning for Promise Verification
  • CLaC at SemEval-2025 Task 6: A Multi-Architecture Approach for Corporate Environmental Promise Verification
  • YNU-HPCC at SemEval-2025 Task 6: Using BERT Model with R-drop for Promise Verification
  • PATeam at SemEval-2025 Task 9: LLM-Augmented Fusion for AI-Driven Food Safety Hazard Detection
  • CSCU at SemEval-2025 Task 6: Enhancing Promise Verification with Paraphrase and Synthesis Augmentation: Effects on Model Performance
  • UCSC at SemEval-2025 Task 3: Context, Models and Prompt Optimization for Automated Hallucination Detection in LLM Output
  • YNU-HPCC at SemEval-2025 Task 10: A Two-Stage Approach to Solving Multi-Label and Multi-Class Role Classification Based on DeBERTa
  • YNU at SemEval-2025 Task 4: Synthetic Token Alternative Training for LLM Unlearning
  • JU-CSE-NLP’25 at SemEval-2025 Task 4: Learning to Unlearn LLMs
  • pingan-team at SemEval-2025 Task 2: LoRA-Augmented Qwen2.5 with Wikidata-Driven Entity Translation
  • YNU-HPCC at SemEval-2025 Task 1: Enhancing Multimodal Idiomaticity Representation via LoRA and Hybrid Loss Optimization
  • JU_NLP at SemEval-2025 Task 7: Leveraging Transformer-Based Models for Multilingual & Crosslingual Fact-Checked Claim Retrieval
  • UCSC NLP T6 at SemEval-2025 Task 1: Leveraging LLMs and VLMs for Idiomatic Understanding
  • JUNLP_Sarika at SemEval-2025 Task 11: Bridging Contextual Gaps in Text-Based Emotion Detection using Transformer Models
  • YNUzwt at SemEval-2025 Task 10: Tree-guided Stagewise Classifier for Entity Framing and Narrative Classification
  • TECHSSN at SemEval-2025 Task 10: A Comparative Analysis of Transformer Models for Dominant Narrative-Based News Summarization
  • KyuHyunChoi at SemEval-2025 Task 10: Narrative Extraction Using a Summarization-Specific Pretrained Model
  • fact check AI at SemEval-2025 Task 7: Multilingual and Crosslingual Fact-checked Claim Retrieval
  • SHA256 at SemEval-2025 Task 4: Selective Amnesia – Constrained Unlearning for Large Language Models via Knowledge Isolation
  • Team ACK at SemEval-2025 Task 2: Beyond Word-for-Word Machine Translation for English-Korean Pairs
  • Jim at SemEval-2025 Task 5: Multilingual BERT Ensemble
  • Annif at SemEval-2025 Task 5: Traditional XMTC augmented by LLMs
  • RUC Team at SemEval-2025 Task 5: Fast Automated Subject Indexing via Similar Records Matching and Related Subject Ranking

10:30-11:00 Coffee break

11:00-12:00 Oral Session IV: Best System Paper Presentations. Winners of the best system awards will be notified in advance and announced publicly at the conference.

12:00-14:00 Lunch break

14:00-15:30 Poster Session III: System Description Papers (in person only)
  • Tuebingen at SemEval-2025 Task 10: Class Weighting, External Knowledge and Data Augmentation in BERT Models
  • NYCU-NLP at SemEval-2025 Task 11: Assembling Small Language Models for Multilabel Emotion Detection and Intensity Prediction
  • MALTO at SemEval-2025 Task 3: Detecting Hallucinations in LLMs via Uncertainty Quantification and Larger Model Validation
  • Habib University at SemEval-2025 Task 9: Using Ensemble Models for Food Hazard Detection
  • iShumei-Chinchunmei at SemEval-2025 Task 4: A balanced forgetting and retention multi-task framework using effective unlearning loss
  • Atyaephyra at SemEval-2025 Task 4: Low-Rank Negative Preference Optimization
  • COGNAC at SemEval-2025 Task 10: Multi-level Narrative Classification with Summarization and Hierarchical Prompting
  • SheffieldGATE at SemEval-2025 Task 2: Multi-Stage Reasoning with Knowledge Fusion for Entity Translation
  • Fossils at SemEval-2025 Task 9: Tasting Loss Functions for Food Hazard Detection in Text Reports
  • Ustnlp16 at SemEval-2025 Task 9: Improving Model Performance through Imbalance Handling and Focal Loss
  • GIL-IIMAS UNAM at SemEval-2025 Task 4: LA-Min(E): LLM Unlearning Approaches Under Function Minimizing Evaluation Constraints
  • CIC-IPN at SemEval-2025 Task 11: Transformer-Based Approach to Multi-Class Emotion Detection
  • Mr. Snuffleupagus at SemEval-2025 Task 4: Unlearning Factual Knowledge from LLMs Using Adaptive RMU
  • NarrativeMiners at SemEval-2025 Task 10: Combating Manipulative Narratives in Online News
  • TUM-MiKaNi at SemEval-2025 Task 3: Towards Multilingual and Knowledge-Aware Non-factual Hallucination Identification
  • UAlberta at SemEval-2025 Task 2: Prompting and Ensembling for Entity-Aware Translation
  • Oath Breakers at SemEval-2025 Task 06: Leveraging DeBERTa and Contrastive Learning for Promise Verification
  • MALTO at SemEval-2025 Task 4: Dual Teachers for Unlearning Sensitive Content in LLMs
  • TueCL at SemEval-2025 Task 1: AdMIRe: Advancing Multimodal Idiomaticity Representation
  • Wikidata-Driven Entity-Aware Translation: Boosting LLMs with External Knowledge
  • UCSC at SemEval-2025 Task 3: Context, Models and Prompt Optimization for Automated Hallucination Detection in LLM Output
  • COGUMELO at SemEval-2025 Task 3: A Synthetic Approach to Detecting Hallucinations in Language Models based on Named Entity Recognition
  • FactDebug at SemEval-2025 Task 7: Hybrid Retrieval Pipeline for Identifying Previously Fact-Checked Claims Across Multiple Languages
  • HiTZ-Ixa at SemEval-2025 Task 1: Multimodal Idiomatic Language Understanding
  • AIMA at SemEval-2025 Task 1: Bridging Vision and Language Modalities for Idiomatic Knowledge Extraction via Mixture of Experts
  • TIFIN India at SemEval-2025: Harnessing Translation to Overcome Multilingual IR Challenges in Fact-Checked Claim Retrieval
  • AKCIT at SemEval-2025 Task 11: Investigating Data Quality in Portuguese Emotion Recognition
  • RAGthoven at SemEval-2025 Task 2: Enhancing Entity-Aware Machine Translation with Large Language Models, Retrieval Augmented Generation and Function Calling
  • LA²I²F at SemEval-2025 Task 5: Reasoning in Embedding Space – Fusing Analogical and Ontology-based Reasoning for Document Subject Tagging
  • CAIDAS at SemEval-2025 Task 7: Enriching Sparse Datasets with LLM-Generated Content for Improved Information Retrieval
  • Swushroomsia at SemEval-2025 Task 3: Probing LLMs' Collective Intelligence for Multilingual Hallucination Detection
  • Tewodros at SemEval-2025 Task 11: Multilingual Emotion Intensity Detection using Small Language Models

15:30-16:00 Coffee break

16:00-16:30 Concluding Remarks and Introduction of New Tasks