Beyond Memorization: Assessing Semantic Generalization in Large Language Models Using Phrasal Constructions

Published in AACL, 2025

We present a novel natural language inference (NLI) dataset designed to test language models' knowledge of constructional semantics. We find that state-of-the-art LLMs succeed at the task when presented with common constructional exemplars, but changing the hypotheses to target less salient aspects of constructional semantics greatly harms performance.