Title: Language Models Are Poor Learners of Directional Inference
Authors: Li, Tianyi; Hosseini, Mohammad Javad; Weber, Sabine (ORCID: 0000-0002-5577-3356); Steedman, Mark
Year: 2022
Record date: 2024-08-12
Handle: https://fis.uni-bamberg.de/handle/uniba/97222
DOI: 10.18653/v1/2022.findings-emnlp.64
Type: conferenceobject
Language: eng
Keywords: Language Models

Abstract: We examine LMs’ competence of directional predicate entailments by supervised fine-tuning with prompts. Our analysis shows that contrary to their apparent success on standard NLI, LMs show limited ability to learn such directional inference; moreover, existing datasets fail to test directionality, and/or are infested by artefacts that can be learnt as proxy for entailments, yielding over-optimistic results. In response, we present BoOQA (Boolean Open QA), a robust multi-lingual evaluation benchmark for directional predicate entailments, extrinsic to existing training sets. On BoOQA, we establish baselines and show evidence of existing LM-prompting models being incompetent directional entailment learners, in contrast to entailment graphs, however limited by sparsity.