Ehsan Shareghi

Notes and Working Papers

A Short Tutorial on Variational Auto-Encoders
Ehsan Shareghi

GTS: Inference-Time Scaling of Latent Reasoning with a Learnable Gaussian Thought Sampler
Minghan Wang, Ye Bai, Thuy-Trang Vu, Ehsan Shareghi, Gholamreza Haffari
arXiv, 2026

TRIDENT: Benchmarking LLM Safety in Finance, Medicine, and Law
Zheng Hui, Yijiang River Dong, Ehsan Shareghi, Nigel Collier
arXiv, 2025

The Compressor-Retriever Architecture for Language Model OS
Yuan Yang, Siheng Xiong, Ehsan Shareghi, Faramarz Fekri
arXiv, 2024

Selected Papers

Uncertainty-Based Methods for Automated Process Reward Data Construction and Output Aggregation in Mathematical Reasoning
Jiuzhou Han, Wray Buntine, Ehsan Shareghi
Annual AAAI Conference on Artificial Intelligence (AAAI-26 Main Technical Track), 2026

Logical Reasoning with Outcome Reward Models for Test-Time Scaling
Ramya Keerthy Thatikonda, Wray Buntine, Ehsan Shareghi
Empirical Methods in Natural Language Processing (EMNLP), 2025

VerifiAgent: a Unified Verification Agent in Language Model Reasoning
Jiuzhou Han, Wray Buntine, Ehsan Shareghi
Empirical Methods in Natural Language Processing (Findings of EMNLP), 2025

Towards Uncertainty-Aware Language Agent
Jiuzhou Han, Wray Buntine, Ehsan Shareghi
Association for Computational Linguistics (Findings of ACL), 2024

FireAct: Toward Language Agent Finetuning
Baian Chen, Chang Shu, Ehsan Shareghi, Nigel Collier, Karthik R Narasimhan, Shunyu Yao
arXiv, 2023

PiVe: Prompting with Iterative Verification to Improve Graph-based Generative Capability of LLMs
Jiuzhou Han, Nigel Collier, Wray Buntine, Ehsan Shareghi
Association for Computational Linguistics (Findings of ACL), 2024

Can LLMs Reason in the Wild with Programs?
Yuan Yang, Siheng Xiong, Ali Payani, Ehsan Shareghi, Faramarz Fekri
Empirical Methods in Natural Language Processing (Findings of EMNLP), 2024

Harnessing the Power of Large Language Models for Natural Language to First-Order Logic Translation
Yuan Yang, Siheng Xiong, Ali Payani, Ehsan Shareghi, Faramarz Fekri
Association for Computational Linguistics (ACL), 2024

On the Effect of Isotropy on VAE Representations of Text
Lan Zhang, Wray Buntine, Ehsan Shareghi
Association for Computational Linguistics (ACL), 2022

Learning Sparse Sentence Encoding without Supervision
Victor Prokhorov, Yingzhen Li, Ehsan Shareghi, Nigel Collier
Workshop on Representation Learning for NLP (RepL4NLP), 2021

Compressed Nonparametric Language Modelling
Ehsan Shareghi, Reza Haffari, Trevor Cohn
International Joint Conference on Artificial Intelligence (IJCAI), 2017

Mixture-of-Partitions: Infusing Large Biomedical Knowledge Graphs into BERT
Zaiqiao Meng, Fangyu Liu, Thomas Hikaru Clark, Ehsan Shareghi, Nigel Collier
Empirical Methods in Natural Language Processing (EMNLP), 2021

Self-alignment Pre-training for Biomedical Entity Representations
Fangyu Liu, Ehsan Shareghi, Zaiqiao Meng, Marco Basaldella, Nigel Collier
North American Chapter of the Association for Computational Linguistics (NAACL), 2021

Reshaping Representation Space to Balance Safety and Over-rejection in Large Audio Language Models
Hao Yang, Lizhen Qu, Ehsan Shareghi, Gholamreza Haffari
Empirical Methods in Natural Language Processing (EMNLP), 2025

Audio Is the Achilles' Heel: Red Teaming Audio Large Multimodal Models
Hao Yang, Lizhen Qu, Ehsan Shareghi, Gholamreza Haffari
North American Chapter of the Association for Computational Linguistics (NAACL), 2025

Towards Probing Speech-Specific Risks in Large Multimodal Models: A Taxonomy, Benchmark, and Insights
Hao Yang, Lizhen Qu, Ehsan Shareghi, Gholamreza Haffari
Empirical Methods in Natural Language Processing (EMNLP), 2024

Self-supervised Rewiring of Pre-trained Speech Encoders: Towards Faster Fine-tuning with Less Labels in Speech Processing
Hao Yang, Jinming Zhao, Reza Haffari, Ehsan Shareghi
Empirical Methods in Natural Language Processing (Findings of EMNLP), 2022

All Papers

Could language models win the International Linguistics Olympiad?
Jamie Garnham, Ehsan Shareghi
Conference on Computational Natural Language Learning (CoNLL), 2026

Privacy-R1: Privacy-Aware Multi-LLM Agent Collaboration via Reinforcement Learning
Zheng Hui, Yijiang River Dong, Sanhanat Sivapiromrat, Ehsan Shareghi, Nigel Collier
Association for Computational Linguistics (ACL), 2026

Towards Inference-time Scaling for Continuous Space Reasoning
Minghan Wang, Thuy-Trang Vu, Ehsan Shareghi, Gholamreza Haffari
Association for Computational Linguistics (Findings of ACL), 2026

Legal Citation Prediction with LLMs: A Comparative Evaluation of Instruction Tuning, Retrieval, and Jurisdiction-Specific Pre-training on the AusLaw Citation Benchmark
Jiuzhou Han, Paul Burgess, Ehsan Shareghi
Artificial Intelligence and Law, 2026

Improving Symbolic Translation of Language Models for Logical Reasoning
Ramya Keerthy Thatikonda, Jiuzhou Han, Wray Buntine, Ehsan Shareghi
AAAI workshop NeusymBridge, 2026

Logical Reasoning with Outcome Reward Models for Test-Time Scaling
Ramya Keerthy Thatikonda, Wray Buntine, Ehsan Shareghi
Empirical Methods in Natural Language Processing (EMNLP), 2025

VerifiAgent: a Unified Verification Agent in Language Model Reasoning
Jiuzhou Han, Wray Buntine, Ehsan Shareghi
Empirical Methods in Natural Language Processing (Findings of EMNLP), 2025

Graph-Based Confidence Estimation for Large Language Model Reasoning
Caiqi Zhang, Chang Shu, Ehsan Shareghi, Nigel Collier
Empirical Methods in Natural Language Processing (EMNLP), 2025

Assessing the Sensitivity and Alignment of FOL Closeness Metrics
Ramya Keerthy Thatikonda, Wray Buntine, Ehsan Shareghi
Empirical Methods in Natural Language Processing (Findings of EMNLP), 2025

Discrete Minds in a Continuous World: Do Language Models Know Time Passes?
Minghan Wang, Ye Bai, Thuy-Trang Vu, Ehsan Shareghi, Gholamreza Haffari
Empirical Methods in Natural Language Processing (Findings of EMNLP), 2025

Jigsaw Puzzles: Splitting Harmful Questions to Jailbreak Large Language Models
Hao Yang, Lizhen Qu, Ehsan Shareghi, Gholamreza Haffari
Conference on Language Modelling (COLM), 2025

Measuring, Evaluating and Improving Logical Consistency in Large Language Models
Yinhong Liu, Zhijiang Guo, Tianya Liang, Ehsan Shareghi, Ivan Vulić, Nigel Collier
International Conference on Machine Learning (ICML), 2025

SpeechDialogueFactory: Generating High-Quality Speech Dialogue Data to Accelerate Your Speech-LLM Development
Minghan Wang, Ye Bai, Yuxia Wang, Thuy-Trang Vu, Ehsan Shareghi, Gholamreza Haffari
International Speech Communication Association (Interspeech), 2025

Not Explainable but Verifiable: Alternative First Steps in Overcoming the Problems Associated with AI’s Answers to Legal Problems
Paul Burgess, Ehsan Shareghi
Oxford Intersections: AI in Society, 2024

Can LLMs Reason in the Wild with Programs?
Yuan Yang, Siheng Xiong, Ali Payani, Ehsan Shareghi, Faramarz Fekri
Empirical Methods in Natural Language Processing (Findings of EMNLP), 2024

Exploring the Potential of Multimodal LLM with Knowledge-Intensive Multimodal ASR
Minghan Wang, Yuxia Wang, Thuy-Trang Vu, Ehsan Shareghi, Gholamreza Haffari
Empirical Methods in Natural Language Processing (Findings of EMNLP), 2024

A Closer Look at Logical Reasoning with LLMs: The Choice of Tool Matters
Long Hei Matthew Lam, Ramya Keerthy Thatikonda, Ehsan Shareghi
The 22nd Annual Workshop of the Australasian Language Technology Association (ALTA), 2024 🏆✨ Outstanding Paper Award.

Conversational SIMULMT: Efficient Simultaneous Translation with Large Language Models
Minghan Wang, Thuy-Trang Vu, Ehsan Shareghi, Gholamreza Haffari
The 22nd Annual Workshop of the Australasian Language Technology Association (ALTA), 2024

Aligning with Human Judgement: The Role of Pairwise Preference in Large Language Model Evaluators
Yinhong Liu, Han Zhou, Zhijiang Guo, Ehsan Shareghi, Ivan Vulić, Anna Korhonen, Nigel Collier
Conference on Language Modelling (COLM), 2024

Towards Uncertainty-Aware Language Agent
Jiuzhou Han, Wray Buntine, Ehsan Shareghi
Association for Computational Linguistics (Findings of ACL), 2024

Unlocking Structure Measuring: Introducing PDD, an Automatic Metric for Positional Discourse Coherence
Yinhong Liu, Yixuan Su, Ehsan Shareghi, Nigel Collier
North American Chapter of the Association for Computational Linguistics (NAACL), 2024

Equipping Language Models with Tool Use Capability for Tabular Data Analysis in Finance
Adrian Theuma, Ehsan Shareghi
European Chapter of the Association for Computational Linguistics (EACL), 2024

Reward Engineering for Generating Semi-structured Explanation
Jiuzhou Han, Wray Buntine, Ehsan Shareghi
European Chapter of the Association for Computational Linguistics (Findings of EACL), 2024

FireAct: Toward Language Agent Finetuning
Baian Chen, Chang Shu, Ehsan Shareghi, Nigel Collier, Karthik R Narasimhan, Shunyu Yao
arXiv, 2023

POSQA: Probe the World Models of LLMs with Size Comparisons
Chang Shu^*, Jiuzhou Han^*, Fangyu Liu, Ehsan Shareghi, Nigel Collier
Empirical Methods in Natural Language Processing (Findings of EMNLP), 2023

Koala: An Index for Quantifying Overlaps with Pre-training Corpora
Thuy-Trang Vu, Xuanli He, Reza Haffari, Ehsan Shareghi
Empirical Methods in Natural Language Processing (EMNLP-Demonstration), 2023

A Minimal Approach for Natural Language Action Space in Text-based Games
Dongwon Kelvin Ryu, Meng Fang, Reza Haffari, Shirui Pan, Ehsan Shareghi
Conference on Computational Natural Language Learning (CoNLL), 2023

Investigating Pre-trained Audio Encoders in the Low-Resource Condition
Hao Yang, Jinming Zhao, Reza Haffari, Ehsan Shareghi
International Speech Communication Association (Interspeech), 2023

On Reality and the Limits of Language Data: Aligning LLMs with Human Norms
Nigel Collier, Fangyu Liu, Ehsan Shareghi
Cognitive Science Society (CogSci), 2023

Generating Synthetic Speech from SpokenVocab for Speech Translation
Jinming Zhao, Reza Haffari, Ehsan Shareghi
European Chapter of the Association for Computational Linguistics (Findings of EACL), 2023

Self-supervised Graph Masking Pre-training for Graph-to-Text Generation
Jiuzhou Han, Ehsan Shareghi
Empirical Methods in Natural Language Processing (EMNLP), 2022

RedApt: An Adaptor for wav2vec 2 Encoding - Faster and Smaller Speech Translation without Quality Compromise
Jinming Zhao, Hao Yang, Reza Haffari, Ehsan Shareghi
Empirical Methods in Natural Language Processing (Findings of EMNLP), 2022

Self-supervised Rewiring of Pre-trained Speech Encoders: Towards Faster Fine-tuning with Less Labels in Speech Processing
Hao Yang^*, Jinming Zhao^*, Reza Haffari, Ehsan Shareghi
Empirical Methods in Natural Language Processing (Findings of EMNLP), 2022

Plug-and-Play Recipe Generation with Content Planning
Yinhong Liu, Yixuan Su, Ehsan Shareghi, Nigel Collier
Workshop on Generation, Evaluation & Metrics (GEM), 2022

M-Adapter: Modality Adaptation for End-to-End Speech-to-Text Translation
Jinming Zhao, Hao Yang, Reza Haffari, Ehsan Shareghi
International Speech Communication Association (Interspeech), 2022

TaCL: Improving BERT Pre-training with Token-aware Contrastive Learning
Yixuan Su, Fangyu Liu, Zaiqiao Meng, Tian Lan, Lei Shu, Ehsan Shareghi, Nigel Collier
North American Chapter of the Association for Computational Linguistics (Findings of NAACL), 2022

On the Effect of Isotropy on VAE Representations of Text
Lan Zhang, Wray Buntine, Ehsan Shareghi
Association for Computational Linguistics (ACL), 2022

Fire Burns, Sword Cuts: Commonsense Inductive Bias for Exploration in Text-based Games
Dongwon Kelvin Ryu, Ehsan Shareghi, Meng Fang, Yunqiu Xu, Shirui Pan, Reza Haffari
Association for Computational Linguistics (ACL), 2022

Rewire-then-Probe: A Contrastive Recipe for Probing Biomedical Knowledge of Pre-trained Language Models
Zaiqiao Meng, Fangyu Liu, Ehsan Shareghi, Yixuan Su, Charlotte Collins, Nigel Collier
Association for Computational Linguistics (ACL), 2022

It Is Not As Good As You Think! Evaluating Simultaneous Machine Translation on Interpretation Data
Jinming Zhao, Philip Arthur, Reza Haffari, Trevor Cohn, Ehsan Shareghi
Empirical Methods in Natural Language Processing (EMNLP), 2021

Unsupervised Representation Disentanglement of Text
Lan Zhang, Victor Prokhorov, Ehsan Shareghi
Workshop on Representation Learning for NLP (RepL4NLP), 2021

Learning Sparse Sentence Encoding without Supervision
Victor Prokhorov, Yingzhen Li, Ehsan Shareghi, Nigel Collier
Workshop on Representation Learning for NLP (RepL4NLP), 2021

Integrating Transformers and Knowledge Graphs for Twitter Stance Detection
Thomas Hikaru Clark, Costanza Conforti, Fangyu Liu, Zaiqiao Meng, Ehsan Shareghi, Nigel Collier
Workshop on Noisy User-generated Text (W-NUT), 2021

A Closer Look at Few-Shot Crosslingual Transfer: The Choice of Shots Matters
Mengjie Zhao, Yi Zhu, Ehsan Shareghi, Ivan Vulić, Roi Reichart, Anna Korhonen, Hinrich Schütze
Association for Computational Linguistics (ACL), 2021

Combining Deep Generative Models and Multi-lingual Pretraining for Semi-supervised Document Classification
Yi Zhu, Ehsan Shareghi, Yingzhen Li, Roi Reichart, Anna Korhonen
European Chapter of the Association for Computational Linguistics (EACL), 2021

COMETA: A Corpus for Medical Entity Linking in the Social Media
Marco Basaldella, Fangyu Liu, Ehsan Shareghi, Nigel Collier
Empirical Methods in Natural Language Processing (EMNLP), 2020

Bayesian Learning for Neural Dependency Parsing
Ehsan Shareghi, Yingzhen Li, Yi Zhu, Roi Reichart, Anna Korhonen
North American Chapter of the Association for Computational Linguistics (NAACL), 2019

A Bit of Progress and Stronger n-gram Language Modeling Baselines
Ehsan Shareghi, Daniela Gerz, Ivan Vulić, Anna Korhonen
North American Chapter of the Association for Computational Linguistics (NAACL), 2019

On the Importance of the Kullback-Leibler Divergence Term in Variational Autoencoders for Text Generation
Victor Prokhorov, Ehsan Shareghi, Yingzhen Li, Mohammad Taher Pilehvar, Nigel Collier
Workshop on Neural Generation and Translation (WNGT), 2019

Compressed Nonparametric Language Modelling
Ehsan Shareghi, Reza Haffari, Trevor Cohn
International Joint Conference on Artificial Intelligence (IJCAI), 2017

Richer Interpolative Smoothing Based on Modified Kneser-Ney Language Modeling
Ehsan Shareghi, Reza Haffari, Trevor Cohn
Empirical Methods in Natural Language Processing (EMNLP), 2016

Fast, Small and Exact: Infinite-order Language Modelling with Compressed Suffix Trees
Ehsan Shareghi, Matthias Petri, Reza Haffari, Trevor Cohn
Transactions of the Association for Computational Linguistics (TACL), 2016

Compact, Efficient and Unlimited Capacity: Language Modeling with Compressed Suffix Trees
Ehsan Shareghi, Matthias Petri, Reza Haffari, Trevor Cohn
Empirical Methods in Natural Language Processing (EMNLP), 2015

Structured Prediction of Sequences and Trees using Infinite Contexts
Ehsan Shareghi, Reza Haffari, Trevor Cohn, Ann Nicholson
European Conference on Machine Learning (ECML), 2015