Harrison, D. (CS) – Multi-Level Control in Neural Dialogue Generation: Style, Semantics, and Selection through Over-Generation and Ranking

March 9 @ 2:00 pm - 4:00 pm

End-to-end neural generation models have largely displaced the modular architectures that once gave dialogue system designers explicit control over what is said and how it is said. While these models produce fluent text, they collapse content planning, sentence planning, and surface realization into a single undifferentiated decoding step, sacrificing the controllable structure that earlier systems provided. This dissertation investigates how that structure can be recovered through the over-generate-and-rank (OGR) paradigm: generating multiple candidate outputs and selecting among them using learned or prompt-based ranking functions that jointly optimize semantic fidelity, stylistic appropriateness, and conversational coherence. We instantiate OGR at three levels of natural language generation for dialogue: utterance-level stylistic control, cross-domain semantic evaluation, and dialogue-level response selection.
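The over-generate-and-rank idea described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the dissertation's implementation: the candidate list stands in for a neural generator's outputs, `slot_error_rate` counts only missing slot values via string matching, `STYLE_MARKERS` is a hypothetical stand-in for pragmatic features, and the weighting in `rank` is arbitrary.

```python
# Toy over-generate-and-rank: score several candidate utterances and keep the best.
# All names, markers, and weights here are hypothetical illustrations.

# Hypothetical surface cues standing in for pragmatic style features.
STYLE_MARKERS = ("!", "?", "i think", "sort of")


def slot_error_rate(text, slots):
    """Fraction of required slot values missing from the text (simplified:
    deletions only, detected by case-insensitive substring match)."""
    if not slots:
        return 0.0
    lowered = text.lower()
    missing = sum(1 for value in slots.values() if value.lower() not in lowered)
    return missing / len(slots)


def style_score(text):
    """Fraction of the hypothetical style markers present in the text."""
    lowered = text.lower()
    return sum(1 for marker in STYLE_MARKERS if marker in lowered) / len(STYLE_MARKERS)


def rank(candidates, slots, semantic_weight=0.8):
    """Sort candidates by a weighted mix of semantic fidelity and style."""
    def score(text):
        fidelity = 1.0 - slot_error_rate(text, slots)
        return semantic_weight * fidelity + (1.0 - semantic_weight) * style_score(text)
    return sorted(candidates, key=score, reverse=True)


# Stand-in for multiple outputs sampled from a generator.
candidates = [
    "Nonna's serves Italian food.",
    "Oh, Nonna's is great! It serves Italian food in the city centre.",
    "There is a place in the city centre!",
]
slots = {"name": "Nonna's", "food": "Italian", "area": "city centre"}

best = rank(candidates, slots)[0]
print(best)
```

In this toy setup the second candidate ranks first because it realizes every slot and also carries a style marker. A learned or prompt-based ranker, as the abstract describes, would replace the hand-written `score` function.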

First, we show that explicit conditioning mechanisms, specifically decoder-level side constraints for personality variation and discourse contrast, re-introduce stylistic control into neural sequence-to-sequence models without compromising semantic accuracy. Second, we demonstrate that prompt-based learning with structured linguistic profiles achieves near-perfect personality accuracy and effectively zero slot error rate when combined with ranking, establishing that LLM prompting with explicit pragmatic specifications can match or exceed fine-tuning for personality-conditioned generation. Third, we develop a cross-domain semantic error rate evaluation framework that frames slot error computation as an extraction task, using a LoRA-adapted language model to extract meaning representations from generated text and a trained ranker to select among candidate extractions, achieving reliable evaluation across 23 topic domains without domain-specific rules. Fourth, we build and evaluate a speaker-aware transformer response ranker for Athena, our Alexa Prize socialbot, demonstrating that learned ranking over heterogeneous generator pools produces significantly longer conversations and higher user ratings than heuristic rule-based selection in a live A/B study with over 6,000 conversations.

A unifying finding emerges across all four contributions: the pragmatic features that control personality style in generation—acknowledgements, engagement questions, hedges, exclamations—are the same features that distinguish high-quality from mediocre responses in open-domain dialogue. This parallel reveals that stylistic control and response ranking are complementary mechanisms for achieving the same goal: making dialogue systems sound more natural and engaging. Together, these results support the dissertation’s central hypothesis that over-generate-and-rank provides a general, extensible mechanism for controllable neural language generation, restoring explicit decision points where competing communicative objectives can be weighed. The ranking function serves a role analogous to the sentence planner in classical NLG architectures, but operates on the outputs of modern neural and LLM-based generators.

Event Host: Davan Harrison, Ph.D. Candidate, Computer Science

Advisor: Marilyn Walker

Room Number: TBD