
Virtual Event

Wang, Y. (CSE) – Toward Practical and Effective Large Language Model Unlearning

December 8 @ 2:00 pm

The growing integration of Large Language Models (LLMs) into real-world applications has heightened concerns about their trustworthiness, as models may reveal private information, reproduce copyrighted content, propagate biases, or generate harmful instructions. These risks, alongside emerging privacy regulations, motivate the need for LLM unlearning: methods that remove the influence of specific data while preserving overall model capability.
This proposal investigates how to design practical and effective unlearning methods that enable LLMs to produce reliable and responsible outputs. We study both training-free and training-based paradigms. On the training-free side, we introduce ECO, which achieves unlearning via embedding-corrupted prompts detected by a lightweight classifier, and DRAGON, a generalizable black-box framework that combines detection with chain-of-thought guard reasoning for safe in-context intervention. On the training-based side, we present FLAT, a forget-data-only loss adjustment method grounded in a variational $f$-divergence formulation.
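For background on the training-based method, the variational characterization of an $f$-divergence (due to Nguyen, Wainwright, and Jordan) is the standard starting point for "variational $f$-divergence" losses; the exact objective FLAT optimizes may differ, so the following is general context rather than the method itself. For a convex function $f$ with $f(1) = 0$ and convex conjugate $f^*$,

$$
D_f(P \,\|\, Q) \;=\; \sup_{T} \; \mathbb{E}_{x \sim P}\big[T(x)\big] \;-\; \mathbb{E}_{x \sim Q}\big[f^*\big(T(x)\big)\big],
$$

where the supremum ranges over measurable functions $T$. Objectives of this form allow a divergence to be estimated and optimized from samples alone, which is what makes a forget-data-only loss adjustment plausible.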
Together, these approaches provide complementary strategies for aligning LLM behavior with safety and regulatory requirements while maintaining general utility. This proposal outlines their motivation, design, empirical performance, and the broader research plan toward responsible and accountable LLM systems.

Host: Yaxuan Wang, Ph.D. Student, Computer Science and Engineering 

Advisor: Yang Liu

Zoom: https://ucsc.zoom.us/j/94186242839?pwd=ubGMNF25W8gABNIl2S7EaIBHEXletV.1

Passcode: 786334

Details

Date: December 8
Time: 2:00 pm – 3:00 pm

Last modified: Dec 05, 2025