11:00 am – 12:15 pm CSE Colloquium – Evaluating AI Agents in the Real World: Lessons from Two Benchmarks Free