CS 333: Algorithms for Interactive Robotics

Winter 2022, Class: Mon, Wed 3:15-4:45PM, 380-380W

Description:

As robotics and AI advance rapidly, one critical and challenging problem is ensuring that robots and AI agents can co-exist, coordinate, collaborate, and interact with humans. This course covers a diverse set of topics, including learning from (suboptimal) human data, coordination in repeated interactions, influencing interactions, intent inference, and shared autonomy. There will also be several guest lectures from experts in the field. Students will practice essential research skills, including critiquing papers, debating, reviewing, writing project proposals, and presenting ideas effectively. The course is open to graduate and undergraduate students.

Format:

The course is a combination of lecture and reading sessions. The lectures cover the fundamentals required for modeling and designing interactive agents. During the reading sessions, students present and discuss recent contributions in this area. Throughout the quarter, each student works on a related research project, which they present at the end of the quarter. See detailed course policies.

Prerequisites:

CS 221/229, CS 237a, and CS 238/CS 234 are recommended, but not required.

Learning Objectives:

At the end of this course, you will understand how the topics covered (such as learning from human data, intent inference, and shared autonomy) apply to the design of interactive autonomous systems.

You will also have hands-on experience working on a research project, and you are expected to gain the following research skills: analyzing the literature on a particular topic, critiquing papers, and presenting research ideas.

Announcements:

COVID updates: The first two weeks of class (Jan 3 - Jan 14) will be virtual. To attend class and office hours, please use the Zoom links provided on Canvas.
Readings: Please sign up for 2 presenter slots (1 pro and 1 con) and 1 respondent slot using the link provided on Canvas. Please sign up by Wednesday, January 5th.
All lecture notes will be posted under the Files section on Canvas.
Long reviews will now be due at noon the day the paper is presented instead of at midnight before the class.
Please turn in your project proposal, milestones, and final report on Canvas by 11:59PM on the day each is due.
On Wednesday (2/2) Dorsa will be in room 380-380W for lecture. Anyone who is interested in attending in person should feel free to join :) Class will still be available virtually.
Late policy: we will deduct 25% of the total points for late assignments. Please remember to turn assignments in on time!

Staff

Dorsa Sadigh

Instructor

Office Hours: Mon 5-6pm by appointment

Location: Gates 246

Webpage

Minae Kwon

Course Assistant

Office Hours: Wed 5-7pm

Location: Gates 212

Virtual Location: Zoom link

Webpage

Timeline

Date	Lecture	Handouts / Deadlines	Notes
Week 1 Mon, Jan 03	Lecture Introduction Learning from Demonstrations (1)	Please check out our Course Policies. Please sign up for 2 presenter slots and 1 respondent slot using the link on Canvas by Wednesday Jan. 5th. Efficient Reductions for Imitation Learning. Ross & Bagnell. (2010) A Reduction from Apprenticeship Learning to Classification. Syed & Schapire. (2010) A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning. Ross et al. (2011) Search-based Structured Prediction. Daume III et al. (2009)	Sample Long Review
Week 1 Wed, Jan 05	Lecture Learning from Demonstrations (2)	Maximum Entropy IRL. Ziebart, et al. (2010) Maximum Margin Planning. Ratliff et al. (2006) Confidence-Aware Imitation Learning from Demonstrations with Varying Optimality. Zhang and Cao. (2021)
Week 2 Mon, Jan 10	Quiz+Reading Learning from Suboptimal Demonstrations, Offline RL	P1: Better-than-demonstrator Imitation Learning via Automatically-ranked Demonstrations. Brown, et al. (2020) P2: Should I Run Offline Reinforcement Learning or Behavioral Cloning? Conservative Q-Learning for Offline Reinforcement Learning. Kumar, et al. (2020) Learning from Suboptimal Demonstration via Self-Supervised Reward Regression. Chen et al. (2020)
Week 2 Wed, Jan 12	Lecture Learning from Preferences	Erdem Biyik, Stanford Active Preference-Based Learning of Reward Functions. Sadigh, et al. (2017) Asking Easy Questions: A User-Friendly Approach to Active Reward Learning. Biyik et al. (2019) Learning Reward Functions by Integrating Human Demonstrations and Preferences. Palan et al. (2019) Learning Multimodal Rewards from Rankings. Myers et al. (2021)
Week 3 Mon, Jan 17	Martin Luther King, Jr., Day (No class)
Week 3 Wed, Jan 19	Quiz+Reading Learning from Non-traditional Sources of Data	P1: Concept2Robot: Learning Manipulation Concepts from Instructions and Human Demonstrations. Shao, et al. (2020) P2: Learning Generalizable Robotic Reward Functions from “In-The-Wild” Human Videos. Chen, et al. (2021) Learning Robot Objectives from Physical Human Interaction. Bajcsy & Losey, et al. (2018) Learning Human Objectives from Sequences of Physical Corrections. Li et al. (2021)
Week 4 Mon, Jan 24	Lecture Experimental Design	Due Project Proposal Sample Consent Form Sample Debrief Form Guide to crowdsourcing experiments
Week 4 Wed, Jan 26	Quiz+Reading Learning from Suboptimal Humans	P1: Inverse Reward Design. Hadfield-Menell et al. (2017) P2: Where Do You Think You're Going?: Inferring Beliefs about Dynamics from Behavior. Reddy et al. (2018) Prospect theory: An analysis of decision under risk. Kahneman and Tversky. (1979) Theories of Bounded Rationality. Simon. (1972)
Week 5 Mon, Jan 31	Guest Lecture Human-Robot Interaction in Practice	Mengyuan Yan, Mohi Khansari, Karol Hausman, Everyday Robots, Google Brain
Week 5 Wed, Feb 02	Lecture Multi-Agent Interactions: Game-Theoretic and Representation Learning Techniques	Information Gathering Actions over Human Internal State. Sadigh et al. (2016) Planning for Autonomous Cars that Leverage Effects on Human Actions. Sadigh et al. (2016) Planning for Cars that Coordinate with People: Leveraging Effects on Human Actions for Planning and Active Information Gathering over Human Internal State. Sadigh et al. (2018) Learning from My Partner's Actions: Roles in Decentralized Robot Teams. Losey & Li et al. (2019) Learning with Opponent-Learning Awareness. Foerster et al. (2017)
Week 6 Mon, Feb 07	Quiz+Reading Multi-Agent Interactions: Game-Theoretic and Representation Learning Techniques	P1: QMIX. Rashid et al. (2018) P2: Cooperative Inverse Reinforcement Learning. Hadfield-Menell, et al. (2016) Learning with Opponent Learning Awareness. Foerster et al. (2018)
Week 6 Wed, Feb 09	Lecture Repeated Interactions and Games	On the Critical Role of Conventions in Adaptive Human-AI Collaboration. Shih et al. (2021) Formalizing Human-Robot Mutual Adaptation: A Bounded Memory Model. Nikolaidis, et al. (2014) Emergent Prosociality in Multi-Agent Games Through Gifting. Wang et al. (2021) Incentivizing Efficient Equilibria in Traffic Networks with Mixed Autonomy. Biyik et al. (2021)
Week 7 Mon, Feb 14	Quiz+Reading Multiagent Coordination	Due Project Milestone Review P1: Learning to Communicate with Deep Multi-Agent Reinforcement Learning. Foerster et al. (2016) P2: Off-Belief Learning. Hu et al. (2021) On the Utility of Learning about Humans for Human-AI Coordination. Carroll et al. (2020) Modeling Strong and Human-Like Gameplay with KL-Regularized Search. Jacob and Wu et al. (2021)
Week 7 Wed, Feb 16	Guest Lecture Multi-Agent Interactions	Jakob Foerster, University of Oxford
Week 8 Mon, Feb 21	Presidents' Day (No class)
Week 8 Wed, Feb 23	Lecture Shared Autonomy	A Policy Blending Formalism for Shared Control. Dragan et al. (2013) Shared Autonomy via Hindsight Optimization. Srinivasa et al. (2016) Shared Autonomy with Learned Latent Actions. Jeon et al. (2020)
Week 9 Mon, Feb 28	Quiz+Reading Shared Autonomy	P1: Shared Autonomy via Deep Reinforcement Learning. Reddy, et al. (2018) P2: LILA: Language-Informed Latent Actions. Karamcheti and Srivastava et al. (2021) Learning to share autonomy across repeated interaction. Jonnavittula and Losey. (2021)
Week 9 Wed, Mar 02	Guest Lecture The Role of Language in Building Interactive Robotics Agents	Thomas Kollar, Toyota Research Institute
Week 10 Mon, Mar 07	Presentation	Due Project Presentation
Week 10 Wed, Mar 09	Presentation	Due Project Presentation
Week 11 Thu, Mar 17	Project Reports	Due Deadline at midnight (Firm)

Grading Metrics

Component	Contribution to Grade
Student Presentations & Paper Reviews	40%
Final Project	40%
Quizzes & Class Participation	20%
Total	100%

Final Project Grading

Component	Contribution to Grade
Project Proposal Reports	5%
Project Milestone Reviews	5%
Project Presentation (Possibly with Demo)	10%
Final Project Report	20%
Total	40%

Grading Policies

Student Presentation & Paper Reviews (40%): All students will get a chance to present during the reading days. Each paper will have two presenters, one arguing the pros of the paper and one arguing the cons. The presenters need to send their (conference-style) reviews of their reading assignments by noon on the day the paper is presented. The presentation grade is based on how well the material is presented in both the written review and the talk, how well it is connected to the rest of the papers or class, and how prepared the student is to answer questions from the class.

We will allocate 35 minutes for each paper (10 minutes for the paper's introduction, 10 minutes for pros, 10 minutes for cons, and 5 minutes for discussion). Both pro and con presenters should work together on the introduction. Student respondents should ask their prepared question during the discussion.

Students who are not presenting are still required to write a short review of each of the two papers presented on a reading day, due by noon on the day the papers are presented. A short review should be just a couple of sentences summarizing each paper.

Final Project (40%): Each student is required to work individually or in groups of up to three people on a research project. The project requires a 1-page proposal including the relevant literature survey, a 2-page milestone review, a 5-8 page final report, and a final presentation/demo. All the page limits exclude references. Students who are taking the class for 4 units are required to work individually on their projects.

Quizzes & Class Participation (20%): There will be 6 short quizzes during the reading days on the lecture material and some of the paper readings. Students can drop 2 of the quizzes. We will also ask students to sign up as respondents and ask questions about the papers being presented. Respondents will be graded on their questions asked during the discussion period of each paper. All students should try to participate in the discussions on each paper during the reading days.
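The quiz-dropping rule above can be illustrated with a small sketch. It assumes that the 2 dropped quizzes are the 2 lowest scores and that the remaining quizzes are averaged with equal weight; neither detail is stated in the policy, and the function name `quiz_average` is hypothetical.

```python
def quiz_average(scores, num_dropped=2):
    """Average quiz score after dropping the lowest `num_dropped` scores.

    Assumption (not stated in the syllabus): the dropped quizzes are the
    lowest-scoring ones, and the kept quizzes are weighted equally.
    """
    if len(scores) <= num_dropped:
        raise ValueError("need more scores than dropped quizzes")
    kept = sorted(scores)[num_dropped:]  # discard the lowest scores
    return sum(kept) / len(kept)


# Six quizzes, each scored out of 10; a missed quiz counts as 0.
example_scores = [0.0, 6.0, 8.0, 9.0, 10.0, 7.0]
print(quiz_average(example_scores))  # 8.5
```

Under these assumptions, a student who misses one quiz and does poorly on another is graded only on the remaining four.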

Project Instructions

The research project should either be a literature survey or study a new research problem (e.g., design a new algorithm or study a new application). The main deliverables of the project are:

Project Proposal Reports (5%): A 1-page proposal that defines the problem, surveys the related literature, proposes a potential solution, and gives a timeline. Please turn it in on Canvas by 11:59PM on the day it is due.

Project Milestone Reviews (5%): A 2-page writeup that covers the progress so far, any needed changes to the goals, and the updated timeline. You should schedule a meeting with me or the CA to go over the milestone reviews. Please turn it in on Canvas by 11:59PM on the day it is due.

Project Presentation, Possibly with Demo (10%): A short (~15-min) presentation reporting the final findings of the project.

Final Project Report (20%): A 5-8 page project report. Please turn it in on Canvas by 11:59PM on the day it is due.

Late Policy

We will deduct 25% of the total points for all late assignments (including reviews, project proposals, project milestones, and the final project report).
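As a rough illustration of how the late deduction interacts with the final project weights listed above, here is a minimal sketch. It assumes the 25% is taken from the points possible for the late deliverable and that a score cannot fall below zero; the names `PROJECT_WEIGHTS`, `apply_late_policy`, and `project_contribution` are hypothetical and not part of the course policy.

```python
# Final project weights, in percentage points of the course grade
# (taken from the Final Project Grading table).
PROJECT_WEIGHTS = {
    "proposal": 5,
    "milestone": 5,
    "presentation": 10,
    "report": 20,
}
LATE_DEDUCTION = 0.25  # fraction of the total points deducted when late


def apply_late_policy(points_earned, points_possible, late):
    """Deduct 25% of the points possible if late (assumed floor of 0)."""
    if not late:
        return points_earned
    return max(0.0, points_earned - LATE_DEDUCTION * points_possible)


def project_contribution(fractions, late=()):
    """Course-grade points from the project, given per-deliverable scores.

    `fractions` maps each deliverable to the fraction of its points earned
    (0.0 to 1.0); `late` lists the deliverables that were turned in late.
    """
    total = 0.0
    for name, weight in PROJECT_WEIGHTS.items():
        earned = fractions[name] * weight
        total += apply_late_policy(earned, weight, name in late)
    return total


# Perfect marks everywhere, but the final report is late:
# the 20-point report loses 0.25 * 20 = 5 points.
perfect = {name: 1.0 for name in PROJECT_WEIGHTS}
print(project_contribution(perfect, late=("report",)))  # 35.0
```

Under these assumptions, a single late final report costs 5 of the 40 project points, which is more than the entire proposal is worth.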

This class is partially based on the following existing courses:
Algorithmic Human-Robot Interaction (Berkeley)
Cooperative Machines (MIT)
Human-Robot Interaction (Georgia Tech)
Computational Human-Robot Interaction (USC)