| | SE Basics | |
| 26-Aug-24 | Intro and Course Details | Saikat |
| 28-Aug-24 | Program Analysis 1 | Saikat |
| 2-Sep-24 | Labor Day: No class | |
| 4-Sep-24 | Program Analysis 2 | Saikat |
| 9-Sep-24 | Software Testing | Saikat |
| 11-Sep-24 | Debugging | Saikat |
| | LLM Basics | |
| 16-Sep-24 | ML Models: Intro | Saikat |
| 18-Sep-24 | LLMs for Code (CodeBERT/T5/CodeLlama) | Claas |
| | Primary: CodeBERT: A Pre-Trained Model for Programming and Natural Languages | |
| | Secondary: CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation | |
| | GraphCodeBERT: Pre-training Code Representations with Data Flow | |
| 23-Sep-24 | Fine-Tuning | Yuhao |
| | LoRA: Low-Rank Adaptation of Large Language Models | |
| | QLoRA: Efficient Finetuning of Quantized LLMs | |
| 25-Sep-24 | Evaluating LLMs | David |
| | Primary: Evaluating Large Language Models Trained on Code | |
| | Secondary: Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation | |
| | ClassEval: A Manually-Crafted Benchmark for Evaluating LLMs on Class-level Code Generation | |
| | CRUXEval: A Benchmark for Code Reasoning, Understanding and Execution | |
| 30-Sep-24 | Proposal Presentations | |
| | ML4SE | |
| 2-Oct-24 | Fuzzing with LLMs | Elaine |
| | Primary: CodaMosa: Escaping coverage plateaus in test generation with pre-trained large language models | |
| | Secondary: Automated Unit Test Improvement using Large Language Models at Meta | |
| | FuzzGuard: Filtering out unreachable inputs in directed grey-box fuzzing through deep learning | |
| | Can Large Language Models Write Good Property-Based Tests? | |
| 7-Oct-24 | Program Analysis with LLMs | Kassandra |
| | Primary: Enhancing Static Analysis for Practical Bug Detection: An LLM-Integrated Approach | |
| | Secondary: Large Language Models for Code Analysis: Do LLMs Really Do Their Job? | |
| 9-Oct-24 | Program Repair with LLMs and Agents | Arnav |
| | Primary: AutoCodeRover: Autonomous Program Improvement | |
| | Secondary: AGENTLESS: Demystifying LLM-based Software Engineering Agents | |
| | SWE-bench: Can language models resolve real-world GitHub issues? | |
| 14-Oct-24 | Fall break: No class | |
| 16-Oct-24 | Verification | Aditya |
| | Primary: Baldur: Whole-proof generation and repair with large language models | |
| | Secondary: Lemur: Integrating large language models in automated program verification | |
| 21-Oct-24 | Security | Ethan |
| | Primary: Large language models for code: Security hardening and adversarial testing | |
| | Secondary: NoFunEval: Funny How Code LMs Falter on Requirements Beyond Functional Correctness | |
| 23-Oct-24 | Guest Lecture: Prof. Pengyu Nie (University of Waterloo) | |
| | Primary: Learning Deep Semantics for Test Completion | |
| | Secondary: Generating Exceptional Behavior Tests with Reasoning Augmented Large Language Models | |
| | SE4ML | |
| 28-Oct-24 | Test Oracle Generation | Pengyue |
| | Primary: TOGA: A neural method for test oracle generation | |
| | Secondary: On learning meaningful assert statements for unit test cases | |
| 30-Oct-24 | Code Generation | Kevin Cui |
| | Primary: Monitor-guided decoding of code LMs with static analysis of repository context | |
| | SynCode: LLM Generation with Grammar Augmentation | |
| | Secondary: CodePlan: Repository-level coding using LLMs and planning | |
| 4-Nov-24 | Detecting Numerical Errors | Kevin Guan |
| | Primary: Reliability Assurance for Deep Neural Network Architectures Against Numerical Defects | |
| | Secondary: Discovering Discrepancies in Numerical Libraries | |
| | Exposing numerical bugs in deep learning via gradient back-propagation | |
| 6-Nov-24 | Testing DL Systems | Ibrahim |
| | DeepXplore: Automated whitebox testing of deep learning systems | |
| | DLFuzz: Differential fuzzing testing of deep learning systems | |
| 11-Nov-24 | Guest Lecture: Chenyuan Yang (University of Illinois Urbana-Champaign) | |
| | Large Language Models are Zero-Shot Fuzzers: Fuzzing Deep-Learning Libraries via Large Language Models | |
| | WhiteFox: White-box Compiler Fuzzing Empowered by Large Language Models | |
| 13-Nov-24 | Testing DNNs | Shinhae |
| | Primary: DeepHunter: A coverage-guided fuzz testing framework for deep neural networks | |
| | Secondary: TensorFuzz: Debugging neural networks with coverage-guided fuzzing | |
| | Concolic testing for deep neural networks | |
| | Symbolic execution for attribution and attack synthesis in neural networks | |
| 18-Nov-24 | DNN model versioning and management | Ika |
| | Git-Theta: A Git extension for collaborative development of machine learning models | |
| | MGit: A Model Versioning and Management System | |
| 20-Nov-24 | MLOps | Mudit |
| | Primary: Hidden technical debt in machine learning systems | |
| | Secondary: “We Have No Idea How Models will Behave in Production until Production”: How Engineers Operationalize Machine Learning | |
| | An Analysis of MLOps Architectures: A Systematic Mapping Study | |
| 25-Nov-24 | Testing DL Libraries | Charles |
| | Primary: Deep Learning Library Testing via Effective Model Generation | |
| | Secondary: NeuRI: Diversifying DNN Generation via Inductive Rule Inference | |
| | DocTer: Documentation-guided fuzzing for testing deep learning API functions | |
| | DLLens: Testing Deep Learning Libraries via LLM-aided Synthesis | |
| 27-Nov-24 | Thanksgiving break: No class | |
| 2-Dec-24 | Project Presentations | |
| 4-Dec-24 | Project Presentations | |
| 9-Dec-24 | Project Presentations | |