| SE Basics | |
25-Aug-25 | Intro and Course Details | Saikat |
27-Aug-25 | Program Analysis 1 | Saikat |
1-Sep-25 | Labor Day NO CLASS | |
3-Sep-25 | Program Analysis 2 | Saikat |
8-Sep-25 | Software Testing | Saikat |
10-Sep-25 | Debugging | Saikat |
| LLM Basics | |
15-Sep-25 | ML Models: Intro | Saikat |
17-Sep-25 | LLMs for Code (CodeBERT/T5/Code Llama) | Project Proposal Due |
| Primary: CodeBERT: A Pre-Trained Model for Programming and Natural Languages | |
| Secondary: AST-T5: Structure-Aware Pretraining for Code Generation and Understanding | |
| CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation | |
| Code Llama: Open Foundation Models for Code | |
22-Sep-25 | Post-Training LLM Adaptation | |
| Primary: Training language models to follow instructions with human feedback | |
| Secondary: Direct Preference Optimization: Your Language Model is Secretly a Reward Model | |
| SelfCodeAlign: Self-Alignment for Code Generation | |
24-Sep-25 | Fine-Tuning | |
| LoRA: Low-Rank Adaptation of Large Language Models | |
| QLoRA: Efficient Finetuning of Quantized LLMs | |
29-Sep-25 | Proposal Presentations | |
1-Oct-25 | Evaluating LLMs | |
| Primary: Evaluating Large Language Models Trained on Code | |
| Secondary: Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation | |
| ClassEval: A Manually-Crafted Benchmark for Evaluating LLMs on Class-level Code Generation | |
| CRUXEval: A Benchmark for Code Reasoning, Understanding and Execution | |
| ML4SE | |
6-Oct-25 | Fuzzing with LLMs | |
| Primary: CodaMosa: Escaping coverage plateaus in test generation with pre-trained large language models | |
| Secondary: Automated Unit Test Improvement using Large Language Models at Meta | |
| FuzzGuard: Filtering out unreachable inputs in directed grey-box fuzzing through deep learning | |
| Can Large Language Models Write Good Property-Based Tests? | |
8-Oct-25 | Program Analysis with LLMs | |
| Primary: Enhancing Static Analysis for Practical Bug Detection: An LLM-Integrated Approach | |
| Secondary: Large Language Models for Code Analysis: Do LLMs Really Do Their Job? | |
13-Oct-25 | Program Repair with LLMs and Agents | |
| Primary: AutoCodeRover: Autonomous Program Improvement | |
| Secondary: Agentless: Demystifying LLM-based Software Engineering Agents | |
| SWE-bench: Can language models resolve real-world GitHub issues? | |
15-Oct-25 | Verification | |
| Primary: Rango: Adaptive Retrieval-Augmented Proving for Automated Software Verification | |
| Secondary: Baldur: Whole-proof generation and repair with large language models | |
20-Oct-25 | Security | |
| Primary: Large language models for code: Security hardening and adversarial testing | |
| Secondary: NoFunEval: Funny How Code LMs Falter on Requirements Beyond Functional Correctness | |
22-Oct-25 | Test Generation | MidTerm Report Due |
| Primary: Learning Deep Semantics for Test Completion | |
| Secondary: Generating Exceptional Behavior Tests with Reasoning Augmented Large Language Models | |
27-Oct-25 | Code Translation | |
| Primary: AlphaTrans: A Neuro-Symbolic Compositional Approach for Repository-Level Code Translation and Validation | |
| Secondary: Scalable, Validated Code Translation of Entire Projects using Large Language Models | |
| VERT: Verified Equivalent Rust Transpilation with Large Language Models as Few-Shot Learners | |
| SE4ML | |
29-Oct-25 | Test Oracle Generation | |
| Primary: TOGA: A neural method for test oracle generation | |
| Secondary: On learning meaningful assert statements for unit test cases | |
3-Nov-25 | Code Generation | |
| Primary: Monitor-guided decoding of code LMs with static analysis of repository context | |
| SynCode: LLM Generation with Grammar Augmentation | |
| Secondary: CodePlan: Repository-level coding using LLMs and planning | |
5-Nov-25 | Detecting Numerical Errors | |
| Primary: Automatically Detecting Numerical Instability in Machine Learning Applications via Soft Assertions | |
| Secondary: Reliability Assurance for Deep Neural Network Architectures Against Numerical Defects | |
10-Nov-25 | Fuzzing | |
| Large Language Models are Zero-Shot Fuzzers: Fuzzing Deep-Learning Libraries via Large Language Models | |
| WhiteFox: White-box Compiler Fuzzing Empowered by Large Language Models | |
12-Nov-25 | Testing DNNs | |
| Primary: DeepHunter: a coverage-guided fuzz testing framework for deep neural networks | |
| Secondary: TensorFuzz: Debugging neural networks with coverage-guided fuzzing | |
| Concolic testing for deep neural networks | |
| Symbolic execution for attribution and attack synthesis in neural networks | |
17-Nov-25 | | ASE |
19-Nov-25 | | ASE |
24-Nov-25 | Testing DL Libraries | |
| Primary: Deep Learning Library Testing via Effective Model Generation | |
| Secondary: NeuRI: Diversifying DNN Generation via Inductive Rule Inference | |
| DocTer: Documentation-guided fuzzing for testing deep learning API functions | |
| DLLens: Testing Deep Learning Libraries via LLM-aided Synthesis | |
26-Nov-25 | Thanksgiving break NO CLASS | |
1-Dec-25 | Project Presentations | |
3-Dec-25 | Project Presentations | |
8-Dec-25 | Project Presentations | |
15-Dec-25 | No classes | Final Report Due |