Guanqun Yang 杨冠群
Ph.D. Candidate in Computer Science @ Stevens Institute of Technology
About Me
My research focuses on scalable software testing and analysis, with recent emphasis on testing and securing AI systems, including LLMs and multi-agent architectures. I develop automated solutions that reduce manual effort in ML model validation, security vulnerability management, and the evaluation of AI coding assistants.
AI System Testing: Developed automated test generation frameworks (TestAug, HateModerate) that reduce manual annotation by 98% while uncovering failures in production ML models. Currently building user simulation frameworks to evaluate AI coding assistants.
Multi-Agent Systems and Security: Building and securing agentic workflows using LangGraph, MCP, and agent-to-agent protocols. Discovered prompt-injection vulnerabilities in MCP tool schemas affecting LLM agent behavior.
LLM Fine-Tuning and Alignment: Extensive experience with SFT, RLHF, and GRPO for model customization. Fine-tuned LLMs to reduce hallucination, control verbosity, and generate constrained outputs for red-teaming.
Retrieval-Augmented Generation: Developed scalable retrieval systems using vector databases (FAISS, ChromaDB) and embedding models for security patch retrieval and knowledge-intensive applications.
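To make the retrieval point concrete, here is a minimal dense-retrieval sketch with FAISS and a sentence-embedding model. The embedding model and toy corpus are illustrative stand-ins, not the patch-retrieval system described above.

```python
# Minimal dense-retrieval sketch in the spirit of the patch-retrieval work.
# The embedding model and toy corpus are illustrative stand-ins.
import faiss
from sentence_transformers import SentenceTransformer

corpus = [
    "Fix buffer overflow in the packet parser",
    "Sanitize user input before constructing SQL queries",
    "Upgrade the TLS library to the patched release",
]

encoder = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = encoder.encode(corpus, normalize_embeddings=True)

# Inner product over L2-normalized vectors equals cosine similarity.
index = faiss.IndexFlatIP(embeddings.shape[1])
index.add(embeddings)

query = encoder.encode(["fix for SQL injection"], normalize_embeddings=True)
scores, ids = index.search(query, 2)
print([(corpus[i], round(float(s), 3)) for i, s in zip(ids[0], scores[0])])
```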
I am fortunate to be advised by Prof. Xueqing (Susan) Liu.
Before coming to Stevens, I received my master's degree in Electrical and Computer Engineering from UCLA in 2019, where I worked with Prof. Quanquan Gu and Prof. Vwani P. Roychowdhury on algorithmic fairness in machine learning systems (my master's thesis).
I received my bachelor's degree in Electrical Engineering from Northeastern University, China, in 2017.
Education
Ph.D. in Computer Science, Stevens Institute of Technology (in progress)
M.S. in Electrical and Computer Engineering, UCLA, 2019
Bachelor's degree in Electrical Engineering, Northeastern University, China, 2017
Industry Experience
Capital One, McLean, VA
Data Scientist Intern – LLM-based Legal Validation (OpenReview) (Summer 2025)
Built an explainable multi-LLM debate system to validate the legality of machine-generated marketing leads. Customized baseline models via SFT and GRPO fine-tuning on AWS EKS, and integrated RAG for context awareness, achieving improvements of over 10% in both accuracy and F1.
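A highly simplified sketch of the debate pattern follows. The model name, prompts, and the review_lead helper are hypothetical; the production system used fine-tuned models plus RAG-retrieved policy context.

```python
# A highly simplified sketch of the multi-LLM debate pattern.
# The model name, prompts, and review_lead helper are hypothetical;
# the production system used fine-tuned models and RAG-retrieved context.
from openai import OpenAI

client = OpenAI()

def ask(system_prompt: str, content: str) -> str:
    """Query one debater with a role-specific system prompt."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": content},
        ],
    )
    return resp.choices[0].message.content

def review_lead(lead: str, rounds: int = 2) -> str:
    """Two debaters argue for/against compliance; a judge issues the verdict."""
    transcript = f"Marketing lead under review:\n{lead}\n"
    for _ in range(rounds):
        transcript += "\nPro: " + ask("Argue that this lead is compliant.", transcript)
        transcript += "\nCon: " + ask("Argue that this lead violates policy.", transcript)
    return ask(
        "You are the judge. Answer COMPLIANT or NON-COMPLIANT, then explain briefly.",
        transcript,
    )

print(review_lead("Get a guaranteed 0% APR card, no credit check required!"))
```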
Technical Skills
Deep Learning and LLMs
Frameworks: PyTorch, Hugging Face Transformers, DeepSpeed
LLM Training: SFT, RLHF, GRPO, parameter-efficient fine-tuning (LoRA, QLoRA; a minimal LoRA sketch follows this list)
Inference: vLLM
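As a concrete illustration of the parameter-efficient fine-tuning listed above, here is a minimal LoRA sketch using Hugging Face PEFT. The base model and hyperparameters are illustrative defaults, not the configurations used in the projects on this page.

```python
# Hedged sketch: attaching LoRA adapters with Hugging Face PEFT.
# The base model and hyperparameters are illustrative defaults,
# not the configurations used in the projects above.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-0.5B")
config = LoraConfig(
    r=16,                                 # adapter rank
    lora_alpha=32,                        # scaling factor
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```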
Agentic AI and RAG
Agent Frameworks: LangGraph, LangChain, AutoGen (AG2), Google ADK, smolagents
Protocols: MCP, A2A Protocol
Vector Databases: FAISS, ChromaDB, Milvus
Knowledge Graphs: Neo4j, GraphRAG
Infrastructure and Cloud
Cloud: AWS (EKS, EC2), GCP
MLOps: Docker, Weights & Biases
Data: ElasticSearch
Recent Activities
[2025-08] Researching security vulnerabilities in agentic LLM workflows (MCP, LangGraph).
[2025-06] Started second internship at Capital One, building LLM-based legal validation systems.
[2025-05] Preprint on scalable security patch retrieval released on arXiv.
[2024-06] Started interning at Capital One, fine-tuning LLMs for RAG chatbots.
[2024-02] Paper accepted at NAACL 2024 Findings: HateModerate (WOAH 2024 Outstanding Paper).
[2023-07] Gave a tutorial on AI and ChatGPT to underrepresented high school students. [Course Materials]