Publications

  1. JMLR
    Large sample spectral analysis of graph-based multi-manifold clustering
    Nicolas Garcia Trillos*, Pengfei He*, and Chenghui Li*
    Journal of Machine Learning Research (JMLR), 2023
  2. SIGKDD Explor.
    DiffusionShield: A Watermark for Data Copyright Protection against Generative Diffusion Models
    Yingqian Cui, Jie Ren, Han Xu, and 5 more authors
    SIGKDD Explor. Newsl., Jan 2025
  3. TMLR
    Stealthy Backdoor Attack via Confidence-driven Sampling
    Pengfei He, Yue Xing, Han Xu, and 6 more authors
    Transactions on Machine Learning Research, Jan 2025
  4. SIGKDD Explor.
    FT-Shield: A Watermark Against Unauthorized Fine-tuning in Text-to-Image Diffusion Models
    Yingqian Cui, Jie Ren, Yuping Lin, and 7 more authors
    SIGKDD Explor. Newsl., Jan 2025
  5. Stat
    Towards the Effect of Examples on In-Context Learning: A Theoretical Case Study
    Pengfei He, Yingqian Cui, Han Xu, and 4 more authors
    Stat, Jan 2024
  6. preprint
    Towards Context-Robust LLMs: A Gated Representation Fine-tuning Approach
    Shenglai Zeng, Pengfei He, Kai Guo, and 4 more authors
    arXiv preprint arXiv:2502.14100, Jan 2025
  1. CIKM
    PROPN: Personalized Probabilistic Strategic Parameter Optimization in Recommendations
    Pengfei He, Haochen Liu, Xiangyu Zhao, and 2 more authors
    In Proceedings of the 31st ACM International Conference on Information & Knowledge Management (CIKM), 2022
  2. ICML
    Probabilistic Categorical Adversarial Attack and Adversarial Training
    Han Xu, Pengfei He, Jie Ren, and 4 more authors
    In International Conference on Machine Learning (ICML), 2023
  3. ICLR Spotlight
    Sharpness-Aware Data Poisoning Attack
    Pengfei He, Han Xu, Jie Ren, and 4 more authors
    In International Conference on Learning Representations (ICLR), 2024
    Spotlight paper (top 5%)
  4. ACL
    The Good and The Bad: Exploring Privacy Issues in Retrieval-Augmented Generation (RAG)
    Shenglai Zeng, Jiankun Zhang, Pengfei He, and 8 more authors
    In Findings of the Association for Computational Linguistics: ACL 2024, Aug 2024
  5. ACL
    Exploring Memorization in Fine-tuned Language Models
    Shenglai Zeng, Yaxin Li, Jie Ren, and 7 more authors
    In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Aug 2024
  6. EMNLP
    On the Generalization of Training-based ChatGPT Detection Methods
    Han Xu, Jie Ren, Pengfei He, and 5 more authors
    In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, Dec 2024
  7. EMNLP
    Towards Understanding Jailbreak Attacks in LLMs: A Representation Space Analysis
    Yuping Lin*, Pengfei He*, Han Xu, and 4 more authors
    In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, Dec 2024
  8. NAACL
    Data Poisoning for In-context Learning
    Pengfei He, Han Xu, Yue Xing, and 3 more authors
    In Proceedings of the 2025 Conference of the North American Chapter of the Association for Computational Linguistics, Apr 2025
  9. AISTATS
    Superiority of Multi-Head Attention in In-Context Linear Regression
    Yingqian Cui, Jie Ren, Pengfei He, and 2 more authors
    In International Conference on Artificial Intelligence and Statistics (AISTATS), Apr 2025
  10. AISTATS
    A Theoretical Understanding of Chain-of-Thought: Coherent Reasoning and Error-Aware Demonstration
    Yingqian Cui, Pengfei He, Xianfeng Tang, and 4 more authors
    In International Conference on Artificial Intelligence and Statistics (AISTATS), Apr 2025

Preprints

  1. preprint
    Copyright Protection in Generative AI: A Technical Perspective
    Jie Ren, Han Xu, Pengfei He, and 8 more authors
    2024
  2. preprint
    Make LLMs better zero-shot reasoners: Structure-orientated autonomous reasoning
    Pengfei He, Zitao Li, Yue Xing, and 3 more authors
    2024
  3. preprint
    Mitigating the privacy issues in retrieval-augmented generation (RAG) via pure synthetic data
    Shenglai Zeng, Jiankun Zhang, Pengfei He, and 7 more authors
    2024
  4. preprint
    Multi-Faceted Studies on Data Poisoning can Advance LLM Development
    Pengfei He, Yue Xing, Han Xu, and 2 more authors
    2025
  5. preprint
    Red-Teaming LLM Multi-Agent Systems via Communication Attacks
    Pengfei He, Yuping Lin, Shen Dong, and 3 more authors
    2025
  6. preprint
    Unveiling Privacy Risks in LLM Agent Memory
    Bo Wang, Weiyi He, Pengfei He, and 4 more authors
    2025
  7. preprint
    Stepwise Perplexity-Guided Refinement for Efficient Chain-of-Thought Reasoning in Large Language Models
    Yingqian Cui, Pengfei He, Jingying Zeng, and 8 more authors
    2025