Publications
$^*$ denotes equal contribution. (α-β) denotes alphabetical order.
2026
- [arXiv] Transformers Are Born Biased: Structural Inductive Biases at Random Initialization and Their Practical Consequences. arXiv preprint, 2026. Part of the methodology, SeedPrint, is published in ICLR 2026.
- [ICLR] SeedPrints: Fingerprints Can Even Tell Which Seed Your Large Language Model Was Trained From. International Conference on Learning Representations, 2026. Abridged in the NeurIPS 2025 Lock-LLM Workshop: Prevent Unauthorized Knowledge Use from Large Language Models.
- [ICLR] PrefixMemory-Tuning: Modernizing Prefix-Tuning by Decoupling the Prefix from Attention. International Conference on Learning Representations, 2026. Abridged in the ICML 2025 Workshop on Test-Time Adaptation: Putting Updates to the Test (PUT).
- [ICLR] Variational Autoencoding Discrete Diffusion with Enhanced Dimensional Correlations Modeling. International Conference on Learning Representations, 2026.
- [ICLR] Reinforcing Diffusion Models by Direct Group Preference Optimization. International Conference on Learning Representations, 2026.
2025
- [NeurIPS Workshop] SRTD: A Symmetric Divergence for Interpretable Comparison of Representation Topology. NeurIPS Workshop on Symmetry and Geometry in Neural Representations, 2025.
- [NeurIPS] Noise Consistency Training: A Native Approach for One-step Generator in Learning Additional Controls. Advances in Neural Information Processing Systems, 2025.
- [NeurIPS] Reward-Instruct: A Reward-Centric Approach to Fast Photo-Realistic Image Generation. Advances in Neural Information Processing Systems, 2025.
- [JMLR] Minimax Optimal Deep Neural Network Classifiers Under Smooth Decision Boundary. Journal of Machine Learning Research, 26 (136), 2025.
- [ICCV] Adding Additional Control to One-Step Diffusion with Joint Distribution Matching. Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2025.
- [ICCV] Learning Few-Step Diffusion Models by Trajectory Distribution Matching. Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2025.
- [ICML Workshop] [Oral] Any-Order GPT as Masked Diffusion Model: Decoupling Formulation and Architecture. ICML Workshop on Efficient Systems for Foundation Models (ES-FoMo III), 2025. Selected for oral presentation.
- [ICML] Elucidating the Design Space of Language Models for Image Generation. International Conference on Machine Learning, 2025.
- [ICLR] You Only Sample Once: Taming One-Step Text-to-Image Synthesis by Self-Cooperative Diffusion GANs. International Conference on Learning Representations, 2025.
- [TMLR] AlgoFormer: An Efficient Transformer Framework with Algorithmic Structures. Transactions on Machine Learning Research, 2025.
- [EMNLP] The Buffer Mechanism for Multi-Step Information Reasoning in Language Models. Findings of the Association for Computational Linguistics: EMNLP, 2025.
- [NAACL] Getting More Juice Out of Your Data: Hard Pair Refinement Enhances Visual-Language Models Without Extra Data. Annual Conference of the North American Chapter of the Association for Computational Linguistics, 2025.
2024
- [NeurIPS Workshop] In-Context Learning Behaves as a Greedy Layer-wise Gradient Descent Algorithm. NeurIPS Workshop on Adaptive Foundation Models: Evolving AI for Personalized and Efficient Learning, 2024.
- [ICML] Exact Conversion of In-Context Learning to Model Weights in Linearized-Attention Transformers. International Conference on Machine Learning, 2024.
- [ICML] The Surprising Effectiveness of Skip-Tuning in Diffusion Sampling. International Conference on Machine Learning, 2024.
- [ICML] Referee Can Play: An Alternative Approach to Conditional Generation via Model Inversion. International Conference on Machine Learning, 2024.
- [CVPR] Accelerating Diffusion Sampling with Optimized Time Steps. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024.
- [ECCV] JointDreamer: Ensuring Geometry Consistency and Text Congruence in Text-to-3D Generation via Joint Score Distillation. European Conference on Computer Vision, 2024.
- [ICLR] Elucidating The Design Space of Classifier-Guided Diffusion Generation. International Conference on Learning Representations, 2024.
- [JMLR] Random Smoothing Regularization in Kernel Gradient Descent Learning. Journal of Machine Learning Research, 25 (284), 2024.
- [IJCAI] Deciphering the Projection Head: Representation Evaluation Self-supervised Learning. International Joint Conference on Artificial Intelligence, 2024.
2023
- [NeurIPS] [Spotlight] Complexity Matters: Rethinking the Latent Space for Generative Modeling. Advances in Neural Information Processing Systems, 2023.
- [NeurIPS] Diff-Instruct: A Universal Approach for Transferring Knowledge From Pre-trained Diffusion Models. Advances in Neural Information Processing Systems, 2023.
- [arXiv]
- [CVPR] ContraNeRF: Generalizable Neural Radiance Fields for Synthetic-to-real Novel View Synthesis via Contrastive Learning. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023.
- [AISTATS] Inducing Neural Collapse in Deep Long-tailed Learning. International Conference on Artificial Intelligence and Statistics, 2023.
- [ICML] Explore and Exploit the Diverse Knowledge in Model Zoo for Domain Generalization. International Conference on Machine Learning, 2023.
- [UAI] Exact Count of Boundary Pieces of ReLU Classifiers: Towards the Proper Complexity Measure for Classification. Conference on Uncertainty in Artificial Intelligence, 2023.
- [ICLR] Your Contrastive Learning Is Secretly Doing Stochastic Neighbor Embedding. International Conference on Learning Representations, 2023.
- [TMLR] Continual Learning by Modeling Intra-Class Variation. Transactions on Machine Learning Research, 2023.
2022
- [NeurIPS] ZooD: Exploiting Model Zoo for Out-of-Distribution Generalization. Advances in Neural Information Processing Systems, 2022.
- [NeurIPS] [Spotlight] Understanding Square Loss in Training Overparametrized Neural Network Classifiers. Advances in Neural Information Processing Systems, 2022.
2021
- [AISTATS] Regularization Matters: A Nonparametric Perspective on Overparametrized Neural Network. International Conference on Artificial Intelligence and Statistics, 2021.
2020
- [Thesis]
- [arXiv] Sharp Rate of Convergence for Deep Neural Network Classifiers Under the Teacher-student Setting. arXiv preprint, 2020.
- [AJPE] Inter-rater Reliability of Web-based Calibrated Peer Review within a Pharmacy Curriculum. American Journal of Pharmaceutical Education, 2020.
2018
- [arXiv]