Hello there! I am Lixuan, a third-year CS undergraduate at Xidian University. Currently, I am a Research Intern
at Stony Brook University, advised by Chenyu You and working closely with Yifei Wang and
Stefanie Jegelka, exploring efficient and scalable applications of sparsity in areas such as
embedding-based retrieval, LLM fine-tuning, and MoE architecture design. Previously, I worked with Shixiong Zhang
on projects focusing on single-cell RNA sequencing (scRNA-seq) clustering.
My research interests lie broadly in Computer Vision, Natural Language Processing, and Machine Learning.
I am dedicated to developing universal, efficient, and reliable machine learning systems that can handle image, text, and multimodal tasks.
News
[10/2025] CSRv2 is out. Let's explore ultra-sparsity together!
Publications
In the era of large foundation models, the quality of embeddings has become a central determinant of downstream task performance and overall system capability.
Yet widely used dense embeddings are often extremely high-dimensional (e.g., 4096), incurring substantial costs in storage, memory, and inference latency.
To address these costs, Contrastive Sparse Representation (CSR) was recently proposed as a promising direction: it maps dense embeddings into high-dimensional but $k$-sparse vectors, in contrast to compact dense embeddings such as Matryoshka Representation Learning (MRL).
Despite its promise, CSR suffers severe degradation in the ultra-sparse regime (e.g., $k \leq 4$), where over 80% of neurons remain inactive, leaving much of its efficiency potential unrealized.
In this paper, we introduce CSRv2, a principled training approach designed to make ultra-sparse embeddings viable.
CSRv2 stabilizes sparsity learning through progressive $k$-annealing, enhances representational quality via supervised contrastive objectives, and ensures end-to-end adaptability with full backbone finetuning.
CSRv2 reduces dead neurons from 80% to 20% and delivers a 14% accuracy gain at $k=2$, bringing ultra-sparse embeddings on par with CSR at $k=8$ and MRL at 32 dimensions, all with only two active features.
While maintaining comparable performance, CSRv2 delivers a 7$\times$ speedup over MRL and yields up to 300$\times$ improvements in compute and memory efficiency relative to dense embeddings.
Extensive experiments on text (MTEB, with state-of-the-art LLM embedding models such as Qwen and e5-Mistral-7B) and vision (ImageNet-1k) demonstrate that CSRv2 makes ultra-sparse embeddings practical without compromising performance.
By making extreme sparsity viable, CSRv2 broadens the design space for large-scale, real-time, and edge-deployable AI systems where both embedding quality and efficiency are critical.
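To make the core mechanism concrete, here is a minimal PyTorch sketch of a TopK sparse head combined with a progressive $k$-annealing schedule. The module name, layer sizes, and the linear schedule are illustrative assumptions, not the exact CSRv2 recipe:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKSparseHead(nn.Module):
    """Maps dense embeddings to high-dimensional k-sparse codes (CSR-style).

    Hypothetical sketch: dimensions and details are illustrative assumptions.
    """
    def __init__(self, dense_dim: int = 4096, sparse_dim: int = 16384):
        super().__init__()
        self.encoder = nn.Linear(dense_dim, sparse_dim)

    def forward(self, x: torch.Tensor, k: int) -> torch.Tensor:
        z = F.relu(self.encoder(x))
        # Keep only the k largest activations per sample; zero out the rest.
        vals, idx = z.topk(k, dim=-1)
        return torch.zeros_like(z).scatter_(-1, idx, vals)

def annealed_k(step: int, total_steps: int, k_start: int = 64, k_end: int = 2) -> int:
    """Linearly anneal the sparsity level from k_start down to k_end (assumed schedule)."""
    frac = min(step / max(total_steps, 1), 1.0)
    return max(k_end, round(k_start + frac * (k_end - k_start)))
```

In such a setup, each training step would call `annealed_k(step, total_steps)` and compute a (supervised) contrastive loss on the resulting sparse codes, so the model adapts gradually to tighter sparsity instead of being trained at $k=2$ from the start.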
Cell clustering is crucial for analyzing single-cell RNA sequencing (scRNA-seq) data, allowing us to identify and
differentiate various cell types and uncover their similarities and differences. Despite its importance, clustering
scRNA-seq data poses significant challenges due to its high dimensionality, sparsity, dropout events caused by
sequencing limitations, and complex noise patterns. To address these issues, we introduce a new clustering method
called single-cell ZINB-based Graph Contrastive Learning (scZGCL). This method employs an unsupervised deep embedding
clustering algorithm. During the pre-training phase, our method utilizes an autoencoder based on the Zero-Inflated
Negative Binomial (ZINB) distribution, learns cell relationship weights through a graph attention
network, and introduces contrastive learning to bring similar cells closer together. In the fine-tuning phase, scZGCL
refines the clustering results by optimizing a loss function based on the Kullback-Leibler (KL) divergence, enhancing the
accuracy of cell classification into distinct clusters. Comprehensive experiments on 12 scRNA-seq datasets
demonstrate that scZGCL outperforms state-of-the-art clustering methods.
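The KL-based fine-tuning step follows the usual deep-embedding-clustering recipe. Below is a minimal PyTorch sketch of one common form of that objective (Student's t soft assignments sharpened into a target distribution); the exact kernel and hyperparameters in scZGCL may differ, so treat this as an assumption-laden illustration:

```python
import torch
import torch.nn.functional as F

def soft_assignments(z: torch.Tensor, centroids: torch.Tensor, alpha: float = 1.0) -> torch.Tensor:
    """Soft cluster assignments q via a Student's t kernel (assumed form).

    z: (n_cells, d) latent embeddings; centroids: (n_clusters, d).
    """
    dist_sq = torch.cdist(z, centroids) ** 2
    q = (1.0 + dist_sq / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(dim=1, keepdim=True)

def target_distribution(q: torch.Tensor) -> torch.Tensor:
    """Sharpened target p that up-weights high-confidence assignments."""
    weight = q ** 2 / q.sum(dim=0)
    return weight / weight.sum(dim=1, keepdim=True)

def kl_clustering_loss(q: torch.Tensor) -> torch.Tensor:
    """KL(p || q): pull soft assignments toward their sharpened targets."""
    p = target_distribution(q).detach()  # target treated as fixed during the update
    return F.kl_div(q.log(), p, reduction="batchmean")
```

Iterating this loss gradually sharpens cluster boundaries, which is what "refining the clustering results" refers to in the abstract.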
Professional Activity
Journal Reviewer: Pattern Recognition, TNNLS
Selected Honors & Awards
[05/2025] International Collegiate Programming Contest (ICPC), Provincial Bronze Award.
[02/2025] Mathematical Contest in Modeling (MCM), Finalist Award (Top 1%).
[11/2024] National Mathematics Competition for College Students, Provincial First Prize.
[07/2024] Second-Class Undergraduate Scholarship, Xidian University.
[02/2024] Interdisciplinary Contest in Modeling (ICM), Meritorious Award.