Jingyi Jessica Li

From Wikipedia, the free encyclopedia
Jingyi Jessica Li
李婧翌
Born: 1985
Alma mater: Tsinghua University (B.S.); University of California, Berkeley (Ph.D.)
Known for:
  • Statistical methods for RNA sequencing
  • Bioinformatics tools for single-cell transcriptomics
  • Quantifying the central dogma using statistics
  • P-value-free false discovery rate control
  • Neyman-Pearson classification for medical diagnostics
Thesis: Statistical Methods for Analyzing High-throughput Biological Data (2013)
Doctoral advisors: Peter J. Bickel; Haiyan Huang
Website: jsb.ucla.edu

Jingyi Jessica Li (Chinese: 李婧翌) is a professor of statistics, biostatistics, human genetics, computational medicine, and bioinformatics at the University of California, Los Angeles. Her research integrates statistical principles with biological data analysis, particularly in genomics and transcriptomics.

Li has won several awards, including the Overton Prize[5] from the International Society for Computational Biology and the Emerging Leader Award[6] from COPSS. In 2025, she was awarded a Guggenheim Fellowship.[7]

Education and career

Li started her undergraduate education at Tsinghua University in 2003. She moved to the University of California, Berkeley for her Ph.D., and then started as a faculty member at the University of California, Los Angeles in 2013.[5] As of 2025 she is a full professor.[8]

From 2022 to 2023, she was a Radcliffe Fellow at the Harvard Radcliffe Institute for Advanced Study and a visiting professor in the Department of Statistics at Harvard University.[9]

Research

Her work concerns the transcriptional and translational control of protein expression levels in the central dogma, and statistical methods for RNA-seq data at the bulk and single-cell levels.

Her 2015 article in Science, co-authored with Mark D. Biggin, argued that transcription, rather than translation, is the dominant factor regulating protein abundance and is the primary source of differences in protein expression levels across genes.[10]

Her research group developed a suite of single-cell data simulators, including scDesign,[11] scDesign2, which captures gene-gene correlations,[12] scDesign3 for single-cell and spatial multi-omics data,[13] and scReadSim for single-cell RNA-seq and ATAC-seq read simulation.[14] Her group also developed scImpute,[15] an imputation tool for missing gene expression values.

Her contributions also extend to statistical and computational methodology, including Clipper,[16] a p-value-free method for false discovery rate (FDR) control; gR2, which generalizes the squared Pearson correlation to capture mixtures of linear dependences in bivariate data;[17] ITCA, a criterion for guiding the combination of ambiguous class labels in multiclass classification;[18] and Neyman-Pearson classification, a framework that prioritizes controlling misclassification errors for critical classes.[19][20]
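
As an illustration of the Neyman-Pearson classification idea, the following Python sketch picks a score threshold from held-out scores of the critical class (class 0) so that the type I error exceeds a target level alpha with probability at most delta. It is a simplified sketch based on an order-statistic bound for continuous scores, not the authors' software; the function names `np_threshold_rank` and `np_classify` and the toy scores are hypothetical.

```python
from math import comb


def np_threshold_rank(n, alpha, delta):
    """Return the smallest 1-indexed rank k such that thresholding at the
    k-th smallest of n held-out class-0 scores gives
    P(type I error > alpha) <= delta, or None if n is too small."""
    for k in range(1, n + 1):
        # Upper bound on the probability that the type I error exceeds alpha
        # when the threshold is the k-th order statistic.
        violation_bound = sum(
            comb(n, j) * (1 - alpha) ** j * alpha ** (n - j)
            for j in range(k, n + 1)
        )
        if violation_bound <= delta:
            return k
    return None


def np_classify(score, class0_scores, alpha=0.05, delta=0.05):
    """Label a new score as 1 only if it exceeds the chosen threshold."""
    k = np_threshold_rank(len(class0_scores), alpha, delta)
    if k is None:
        raise ValueError("too few class-0 scores for the requested alpha and delta")
    threshold = sorted(class0_scores)[k - 1]
    return 1 if score > threshold else 0


# Toy usage with made-up scores from a hypothetical scoring function.
held_out = [i / 100 for i in range(100)]
label = np_classify(0.99, held_out)
```

The sketch returns no rank when the held-out sample is too small to meet the requested guarantee, in which case the guarantee cannot be met at that alpha and delta without more class-0 data.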

Her more recent work advocates for statistical rigor in genomics data analysis. In a 2022 study, she and her co-authors warned against using popular RNA-seq differential expression (DE) methods blindly without checking their underlying assumptions. For example, in population-scale human RNA-seq samples where the negative binomial assumption for each gene does not hold, popular methods relying on this assumption can lead to excessive false discoveries, while non-parametric tests such as the Wilcoxon rank-sum test give more reliable results.[21] She also developed scDEED,[22] a statistical method that uses permutation techniques to evaluate and optimize embeddings produced by t-SNE and UMAP. scDEED detects dubious embeddings that fail to preserve mid-range distances and refines t-SNE and UMAP hyperparameters. In addition, she proposed using semi-synthetic negative control data to detect and eliminate false discoveries resulting from analysis biases, such as double dipping. One example is ClusterDE,[23] a statistical method designed to identify post-clustering DE genes as reliable markers of cell types and spatial domains in single-cell and spatial transcriptomic data analysis while ensuring false discovery rate control regardless of clustering quality.
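
For readers unfamiliar with the Wilcoxon rank-sum test mentioned above, the following Python sketch compares one gene's expression between two conditions using only ranks, so it makes no negative binomial assumption. It is a minimal illustration using the normal approximation with a tie-corrected variance, not the analysis pipeline of the cited study; the function name `rank_sum_test` and the expression values are made up for the example.

```python
from math import erfc, sqrt


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation, midranks for ties).

    Returns (z, p_value), where z is based on the rank sum of sample x."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Pool the samples, remembering which group each value came from.
    pooled = sorted((value, group) for group, sample in enumerate((x, y)) for value in sample)

    ranks = [0.0] * n
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of ranks i+1, ..., j
        for k in range(i, j):
            ranks[k] = midrank
        t = j - i
        tie_term += t ** 3 - t  # contribution of this tie group to the variance correction
        i = j

    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    mean_w = n1 * (n + 1) / 2
    var_w = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_w == 0:
        return 0.0, 1.0  # all values identical: no evidence of a difference
    z = (w - mean_w) / sqrt(var_w)
    p_value = erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value


# Made-up expression values for a single gene in two conditions.
condition_a = [1, 2, 3, 4, 5]
condition_b = [6, 7, 8, 9, 10]
z, p = rank_sum_test(condition_a, condition_b)
```

In a DE analysis this test would be applied to each gene separately, followed by a multiple-testing correction across genes. The normal approximation is reasonable for the large sample sizes of population-scale studies but is rough for very small samples.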

Awards and honors

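Li has received a number of awards for her contributions to statistics and computational biology, including:

  • Johnson & Johnson Women in STEM2D Scholars Award (2018)[1]
  • Sloan Research Fellowship (2018)[2]
  • NSF CAREER Award[3]
  • Innovators Under 35[4]
  • ISCB Overton Prize (2023)[5]
  • COPSS Emerging Leader Award (2023)[6]
  • Guggenheim Fellowship (2025)[7]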

References

  1. ^ a b "Johnson & Johnson names six winners of first Women in STEM2D Scholars Award". JNJ.com. Johnson & Johnson. 12 April 2018. Retrieved 2025-06-03.
  2. ^ a b "Grant Detail: University of California, Los Angeles – Sloan Research Fellowship (FG-2018-10529)". Alfred P. Sloan Foundation Grants Database. Alfred P. Sloan Foundation. 2018. Retrieved 2025-06-03.
  3. ^ a b "NSF CAREER Award – Jingyi Jessica Li". National Science Foundation (NSF). Retrieved 2025-02-03.
  4. ^ a b "Jingyi Jessica Li | Innovators Under 35". Innovators Under 35. Retrieved 4 June 2025.
  5. ^ a b c d Fogg, Christiana N.; Kovats, Diane E.; Vingron, Martin (30 June 2023). "2023 ISCB Overton Prize: Jingyi Jessica Li". Bioinformatics. 39 (Supplement 1): i5–i6. doi:10.1093/bioinformatics/btad307. Retrieved 2025-06-03.
  6. ^ a b c "Meet the 2023 COPSS Emerging Leader Awardees". Institute of Mathematical Statistics. 31 March 2023. Retrieved 2025-06-03.
  7. ^ a b c "Announcing the 2025 Guggenheim Fellows — Guggenheim Fellowships: Supporting Artists, Scholars, & Scientists". Guggenheim Foundation. 15 April 2025. Retrieved 4 June 2025.
  8. ^ "Jingyi Jessica Li – UCLA Graduate Programs in Bioscience". Bioscience.UCLA.edu. University of California, Los Angeles. Retrieved 2025-06-03.
  9. ^ "Jingyi Jessica Li". Radcliffe Institute for Advanced Study at Harvard University. Retrieved 4 June 2025.
  10. ^ Li, Jingyi Jessica; Biggin, Mark D. (2015). "Statistics requantitates the central dogma". Science. 347 (6226): 1066–1067. Bibcode:2015Sci...347.1066L. doi:10.1126/science.aaa8332. OSTI 1353301. PMID 25745146. Retrieved 2025-02-03.
  11. ^ Li, Wei Vivian; Li, Jingyi Jessica (2019). "A statistical simulator scDesign for rational scRNA-seq experimental design". Bioinformatics. 35 (14). Oxford University Press: i41–i50. doi:10.1093/bioinformatics/btz390. PMC 7755417. PMID 33351929.
  12. ^ Sun, Tianyi; Song, Dongyuan; Li, Wei Vivian; Li, Jingyi Jessica (2021). "scDesign2: a transparent simulator that generates high-fidelity single-cell gene expression count data with gene correlations captured". Genome Biology. 22 (1). BioMed Central: 163. doi:10.1186/s13059-021-02367-2. PMC 8144190. PMID 34044808.
  13. ^ Song, Dongyuan; Wang, Qingyang; Yan, Guanao; Liu, Tianyang; Sun, Tianyi; Li, Jingyi Jessica (2024). "scDesign3 generates realistic in silico data for multimodal single-cell and spatial omics". Nature Biotechnology. 42 (2). Nature Publishing Group: 247–252. doi:10.1038/s41587-023-01772-1. PMC 11182337. PMID 37169966.
  14. ^ Yan, Guanao; Song, Dongyuan; Li, Jingyi Jessica (November 18, 2023). "scReadSim: a single-cell RNA-seq and ATAC-seq read simulator". Nature Communications. 14 (1): 7482. Bibcode:2023NatCo..14.7482Y. doi:10.1038/s41467-023-43162-w. PMC 10657386. PMID 37980428.
  15. ^ Li, Wei Vivian; Li, Jingyi Jessica (2018). "An accurate and robust imputation method scImpute for single-cell RNA-seq data". Nature Communications. 9 (1): 997. Bibcode:2018NatCo...9..997L. doi:10.1038/s41467-018-03405-7. PMC 5843666. PMID 29520097.
  16. ^ Ge, Xinzhou; Chen, Yiling Elaine; Song, Dongyuan; McDermott, MeiLu; Woyshner, Kyla; Manousopoulou, Antigoni; Wang, Ning; Li, Wei; Wang, Leo D.; Li, Jingyi Jessica (2021). "Clipper: p-value-free FDR control on high-throughput data from two conditions". Genome Biology. 22 (1): 288. doi:10.1186/s13059-021-02506-9. PMC 8504070. PMID 34635147.
  17. ^ Li, Jingyi Jessica; Tong, Xin; Bickel, Peter J. (2018). "Generalized Pearson correlation squares for capturing mixtures of bivariate linear dependences". arXiv:1811.09965 [stat.ME].
  18. ^ Zhang, Qi; Zhang, Yu; Li, Jingyi Jessica (2023). "itca: an information-theoretic criterion for label aggregation in multi-class classification". Bioinformatics. 40 (1): 1246–1249. doi:10.1093/bioinformatics/btad770. PMID 37930802.
  19. ^ Tong, Xin; Feng, Yang; Li, Jingyi Jessica (2018). "Neyman-Pearson classification algorithms and NP receiver operating characteristics". Science Advances. 4 (2). American Association for the Advancement of Science: eaao1659. arXiv:1608.03109. Bibcode:2018SciA....4.1659T. doi:10.1126/sciadv.aao1659. PMC 5804623. PMID 29423442.
  20. ^ Zhang, Mingwei; Li, Jingyi Jessica (2023). "Hierarchical Neyman–Pearson classification for high-stakes decision making". Journal of the American Statistical Association. doi:10.1080/01621459.2023.2270657. Retrieved 2025-02-03.
  21. ^ Li, Yumei; Ge, Xinzhou; Peng, Fanglue; Li, Wei; Li, Jingyi Jessica (2022). "Exaggerated false positives by popular differential expression methods when analyzing human population samples". Genome Biology. 23 (1): 216. doi:10.1186/s13059-022-02648-4. PMC 8922736. PMID 35292087.
  22. ^ Xia, L.; Lee, C.; Li, J. J. (2024). "Statistical method scDEED for detecting dubious 2D single-cell embeddings". Nature Communications. 15 (1): 1753. doi:10.1038/s41467-024-45891-y. PMC 10897166. PMID 38409103.
  23. ^ Song, Dongyuan; Chen, Siqi; Lee, Christy; Li, Kexin; Ge, Xinzhou; Li, Jingyi Jessica (2023-07-21). "Synthetic control removes spurious discoveries from double dipping in single-cell and spatial transcriptomics data analyses" (PDF). bioRxiv. doi:10.1101/2023.07.21.550107. PMC 10401959. PMID 37546812.