Yikun (Aiden) Han

I am a second-year Master's student in Data Science at the University of Michigan.

My research interests lie at the intersection of geometric deep learning, natural language processing, and AI for healthcare.

I am fortunate to be part of the CASI Lab at the University of Michigan, supervised by Prof. Ambuj Tewari. I also collaborate closely with the AI Health Lab at the University of Texas at Austin.

Feel free to contact me via email: yikunhan [at] umich.edu

Email  /  GitHub  /  Curriculum Vitae  /  Google Scholar  /  LinkedIn


Publications


Beyond Answers: Transferring Reasoning Capabilities to Smaller LLMs Using Multi-Teacher Knowledge Distillation


Yijun Tian*, Yikun Han*, Xiusi Chen*, Wei Wang, Nitesh V. Chawla
arXiv, 2024
arxiv / code

We present TinyLLM, a knowledge distillation approach that transfers reasoning abilities from multiple large language models (LLMs) to smaller ones. TinyLLM enables smaller models to generate both accurate answers and rationales, achieving superior performance despite a significantly reduced model size.


Mapping from Meaning: Addressing the Miscalibration of Prompt-Sensitive Language Models


Kyle Cox, Jiawei Xu, Yikun Han, Abby Xu, Tianhao Li, Chi-Yang Hsu, Tianlong Chen, Walter Gerych, Ying Ding
arXiv, 2024
code

We explore prompt sensitivity in large language models (LLMs), where semantically identical prompts can yield vastly different outputs. By modeling this sensitivity as generalization error, we improve uncertainty calibration using paraphrased prompts. Additionally, we propose a new metric to quantify uncertainty caused by prompt variations, offering insights into how LLMs handle semantic continuity in natural language.


When Large Language Models Meet Vector Databases: A Survey


Zhi Jing*, Yongye Su*, Yikun Han*, Bo Yuan, Haiyun Xu, Chunjiang Liu, Kehai Chen, Min Zhang
arXiv, 2024
arxiv

We survey the integration of Large Language Models (LLMs) and Vector Databases (VecDBs), highlighting VecDBs’ role in addressing LLM challenges like hallucinations, outdated knowledge, and memory inefficiencies. This review outlines foundational concepts and explores how VecDBs enhance LLM performance by efficiently managing vector data, paving the way for future advancements in data handling and knowledge extraction.


A Community Detection and Graph-Neural-Network-Based Link Prediction Approach for Scientific Literature


Chunjiang Liu*, Yikun Han*, Haiyun Xu, Shihan Yang, Kaidi Wang, Yongye Su
Mathematics, 2024
paper

We integrate the Louvain community detection algorithm with various GNN models to improve link prediction in scientific literature networks. This approach consistently boosts performance; for example, the AUC of GAT rises from 0.777 to 0.823, demonstrating the effectiveness of combining community insights with GNNs.


A Comprehensive Survey on Vector Database: Storage and Retrieval Technique, Challenge


Yikun Han, Chunjiang Liu, Pengfei Wang
arXiv, 2023
arxiv

We review key algorithms for solving approximate nearest neighbor search in vector databases, categorizing them into hash-based, tree-based, graph-based, and quantization-based methods. Additionally, we discuss challenges and explore how vector databases can integrate with large language models for new opportunities.




Competitions


DREAM Olfactory Mixtures Prediction Challenge


Yikun Han, Zehua Wang, Stephen Yang, Ambuj Tewari
RECOMB/ISCB Conference on Regulatory & Systems Genomics with DREAM Challenges, 2024
writeup / code / website / news

We use pre-trained graph neural networks and boosting techniques to enhance odor mixture discriminability, transforming single molecule embeddings into mixture predictions with improved robustness and accuracy.




Research Internships


University of Michigan (Aug. 2023 - Present)


Advisor: Prof. Ambuj Tewari

Research Topics:

[1] Graph Neural Networks

[2] Molecular Property Prediction

[3] Protein-Ligand Affinity Prediction


University of Texas at Austin (Feb. 2024 - Present)


Advisors: Prof. Ying Ding, Prof. Jiliang Tang

Research Topics:

[1] Graph Retrieval-Augmented Generation

[2] Medical AI

[3] Collaborator Recommendation


University of Notre Dame (Dec. 2023 - Mar. 2024)


Advisor: Prof. Nitesh V. Chawla

Research Topics:

[1] Knowledge Distillation

[2] Multi-Teacher Collaboration

[3] In-Context Learning


Tianyuan Mathematical Center in Southwest China (May 2022 - Nov. 2022)


Advisor: Prof. Gang Chen

Research Topics:

[1] LAPACK Optimization

[2] Parallel Computation for Large-Scale Matrices

[3] High-Performance Matrix Factorization and Back Substitution




Education


University of Michigan (Aug. 2023 - May 2025)


Master

Data Science

GPA: 3.894/4.0


Sichuan University (Sep. 2019 - Jun. 2023)


Bachelor

Information Resources Management

GPA: 3.87/4.0

Rank: 2/76




Awards

RSGDREAM Travel Award, 2024

Outstanding Graduate, 2023

Second Prize Scholarship, 2022

Outstanding Student, 2021

Outstanding Student, 2020


Design and source code from Jon Barron's website