Guangyue Xu

I am currently a Senior Data Scientist at Search@Target. Previously, I pursued a Ph.D. at Michigan State University, where I focused on Machine Learning (ML) and Natural Language Processing (NLP). My research centers on pre-training large vision-language models and enhancing their generalization capabilities, with applications in e-commerce search.

I received my B.E. in Software Engineering from Jilin University and my M.S. in Computer Science from Tsinghua University. I also interned with MSRA's Web Search and Mining Group.

News
Posts
Beyond Text: How We Built Multimodal Retrieval for E-Commerce Search
Mar 10, 2026 · multimodal e-commerce
Most e-commerce search systems rely purely on text, but product images carry signals that text often misses. We propose a two-stage vision-language alignment framework achieving +5% Recall@100 over text-only baselines.
Multi-Channel Retrieval Fusion: From Heuristic Weights to Unified Ranking
May 2, 2025 · Search Learning-to-Rank
E-commerce search pulls candidates from text, semantic, and behavioral channels, then merges them with hand-tuned weights. We replace that with a unified LTR model, deployed at Target.com with +2.85% conversion lift.
View all posts →
Selected Publications
Multimodal Retrieval
Guangyue Xu*, Qujiaheng Zhang*, Fengjie Li  (* co-first authors)
arXiv preprint arXiv:2603.04836, 2026

We study unified text-image fusion for two-tower retrieval models in e-commerce, proposing a novel modality fusion network that captures cross-modal complementary information between product text and images.

Multi-Channel Retrieval
Guangyue Xu*, A. Gaydhani*, D. Kamath, A. Singh, A. Li  (* co-first authors)
arXiv preprint arXiv:2602.23530, 2026

A unified learning-to-rank framework that jointly optimizes ranking across multiple retrieval channels in large-scale e-commerce search, improving relevance and efficiency.

GIPCOL
Guangyue Xu, Parisa Kordjamshidi, Joyce Chai
WACV, 2024

We introduce a GNN into soft-prompting design to improve CLIP's compositional zero-shot learning ability.

MetaReVision
Guangyue Xu, Parisa Kordjamshidi, Joyce Chai
EMNLP-Findings, 2023

We meta-train vision-language models using retrieved items to obtain more generalizable token representations and improve compositional ability.

Prompting
Guangyue Xu, Parisa Kordjamshidi, Joyce Chai
arXiv, 2022

We systematically investigate various prompting techniques for CLIP in compositional zero-shot learning.

Full list on Google Scholar →
Service
Program Committee / Reviewer: EACL, EMNLP, ACM MM, ACL, WACV, LREC, ICLR, NeurIPS, ...