About Me
I am a Postdoctoral Researcher at the Courant Institute of Mathematical Sciences at New York University, working with Prof. Yann LeCun on world models for music. I also hold a visiting affiliation with the Computer Science Department at MBZUAI.
Prior to that, I received my PhD in Computer Science from New York University, where I worked at the intersection of music and machine learning under the guidance of Prof. Gus Xia in the Music X Lab. During my PhD, I was also a visiting researcher at the Machine Learning Department at MBZUAI and affiliated with NYU Shanghai. In 2019, I earned my undergraduate degree in Mathematics from Fudan University.
Beyond my academic pursuits, I am a passionate conductor, pianist, and Erhu (a traditional Chinese string instrument) player. I have previously served as the conductor of the NYU Shanghai Jazz Ensemble and as the director of the Fudan Musical Club.
Research Interests
People often imagine an emotional and human-like AI. In my research, I translate this vision into the music domain as the challenge of building machine musicianship. Music is one of the most subtle forms of human expression, full of nuanced intentions that are deeply felt yet often difficult to articulate. If a machine, rigorous by nature, were to express such intentions, what would that process look like? Could it perceive rather than process? Could it compose rather than imitate? Could it formalize what we ourselves find vague, to reveal something that makes music resonate more deeply across the world?
In response to these questions, my research explores how intelligent systems can analyze and create music in more human-like ways. I focus on the emergence of musical concepts and hierarchical structures from data, and how such representations can support automatic composition and enable interactive creation. My work spans self-supervised learning, hierarchical modeling, large language models (LLMs), representation learning, disentanglement, style transfer, computer music, and human–computer interaction (HCI).
Here is a brief overview of my research, organized by topic:
Hierarchical Music Generation & Arrangement

- Whole-song generation via compositional hierarchy: [WholeSongGen]
- Modeling long-term context dependency: [AccomontageIII] [Mixed-level Inpainting]
- Dataset: [POP909]
Representation Learning

- Representation disentanglement (monophonic & polyphonic): [EC2-VAE] [Poly-Dis]
- Cross-modality representation (audio & text): [Audio2Midi] [BUTTER]
Unsupervised Concept Emergence

- Learning content & style via variability constraints: [V3]
- Unsupervised modeling of music structure: [PianoTree-VAE] [MuseBERT]

