Publications
, ViTex: Visual Texture Control for Multi-Track Symbolic Music Generation via Discrete Diffusion Models, in Proceedings of the 51st IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2026. [demo] [code]
, When Noise Lowers the Loss: Rethinking Likelihood-Based Evaluation in Music Large Language Models, in Proceedings of the 51st IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2026. [demo] [code]
, Evaluating High-Resolution Piano Sustain Pedal Depth Estimation with Musically Informed Metrics, in Proceedings of the 51st IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2026. [code]
, Revisiting Music Encoding for Music-to-Text Large Language Models: What Is Encoded and What Is “Heard”, in Proceedings of Music Encoding Conference (MEI), 2026.
, BOSSA: Learning Music Style Through Cross-Modal Bootstrapping, in NeurIPS 2025 Workshop AI4Music, 2025. [demo]
, Unifying Symbolic Music Arrangement: Track-Aware Reconstruction and Structured Tokenization, in Proceedings of the 39th Conference on Neural Information Processing Systems (NeurIPS), 2025. [demo] [code]
, TOMI: Transforming and Organizing Music Ideas for Multi-Track Compositions with Full-Song Structure, in Proceedings of the 26th International Society for Music Information Retrieval Conference (ISMIR), 2025. [demo] [code]
, Automatic Melody Reduction via Shortest Path Finding, in Proceedings of the 26th International Society for Music Information Retrieval Conference (ISMIR), 2025. [demo] [code]
, High-Resolution Sustain Pedal Depth Estimation from Piano Audio Across Room Acoustics, in Proceedings of the 26th International Society for Music Information Retrieval Conference (ISMIR), 2025. [code]
, Towards Human-Like Music Intelligence via Concept Alignment, PhD Thesis, New York University, 2025.
, Efficient Fine-Grained Guidance for Diffusion Model Based Symbolic Music Generation, in Proceedings of the 42nd International Conference on Machine Learning (ICML), 2025. [demo] [code]
, Unsupervised Disentanglement of Content and Style via Variance-Invariance Constraints, in Proceedings of the 13th International Conference on Learning Representations (ICLR), 2025. [demo] [code]
, Do Large Language Models Perceive Orderly Number Concepts as Humans?, in ICLR 2025 Workshop on Representation Alignment (Re-Align), 2025. [code]
, Whole-Song Hierarchical Generation of Symbolic Music Using Cascaded Diffusion Models, in Proceedings of the 12th International Conference on Learning Representations (ICLR), spotlight presentation, 2024. [demo] [code]
, Exploring GPT's Ability as a Judge in Music Understanding, in Proceedings of the 25th International Society for Music Information Retrieval Conference (ISMIR), 2024. [code]
, Structured Multi-Track Accompaniment Arrangement via Style Prior Modelling, in Proceedings of the 38th Conference on Neural Information Processing Systems (NeurIPS), 2024. [demo] [code]
, ChatMusician: Understanding and Generating Music Intrinsically with LLM, in Findings of the Association for Computational Linguistics (ACL), 2024. [demo] [code]
, Foundation Models for Music: A Survey, in arXiv preprint, arXiv:2408.14340v2 [cs.SD], 2024.
, Controllable Music Inpainting With Mixed-level and Disentangled Representation, in Proceedings of the 48th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023. [demo] [code]
, Audio-To-Symbolic Arrangement Via Cross-Modal Music Representation Learning, in Proceedings of the 47th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022. [music1] [music2] [talk] [code]
, Modeling Perceptual Loudness of Piano Tone: Theory and Applications, in Proceedings of the 23rd International Society for Music Information Retrieval Conference (ISMIR), 2022. [code]
, MuseBERT: Pre-training of Music Representation for Music Understanding and Controllable Generation, in Proceedings of the 22nd International Society for Music Information Retrieval Conference (ISMIR), 2021. [code]
, Learning Interpretable Representation for Controllable Polyphonic Music Generation, in Proceedings of the 21st International Society for Music Information Retrieval Conference (ISMIR), 2020. [music1] [music2] [music3] [talk] [code]
, PianoTree VAE: Structured Representation Learning for Polyphonic Music, in Proceedings of the 21st International Society for Music Information Retrieval Conference (ISMIR), 2020. [talk] [code]
, POP909: A Pop-Song Dataset for Music Arrangement Generation, in Proceedings of the 21st International Society for Music Information Retrieval Conference (ISMIR), 2020. [code]
, BUTTER: A Representation Learning Framework for Bi-directional Music-Sentence Retrieval and Generation, in Proceedings of the 1st Workshop on NLP for Music and Audio (NLP4MusA), 2020. [code]
, Deep Music Analogy Via Latent Representation Disentanglement, in Proceedings of the 20th International Society for Music Information Retrieval Conference (ISMIR), 2019. [music] [tutorial] [code]
, Transferring Piano Performance Control Across Environments, in Proceedings of the 44th International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019.
, A Framework for Automated Pop-song Melody Generation and Piano Accompaniment Arrangement, in Proceedings of the 20th International Society for Music Information Retrieval Conference (ISMIR), 2019.
* indicates equal contribution.