Dongchao Yang
The Chinese University of Hong Kong, Hong Kong
Email: dcyang@se.cuhk.edu.hk
Phone: +8615087581161
I am a first-year PhD student at The Chinese University of Hong Kong, majoring in Speech and Audio Processing, supervised by Prof. Helen Meng. Before that, I received my master's degree from Peking University in 2023.
My research focuses on developing a human-like agent that can communicate with humans,
e.g., understanding human speech and environmental sounds, and then producing feedback to humans.
Note: I am actively looking for collaboration opportunities (e.g., Audio Foundation Models, Generative Models, TTS, Text-to-audio...). Please feel free to contact me.
Audio Foundation Models, Generative Models, Large Language Models, Audio/Speech Processing
July 2023 - Sep. 2023
MSRA, Speech Group, Intern.
Supervisor: Xu Tan
May 2021 - May 2023
Tencent AI Lab, Speech Group, Intern.
Supervisor: Songxiang Liu, Chao Weng, and Bo Wu
Dongchao Yang, Jianwei Yu, Helin Wang, Wen Wang, Chao Weng, Yuexian Zou, Dong Yu
Diffsound: Discrete Diffusion Model for Text-to-Sound Generation
IEEE Transactions on Audio, Speech and Language Processing, 2023.
[Code]
Dongchao Yang, Helin Wang, Yuexian Zou
Unsupervised Multi-Target Domain Adaptation for Acoustic Scene Classification
Proc. Interspeech, 2021.
[Code]
Dongchao Yang, Helin Wang, Yuexian Zou, Zhongjie Ye, Wenwu Wang
A Mutual Learning Framework for Few-Shot Sound Event Detection
ICASSP, 2022.
[Code]
Rongjie Huang*, Jiawei Huang*, Dongchao Yang*, Yi Ren, et al.
Make-an-audio: Text-to-audio generation with prompt-enhanced diffusion models
ICML, 2023.
[Code]
Dongchao Yang, Songxiang Liu, Helin Wang, Jianwei Yu, Chao Weng, Yuexian Zou
NoreSpeech: Knowledge Distillation based Conditional Diffusion Model for Noise-robust Expressive TTS
Interspeech, 2023.
[Code]
Dongchao Yang, Helin Wang, Yuexian Zou, Wenwu Wang
A Mixed Supervised Learning Framework for Target Sound Detection
DCASE Workshop (Oral presentation), 2022.
[Code]
Rongjie Huang*, Mingzhe Li*, Dongchao Yang*, Jiatong Shi, et al.
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
AAAI 2024 Demo, 2023.
[Code]
Dongchao Yang*, Songxiang Liu*, Rongjie Huang, Chao Weng, Helen Meng
InstructTTS: Modelling Expressive TTS in Discrete Latent Space with Natural Language Style Prompt
Preprint, 2023.
[Code]