Call for Poster Submission
This conference does not accept paper submissions or publish proceedings. We
welcome participants to share their research results through poster presentations
(60 minutes each). Please send your title, abstract, and personal information to the
conference organizing committee's email (wangxiaobao@tju.edu.cn) by
September 8, 2024. Acceptance notifications will be sent via email.
We welcome posters on various topics related to speech information processing. Posters
should not exceed 118 cm x 84 cm in size. Boards will be provided at the venue,
and you can simply put up your poster upon arrival. The format is flexible, but
the content must be in English.
Poster Venue Location
We are pleased to share the details of the upcoming poster session, which will be divided into two groups:
1. Group 1 (Poster Numbers 1-15)
• Presentation Time: September 24, 15:50 - 16:50 (Session 4A)
2. Group 2 (Poster Numbers 16-30)
• Presentation Time: September 24, 16:50 - 17:50 (Session 4B)
The corresponding poster boards will be labeled with your assigned numbers.
Current Poster List
No. | Name | Affiliation | Topic |
---|---|---|---|
1 | Zekun YANG | Nagoya University | Multi-Modal Video Summarization Based on Two-Stage Fusion of Audio, Visual, and Recognized Text Information |
2 | Fengji LI | Nagoya University & Beihang University | Mandarin Speech Reconstruction from Ultrasound Tongue Images based on Generative Adversarial Networks |
3 | Shaowen CHEN | Nagoya University | QHM-GAN: Neural Vocoder based on Quasi-Harmonic Modeling |
4 | Xiaohan Shi | Nagoya University | Speech emotion prediction towards development of emotion-aware dialogue systems |
5 | Jingyi FENG | Nagoya University | Robustness of TTS Models Trained on Noisy Transcriptions |
6 | Rui Wang | Nagoya University | Direction-aware target speaker extraction under noisy underdetermined conditions |
7 | Hao Shi | Kyoto University | Speech Enhancement using spectrogram feature fusion for noise robust speech recognition |
8 | Yahui Fu | Kyoto University | Dialogue Comprehension and Personalization for Empathetic Response Generation |
9 | Mewlude Nijat | Xinjiang University | UY/CH-CHILD -- A Public Chinese L2 Speech Database of Uyghur Children |
10 | Yikang Wang | University of Yamanashi | A Study of Guided Masking Data Augmentation for Deepfake Speech Detection |
11 | Yu-Fei Shi | University of Science and Technology of China | Pitch-and-Spectrum-Aware Singing Quality Assessment with Bias Correction and Model Fusion |
12 | Yuta Kamiya | Shizuoka University | The Construction and Comparative Analysis of a Nationwide Regional Dialect Language Model and Identification Model Using the Corpus of Japanese Dialects |
13 | Haopeng Geng | The University of Tokyo | A Pilot Study of Applying Sequence-to-Sequence Voice Conversion to Evaluate the Intelligibility of L2 Speech Using a Native Speaker’s Shadowings |
14 | Nobuaki Minematsu | The University of Tokyo | Measurement of listening behaviors of learners and raters and its application for aural/oral L2 training |
15 | Rui Wang | University of Science and Technology of China | Asynchronous Voice Anonymization Using Adversarial Perturbation On Speaker Embedding |
16 | Chenda Li | Shanghai Jiao Tong University | Diffusion-based Generative Modeling with Discriminative Guidance for Streamable Speech Enhancement |
17 | Tamon Mikawa, Yasuhisa Fujii, Yukoh Wakabayashi, Kengo Ohta, Ryota Nishimura, Norihide Kitaoka | Toyohashi University of Technology | Listener's Head Motion Generation Responding to User's Speech and Head Movements |
18 | Keigo Hojo, Yukoh Wakabayashi, Kengo Ohta, Atsunori Ogawa, Norihide Kitaoka | Toyohashi University of Technology | Improving the performance of CTC-based ASR using attention-based CTC loss |
19 | Tatsunari Takagi, Yukoh Wakabayashi, Atsunori Ogawa, Norihide Kitaoka | Toyohashi University of Technology | Text-only Domain Adaptation for CTC-based Speech Recognition through Substitution of Implicit Linguistic Information in the Search Space |
20 | Wei Wang | Shanghai Jiao Tong University | Advancing Non-intrusive Suppression on Enhancement Distortion for Noise Robust ASR |
21 | Jiaming Zhou | Nankai University | kNN-CTC: Enhancing ASR via Retrieval of CTC Pseudo Labels |
22 | Xu Zhang | Beijing University of Technology | Iteratively Refined Multi-Channel Speech Separation |
23 | Xue Yang | Beijing University of Technology | Coarse-to-Fine Target Speaker Extraction Based on Contextual Information Exploitation |
24 | Yun Liu | National Institute of Informatics, Tokyo, Japan | Improving curriculum learning for target speaker extraction with synthetic speakers |
25 | Zelin Qiu | Institute of Acoustics, Chinese Academy of Sciences | Exploring Auditory Attention Decoding Using Speaker Features |
26 | Hui Wang | Human Language Technologies Lab, Nankai University | Advancing MOS Prediction Systems: Enhancing Accuracy, Robustness, and Interpretability |
27 | Hongcheng Zhang | Tianjin University | Efficient Singular Spectrum Mode Ensemble for Extracting Wide-Band Components in Overlapping Spectral Environments |
28 | Yuqin Lin | Tianjin University | Enhancing Multi-Accent Automated Speech Recognition with Accent-Activated Adapters |
29 | Cheng Gong | Tianjin University | An Initial Investigation of Language Adaptation for TTS Systems under Low-resource Scenarios |
30 | Sun Binbin, Geng Tianqi, Feng Hui | Tianjin University | A Cross-linguistic Comparison Study on the Prosodic Encoding of Focus and Question Intonation: In the Case of Tianjin Mandarin and American English |