2024 APSIPA China-Japan Joint Symposium on Speech and Language Processing

Date: September 24-25, 2024

Venue: 55B-204, College of Intelligence and Computing, Peiyangyuan Campus, Tianjin University, China



Call for Poster Submission

       This symposium does not accept paper submissions and will not publish proceedings. We welcome participants to share their research results through poster presentations (60 minutes). Please send your title, abstract, and personal information to the organizing committee's email (wangxiaobao@tju.edu.cn) by September 8, 2024. Acceptance notifications will be sent via email.
       We welcome posters on all topics related to speech information processing. Posters should not exceed 118 cm × 84 cm. Poster boards will be provided at the venue, and you can simply put up your poster upon arrival. The format is flexible, but the content must be in English.


Poster Venue Location

We are pleased to share the details of the upcoming poster session, which will be divided into two groups:
1. Group 1 (Poster Numbers 1-15)
• Presentation Time: September 24, 15:50 - 16:50 (Session 4A)
2. Group 2 (Poster Numbers 16-30)
• Presentation Time: September 24, 16:50 - 17:50 (Session 4B)
The corresponding poster boards will be labeled with your assigned numbers.



Current Poster List

No. Name Affiliation Topic
1 Zekun YANG Nagoya University Multi-Modal Video Summarization Based on Two-Stage Fusion of Audio, Visual, and Recognized Text Information
2 Fengji LI Nagoya University & Beihang University Mandarin Speech Reconstruction from Ultrasound Tongue Images based on Generative Adversarial Networks
3 Shaowen CHEN Nagoya University QHM-GAN: Neural Vocoder based on Quasi-Harmonic Modeling
4 Xiaohan Shi Nagoya University Speech emotion prediction towards development of emotion-aware dialogue systems
5 Jingyi FENG Nagoya University Robustness of TTS Models Trained on Noisy Transcriptions
6 Rui Wang Nagoya University Direction-aware target speaker extraction under noisy underdetermined conditions
7 Hao Shi Kyoto University Speech Enhancement using spectrogram feature fusion for noise robust speech recognition
8 Yahui Fu Kyoto University Dialogue Comprehension and Personalization for Empathetic Response Generation
9 Mewlude Nijat Xinjiang University UY/CH-CHILD -- A Public Chinese L2 Speech Database of Uyghur Children
10 Yikang Wang University of Yamanashi A Study of Guided Masking Data Augmentation for Deepfake Speech Detection
11 Yu-Fei Shi University of Science and Technology of China Pitch-and-Spectrum-Aware Singing Quality Assessment with Bias Correction and Model Fusion
12 Yuta Kamiya Shizuoka University The construction and comparative analysis of a nationwide regional dialect language model and identification model using the Corpus of Japanese Dialects
13 Haopeng Geng The University of Tokyo A Pilot Study of Applying Sequence-to-Sequence Voice Conversion to Evaluate the Intelligibility of L2 Speech Using a Native Speaker’s Shadowings
14 Nobuaki Minematsu The University of Tokyo Measurement of listening behaviors of learners and raters and its application for aural/oral L2 training
15 Rui Wang University of Science and Technology of China Asynchronous Voice Anonymization Using Adversarial Perturbation On Speaker Embedding
16 Chenda Li Shanghai Jiao Tong University Diffusion-based Generative Modeling with Discriminative Guidance for Streamable Speech Enhancement
17 Tamon Mikawa, Yasuhisa Fujii, Yukoh Wakabayashi, Kengo Ohta, Ryota Nishimura, Norihide Kitaoka Toyohashi University of Technology Listener's Head Motion Generation Responding to User's Speech and Head Movements
18 Keigo Hojo, Yukoh Wakabayashi, Kengo Ohta, Atsunori Ogawa, Norihide Kitaoka Toyohashi University of Technology Improving the performance of CTC-based ASR using attention-based CTC loss
19 Tatsunari Takagi, Yukoh Wakabayashi, Atsunori Ogawa, Norihide Kitaoka Toyohashi University of Technology Text-only Domain Adaptation for CTC-based Speech Recognition through Substitution of Implicit Linguistic Information in the Search Space
20 Wei Wang Shanghai Jiao Tong University Advancing Non-intrusive Suppression on Enhancement Distortion for Noise Robust ASR
21 Jiaming Zhou Nankai University kNN-CTC: Enhancing ASR via Retrieval of CTC Pseudo Labels
22 Xu Zhang Beijing University of Technology Iteratively Refined Multi-Channel Speech Separation
23 Xue Yang Beijing University of Technology Coarse-to-Fine Target Speaker Extraction Based on Contextual Information Exploitation
24 Yun Liu National Institute of Informatics, Tokyo, Japan Improving curriculum learning for target speaker extraction with synthetic speakers
25 Zelin Qiu Institute of Acoustics, Chinese Academy of Sciences Exploring Auditory Attention Decoding using Speaker Features
26 Hui Wang Human Language Technologies Lab, Nankai University Advancing MOS Prediction Systems: Enhancing Accuracy, Robustness, and Interpretability
27 Hongcheng Zhang Tianjin University Efficient Singular Spectrum Mode Ensemble for Extracting Wide-Band Components in Overlapping Spectral Environments
28 Yuqin Lin Tianjin University Enhancing Multi-Accent Automated Speech Recognition with Accent-Activated Adapters
29 Cheng Gong Tianjin University An Initial Investigation of Language Adaptation for TTS Systems under Low-resource Scenarios
30 Sun Binbin, Geng Tianqi, Feng Hui Tianjin University A Cross-linguistic Comparison Study on the Prosodic Encoding of Focus and Question Intonation: In the Case of Tianjin Mandarin and American English

Tianjin University