NCSOFT Speech AI
Publications
(2024) MultiVerse: Efficient and Expressive Zero-Shot Multi-Task Text-to-Speech, Accepted by EMNLP 2024
(2023) Synthe-Sees: Face based Text-to-Speech for Virtual Speaker, Accepted by ICASSP 2024
(2022) Avocodo: Generative Adversarial Network for Artifact-free Vocoder, Accepted by AAAI 2023
(2022) Enhancement of Pitch Controllability using Timbre-Preserving Pitch Augmentation in FastPitch, Accepted by Interspeech 2022
(2022) Hierarchical and Multi-Scale Variational Autoencoder for Diverse and Natural Speech Synthesis, Accepted by Interspeech 2022
(2022) Adversarial Multi-Task Learning for Disentangling Timbre and Pitch in Singing Voice Synthesis, Accepted by Interspeech 2022
(2021) GANSpeech: Adversarial Training for High-Fidelity Multi-Speaker Speech Synthesis, Accepted by Interspeech 2021
(2021) FastPitchFormant: Source-filter based Decomposed Modeling for Speech Synthesis, Accepted by Interspeech 2021
(2021) N-Singer: Non-Autoregressive Korean Singing Voice Synthesis System for Pronunciation Enhancement, Accepted by Interspeech 2021
(2021) Hierarchical Context-Aware Transformers for Non-AutoRegressive Text to Speech, Accepted by Interspeech 2021
(2021) A Neural Text-to-Speech Model Utilizing Broadcast Data Mixed with Background Music, Accepted by ICASSP 2021
(2020) Detecting Mismatch Between Text Script and Voice-Over Using Utterance Verification Based on Phoneme Recognition Ranking, pp. 8264-8268, ICASSP 2020
(2020) VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested Adversarial Network, pp. 200-204, Interspeech 2020
(2020) Speaking Speed Control of End-to-End Speech Synthesis using Sentence-Level Conditioning, pp. 4402-4406, Interspeech 2020
(2020) Effective Emotion Transplantation in an End-to-End Text-to-Speech System, IEEE Access, vol. 8, pp. 161713-161719, 2020
(2020) WaveGlowGAN: the bipartite flow based vocoder with generative adversarial networks for high quality speech synthesis (Submitted)
(2020) Improving End-to-end Korean Voice Command Recognition using Domain-specific Text (Submitted)
(2020) Multi-task Learning using Morphological Information for End-to-end ASR (Submitted)