
N-Singer: Non-Autoregressive Korean Singing Voice Synthesis System for Pronunciation Enhancement


Gyeong-Hoon Lee, Tae-Woo Kim, Hanbin Bae, et al.

{ghlee3401, ktw0114, bhb0722}@ncsoft.com


Abstract

    We propose N-Singer, a fast and robust non-autoregressive Korean singing
voice synthesis system that can synthesize high-quality, clearly articulated
Korean singing in parallel. N-Singer consists of a transformer-based mel-generator,
a convolutional network-based postnet, and voicing-aware discriminators. 1) For
accurate pronunciation, N-Singer models linguistic and pitch information separately,
without other acoustic features such as the mel-spectrogram. 2) To produce improved
mel-spectrograms, N-Singer combines transformer-based modules with convolutional
network-based modules. 3) In adversarial training, we use voicing-aware conditional
discriminators to capture the harmonic features of voiced segments and the noise
components of unvoiced segments, respectively. The experimental results show that
N-Singer can synthesize a natural singing voice in parallel with more accurate
pronunciation than the baseline.
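
As a rough illustration of the voicing-aware idea in point 3, the sketch below routes mel-spectrogram frames into voiced and unvoiced groups so that each group could be passed to its own discriminator. It is not the authors' implementation: the function name `split_by_voicing`, the list-based frame representation, and the rule that a frame counts as voiced when its F0 value is greater than zero are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def split_by_voicing(
    mel_frames: Sequence[Sequence[float]],
    f0: Sequence[float],
) -> Tuple[List[Sequence[float]], List[Sequence[float]]]:
    """Route each mel frame to a voiced or an unvoiced bucket.

    Assumption (not from the paper): a frame is voiced when its F0 > 0.
    """
    if len(mel_frames) != len(f0):
        raise ValueError("mel_frames and f0 must have the same number of frames")

    voiced: List[Sequence[float]] = []
    unvoiced: List[Sequence[float]] = []
    for frame, pitch in zip(mel_frames, f0):
        # Voiced frames carry harmonic structure; unvoiced frames are noise-like.
        (voiced if pitch > 0 else unvoiced).append(frame)
    return voiced, unvoiced


# Toy input: three 2-bin mel frames, the middle one unvoiced (F0 == 0).
mel = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
f0 = [220.0, 0.0, 196.0]
voiced, unvoiced = split_by_voicing(mel, f0)
```

In a real system the two groups would more likely be selected with tensor masks inside the training loop; plain lists are used here only to keep the example dependency-free.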

Contents
  1. Audio Samples (Korean)
Demo page of N-Singer


1. Audio Samples (Korean)


Sentence: 잘 지내는 척 해도 돌아서면 혼자
(Pronunciation): jal jinaeneun cheog haedo 'dol-aseomyeon' honja
ATK (Baseline; incorrect pronunciation) | N-Singer (Ours) | Copy-Synthesis (Ground Truth)
 
Sentence: 쓸쓸히 혼자 걸어보다가 다리에 힘이 풀려
(Pronunciation): sseulsseulhi honja geol-eobodaga dalie him-i pullyeo
ATK (Baseline; incorrect pronunciation) | N-Singer (Ours) | Copy-Synthesis (Ground Truth)
 
Sentence: 한참동안 주저앉아 울기만 했어
(Pronunciation): hanchamdong-an jujeoanj-a ulgiman haess-eo
ATK (Baseline; incorrect pronunciation) | N-Singer (Ours) | Copy-Synthesis (Ground Truth)
 
Sentence: 나 사실 너무 힘들어 잘 지내보려해도
(Pronunciation): na sasil neomu himdeul-eo jal jinaebolyeohaedo
ATK (Baseline; incorrect pronunciation) | N-Singer (Ours) | Copy-Synthesis (Ground Truth)
 
Sentence: 깜빡이며 널 기다렸어 무슨 얘길 하고픈지
(Pronunciation): kkamppag-imyeo neol gidalyeoss-eo museun yaegil hagopeunji
ATK (Baseline; incorrect pronunciation) | N-Singer (Ours) | Copy-Synthesis (Ground Truth)
 
Sentence: 화가 나서 소리치듯 가란 내 말에
(Pronunciation): hwaga naseo solichideus galan nae mal-e
ATK (Baseline; incorrect pronunciation) | N-Singer (Ours) | Copy-Synthesis (Ground Truth)
 
Sentence: 오늘 헤어졌어요 우리 헤어졌어
(Pronunciation): oneul heeojyeoss-eoyo uli heeojyeoss-eo
ATK (Baseline; incorrect pronunciation) | N-Singer (Ours) | Copy-Synthesis (Ground Truth)
 
Sentence: 소중했을까 머리 위로 현구름이
(Pronunciation): sojunghaess-eulkka meoli wilo hyeonguleum-i
ATK (Baseline; incorrect pronunciation) | N-Singer (Ours) | Copy-Synthesis (Ground Truth)