Speech Synthesis

 

........ Speech synthesis is the artificial production of human speech. A system that does this is called a speech synthesizer and can be implemented in software or hardware. A speech synthesis program takes a document as written input and converts it into automatically generated synthetic speech as spoken output. For this reason, speech synthesis is sometimes called "Text-to-Speech" (TTS) conversion .......

Speech synthesis can be defined as the automatic generation of speech waveforms using mechanical devices, electronic circuits, or computer simulation. Among speech-related technologies, speech synthesis was the first to be studied. Most early research on speech synthesis simulated the human vocal organs with mechanical devices or electronic circuits. Modeling the human vocal organs remains the ultimate goal of speech synthesis research, but as the processing speed and memory capacity of computers advanced rapidly, the research expanded beyond modeling the vocal organs into text-to-speech conversion technology, which also includes text processing. Conveying a message by speech synthesis has the following advantages.

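Speech synthesis technology can be divided into two broad types according to how it is applied in practice. One is limited-vocabulary synthesis, or the automatic response system (ARS ; Automatic Response System), which synthesizes only sentences built from a limited vocabulary and a limited set of syntactic structures. The other is unlimited-vocabulary synthesis, or the text-to-speech (TTS ; Text-to-Speech) system, which accepts arbitrary sentences as input and synthesizes speech from them. ............ (오영환 1998)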
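The limited-vocabulary approach can be illustrated with a minimal sketch. It is not taken from the cited source: the vocabulary, the function names, and the use of short sine tones as stand-ins for prerecorded words are all assumptions made for illustration. The idea shown is that such a system only concatenates stored units, so any word outside the fixed vocabulary cannot be spoken.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 8000  # Hz; an assumed telephone-style rate


def tone(freq_hz, dur_ms=200):
    """Stand-in for one prerecorded word: a sine tone as 16-bit samples."""
    n = SAMPLE_RATE * dur_ms // 1000
    return [int(12000 * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE))
            for i in range(n)]


# Hypothetical fixed vocabulary; a real ARS would store recorded speech here.
VOCABULARY = {
    "your": tone(300),
    "balance": tone(400),
    "is": tone(500),
    "one": tone(600),
    "two": tone(700),
}

# 50 ms of silence inserted after every word.
PAUSE = [0] * (SAMPLE_RATE * 50 // 1000)


def synthesize_limited(sentence):
    """Concatenate stored units; reject words outside the vocabulary."""
    samples = []
    for word in sentence.lower().split():
        if word not in VOCABULARY:
            raise KeyError(f"out of vocabulary: {word!r}")
        samples.extend(VOCABULARY[word])
        samples.extend(PAUSE)
    return samples


def to_wav_bytes(samples):
    """Pack samples as a mono 16-bit PCM WAV file held in memory."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(SAMPLE_RATE)
        w.writeframes(struct.pack(f"<{len(samples)}h", *samples))
    return buf.getvalue()


if __name__ == "__main__":
    audio = synthesize_limited("Your balance is two")
    with open("ars_demo.wav", "wb") as f:
        f.write(to_wav_bytes(audio))
```

An unlimited-vocabulary (TTS) system cannot rely on such a lookup table, because its input is arbitrary text; it would need text processing and smaller synthesis units in place of the per-word dictionary assumed here.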

term :

Speech    Speech Recognition    Speech Synthesis    Speech Understanding    Understanding    Natural Language Understanding    Natural Language Processing    Artificial Intelligence

site :

Wikipedia : Speech synthesis

AI Topics : Speech Synthesis

Speech synthesis FAQ : CMU, Andrew Hunt    Web pages related to speech synthesis

Bell Labs : text-to-speech synthesis, overview, and demo

paper :

Speech Synthesis : 오영환

Speech Production : Peter Denes, Elliot Pinson

Current Status and Challenges of Speech Synthesis Technology Development : 이양희, 대한음성학회, 1994

Current Status and Prospects of Speech Recognition and Synthesis Technology : 오영환, Yeungnam University International Symposium on Next-Generation Information and Communications, 2000

The Role of Phonetics and Phonology in Speech Recognition and Speech Synthesis : 김기호, 대한음성학회, 1994

Design and Implementation of a Speech Synthesis Engine and a Plug-in for Internet Web Page : 이희만, 김지영, 한국정보처리학회, 2000

Implementation of Text-to-Audio Visual Speech Synthesis Using Key Frames of Face Images : 김진영, 김명곤, 백성준, 대한음성학회, 2002

Speaker-Adaptive Speech Synthesis based on Fuzzy Vector Quantizer Mapping and Neural Networks : 이광형, 이진이, 한국정보처리학회, 1997

Segmental duration modeling for Korean text-to-speech synthesis : 이양희, 대한음성학회, 1996