Segmentation of speech signals and extraction of fundamental frequencies on an embedded platform

Κωστελίδης, Βασίλειος Η.

Master Thesis

Author

Κωστελίδης, Βασίλειος Η.

Date

2012-10-11

Abstract

Two important problems in human speech processing are detecting whether a word has been spoken and extracting the pitch of a speaker’s voice. A system that detects whether a word has been spoken (Voice Activity Detector, VAD) can be used in call centers, security systems, professional singing systems, a large number of computer games, and many other applications. Pitch is one of the most distinctive characteristics of the human voice: it is the rate at which the vocal cords vibrate during speech. A pitch extractor can be used for voice recognition in security systems, for detecting the emotion of a given speaker [1], for pitch correction and training of singers, in video games, in speech synthesis, etc. Many pitch extraction algorithms exist, each with advantages and disadvantages in accuracy and execution time depending on the application. Accurate voice activity detection and accurate pitch detection are both decisive elements of word recognition and speaker recognition in a sound system. For voice activity detection, the choice of algorithm was straightforward: the well-known algorithm of Rabiner [2][3]. For pitch extraction, several algorithms were studied in the literature and their basic characteristics were compared; two of them were selected for implementation. For the purposes of this thesis, an embedded system with an 8-bit microcontroller was programmed with the aforementioned algorithms to perform VAD and pitch detection.