Speech Enhancement Techniques and Their Comparison

GUI based Performance Analysis of Speech Enhancement Techniques

Mr. Shishir Banchhor, Mr. Jimish Dodia, Ms. Darshana Gowda, Ms. Pooja Jagtap
Students, B.E. (EXTC), K.J. Somaiya I.E.I.T, Sion, Mumbai-22
skb.shishir@gmail.com, jimishdodia@gmail.com, gowda.darshana@gmail.com, poojajagtap18@gmail.com

ABSTRACT

Speech, being a fundamental means of communication, is embedded in a wide range of applications. The central methods for enhancing speech are removal of background noise, echo suppression, and artificially introducing certain frequencies into the speech signal. In this project, we study the speech enhancement techniques Spectral Subtraction, Minimum Mean Square Error (MMSE), the Kalman filter, and the Wiener filter.
Based on our observations and analysis of various performance parameters, we conclude which of these methods is most suitable for speech enhancement. The code is implemented with a graphical user interface (GUI) in MATLAB.

Keywords— Speech enhancement, FFT, Spectral subtraction, Kalman filter, Wiener filter, Performance parameters
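
As an illustration of one technique named in the abstract, the following is a minimal sketch of magnitude spectral subtraction on a single frame. It is written in Python rather than MATLAB and is not the authors' implementation. The function names (dft, idft, spectral_subtract), the floor parameter, and the use of a naive DFT are assumptions made for this example. The sketch assumes that an estimate of the noise magnitude spectrum is already available, for instance from a speech-free segment, and that the noisy phase is reused unchanged.

```python
import cmath
import math


def dft(frame):
    """Naive O(N^2) discrete Fourier transform (illustrative, not optimized)."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse discrete Fourier transform matching dft() above."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def spectral_subtract(noisy_frame, noise_magnitude, floor=0.0):
    """Subtract an estimated noise magnitude spectrum from one frame.

    noise_magnitude holds one hypothetical noise estimate per frequency bin.
    Magnitudes that would become negative are clamped to `floor`, and the
    phase of the noisy frame is kept as-is.
    """
    cleaned = []
    for value, noise in zip(dft(noisy_frame), noise_magnitude):
        magnitude = max(abs(value) - noise, floor)
        cleaned.append(cmath.rect(magnitude, cmath.phase(value)))
    # The input is real, so only the real part of the inverse is kept.
    return [sample.real for sample in idft(cleaned)]


if __name__ == "__main__":
    frame = [math.cos(2 * math.pi * 2 * t / 16) for t in range(16)]
    enhanced = spectral_subtract(frame, [2.0] * 16)
    print([round(sample, 3) for sample in enhanced])
```

A practical system would process overlapping windowed frames with an FFT and update the noise estimate over time. This sketch shows only the per-frame subtraction step.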

I. INTRODUCTION

Speech is a fundamental and common medium of communication. Voice-based communication, human-machine and machine-machine interfaces, and automatic speech recognition systems all need to remain reliable in noisy environments. In many cases, these systems work well in nearly noise-free conditions, but their performance deteriorates rapidly in noisy conditions. Therefore, improving existing pre-processing algorithms or introducing an entirely new class of algorithms for speech enhancement is always the



References: [2] Yariv Ephraim and Israel Cohen, "Recent Advancements in Speech Enhancement," March 9, 2004.
