What is Auditory Roughness?

From Vocapedia

What is auditory roughness?

Any time two fields use similar terms to describe different phenomena, the potential for confusion exists. A wide range of sounds may be described as rough, many of which likely leverage similar innate reactions of our hearing mechanism. In speech pathology, roughness is an undesirable, uneven sound caused by aperiodic noise in the signal.[1] This is the roughness that we attempt to rehabilitate. The auditory roughness explored in this article is a desirable buzzing quality that we tend to cultivate in robust singing.[2] Despite leveraging similar properties of the hearing system, the buzzy aspect of auditory roughness explored here is present in otherwise clean, periodic voicing and is distinct from the roughness one might associate with a myriad of artistic choices, pathologies, or functional deficits. In other words, even perfectly periodic voicing may elicit auditory roughness.

Auditory roughness exists on a continuum of perceptual responses that occur when multiple stimuli fall within a critical band of hearing. A critical band of hearing is a way to conceive of the frequency resolution limitations of the basilar membrane. Generally speaking, when immediately adjacent points on the basilar membrane are simultaneously stimulated, auditory roughness is the perceptual result. Different authors explain this phenomenon in different ways. Howard and Angus (2017) consider two tones less than 12.5 Hz apart to elicit beats. When the tones are separated by more than about 15 Hz, the percept shifts from beating to a fused percept with a rough quality. As the distance in frequency between the two tones increases further, the rough percept disappears.[3] Heller (2013) uses the term self-dissonance or autodissonance to describe both beating and roughness.[4] Heller suggests that one may think of the intervals between higher harmonics of a complex tone in terms of how those same frequencies would be perceived if they were the fundamentals of two simultaneously presented tones.

Heller’s conceptual model is slightly challenging to use as a rule of thumb. Consider that two tones at 100 Hz and 106 Hz (approximately one half step apart) would produce beats six times per second (see Figure 1). Two tones at 200 Hz and 206 Hz will also beat six times per second, though their musical interval is narrower than a half step (see Figure 2). The sixteenth and seventeenth harmonics of a tone at 100 Hz (1,600 Hz and 1,700 Hz respectively) are approximately a half step apart as well, but do not trigger the same beats because their interference pattern pulses at 100 Hz, well within the frequency range of human hearing (see Figure 3). If the idea of autodissonance encompasses all rough and beating phenomena, one must be cautious not to assume that higher harmonics of a fundamental elicit the same percept as similar intervals made by separately produced fundamentals in a lower frequency range. Alternatively, one might reserve the term beats for phenomena whose pulse rate falls below the lower frequency threshold of hearing.

File:Figure1.png
Figure 1: 6 Hz pulsing pattern formed by simultaneous pure tones at 100 Hz and 106 Hz.
File:Figure2.png
Figure 2: 6 Hz pulsing pattern formed by simultaneous pure tones at 200 Hz and 206 Hz.
File:Figure3.png
Figure 3: 100 Hz pulsing pattern formed by simultaneous pure tones at 1,600 Hz and 1,700 Hz.
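
The arithmetic behind the three figures can be checked with a short script. This is only a sketch: the function names are made up for illustration, and the 20 Hz cutoff is the lower threshold of hearing quoted in this article rather than a sharp perceptual boundary.

```python
import math

# Lower frequency threshold of hearing as quoted in the article (approximate).
LOWER_LIMIT_OF_HEARING_HZ = 20.0


def pulse_rate_hz(f1, f2):
    """Rate at which the interference pattern of two pure tones pulses."""
    return abs(f2 - f1)


def interval_cents(f1, f2):
    """Musical interval between two frequencies; 100 cents is one equal-tempered half step."""
    return 1200.0 * math.log2(max(f1, f2) / min(f1, f2))


# The tone pairs shown in Figures 1, 2, and 3.
pairs = [(100.0, 106.0), (200.0, 206.0), (1600.0, 1700.0)]

for f1, f2 in pairs:
    rate = pulse_rate_hz(f1, f2)
    cents = interval_cents(f1, f2)
    if rate < LOWER_LIMIT_OF_HEARING_HZ:
        kind = "pulse rate below 20 Hz: heard as beats"
    else:
        kind = "pulse rate above 20 Hz: not heard as countable beats"
    print(f"{f1:g} Hz + {f2:g} Hz: {cents:.1f} cents apart, pulses at {rate:g} Hz ({kind})")
```

The first and third pairs are both close to a half step (about 101 and 105 cents), yet their pulse rates differ by a factor of more than sixteen, which is the reason the interval alone does not predict the percept.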

As mentioned above, when two tones approach very narrow intervals, strong pulses or beats are perceived that relate to their difference in Hz. These beats have both a correlate in the physical world and a cognitive aspect. When combined in the air, two tones separated by 6 Hz (e.g., 100 Hz and 106 Hz) will form an interference pattern that pulses six times per second (see Figure 1). As this pulsing is periodic, it could conceivably trigger a pitch percept. However, since 6 Hz is well below the 20 Hz lower threshold of hearing, this is heard as rhythmic beating rather than as a pitch. The rate of pulsing slows as the interval approaches a perfect unison. A less pronounced beating quality (binaural beats) also exists if these two separate tones are played over headphones, one in each ear. This implies a cognitive mechanism may be at play as well.[5]
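
The physical side of this pulsing follows from a standard trigonometric identity. The derivation below is a sketch that assumes two pure tones of equal amplitude and zero starting phase; the symbols f_1, f_2, and t are introduced here for illustration and do not appear in the sources cited.

```latex
\sin(2\pi f_1 t) + \sin(2\pi f_2 t)
  = 2\cos\!\left(2\pi\,\frac{f_2 - f_1}{2}\,t\right)
     \sin\!\left(2\pi\,\frac{f_1 + f_2}{2}\,t\right)

% For f_1 = 100 Hz and f_2 = 106 Hz:
\sin(2\pi \cdot 100\,t) + \sin(2\pi \cdot 106\,t)
  = 2\cos(2\pi \cdot 3\,t)\,\sin(2\pi \cdot 103\,t)
```

The sine factor is a tone at the average frequency (103 Hz), and the cosine factor is a slow envelope at half the difference (3 Hz). Because loudness follows the magnitude of the envelope, which peaks twice per cosine cycle, the pattern pulses at f_2 - f_1 = 6 times per second. As f_2 approaches f_1, the envelope slows and finally stops changing at a perfect unison.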

Exploring auditory roughness experientially

Auditory roughness may be easily explored with two pure tones created in Madde, Audacity, or any other tone generating program or app. Hold the frequency of the first tone constant at C5 (~523 Hz) and slowly bring the second tone from G5 (~784 Hz) down to E♭5 (~622 Hz). As the second tone comes closer in frequency to the first, notice the percept of dissonant buzzing. If the tones are sufficiently intense, the difference tone (a pitch equivalent to the difference between those frequencies) may also be heard. As pure tones approach one another in frequency, they begin to produce a beating sound like a metrical pulsing of intensity. This is strongly noticeable at the interval of a half step and smaller in this frequency range. As the tones become even closer, the beats slow down until they disappear at a perfect unison. The buzzing percept will exist well before those beats emerge.[6]
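
For readers without a tone generator at hand, the two-tone exercise can be approximated with a short script that uses only the Python standard library. This is a minimal sketch rather than a substitute for the programs named above: the function names, the sample rate, the amplitude, and the output file names are all assumptions chosen for illustration, and the second tone steps down by half steps instead of gliding continuously.

```python
import math
import struct
import wave

SAMPLE_RATE = 44100  # samples per second (assumed; any standard rate works)

# MIDI note numbers for the pitches named in the text (A4 = 69).
C5, E_FLAT_5, G5 = 72, 75, 79


def midi_to_hz(note, a4=440.0):
    """Equal-tempered frequency of a MIDI note number."""
    return a4 * 2.0 ** ((note - 69) / 12.0)


def two_tone(f1, f2, seconds=2.0, amplitude=0.4):
    """Float samples of two equal-amplitude pure tones mixed together."""
    n_samples = int(SAMPLE_RATE * seconds)
    return [
        amplitude * (math.sin(2 * math.pi * f1 * i / SAMPLE_RATE)
                     + math.sin(2 * math.pi * f2 * i / SAMPLE_RATE))
        for i in range(n_samples)
    ]


def write_wav(path, samples):
    """Write samples (clipped to [-1, 1]) as mono 16-bit PCM."""
    frames = b"".join(
        struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767)) for s in samples
    )
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(frames)


def render_sweep(prefix="c5_plus_note_"):
    """Write one file per half step as the second tone descends from G5 to E-flat 5."""
    fixed = midi_to_hz(C5)
    for note in range(G5, E_FLAT_5 - 1, -1):
        write_wav(f"{prefix}{note}.wav", two_tone(fixed, midi_to_hz(note)))


# Print the frequencies involved and the difference tone at each step.
for note in range(G5, E_FLAT_5 - 1, -1):
    moving = midi_to_hz(note)
    print(f"C5 {midi_to_hz(C5):.1f} Hz + note {note} {moving:.1f} Hz, "
          f"difference {moving - midi_to_hz(C5):.1f} Hz")
```

Calling `render_sweep()` writes five short files to the current directory; listening to them in order from note 79 down to note 75 approximates the descent described above.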

Next, generate a sawtooth wave at the pitch C4 (~262 Hz), again in Madde, VoceVista Video Pro, or Audacity. Record it into VoceVista Video Pro or Praat (or another audio app capable of pass filtering). Pass filter the lowest five harmonics and notice that they lack roughness. Pass filter from the fifth harmonic (5fo) and higher and notice how buzzy the quality is. For the most part, the pitch of the sawtooth does not matter. This rough division of the spectrum at the fifth harmonic will generally separate the pure sound from the rough sound.
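
The same split can be sketched in code. Note that this swaps in a different technique from the one described above: instead of pass filtering a recorded sawtooth, it builds each band directly by additive synthesis, summing sine waves at the harmonic frequencies with amplitudes proportional to 1/n, as in an ideal sawtooth. The function names, the sample rate, and the choice of 40 as the highest harmonic are assumptions for illustration. Both bands include the fifth harmonic, mirroring the wording of the exercise.

```python
import math

SAMPLE_RATE = 44100  # samples per second (assumed)
FO = 261.63          # C4 in Hz
TOP_HARMONIC = 40    # 40 * 261.63 Hz stays below the 22,050 Hz Nyquist limit


def sawtooth_partials(fo, first, last):
    """(frequency, amplitude) pairs for harmonics first..last of an ideal sawtooth."""
    return [(n * fo, 1.0 / n) for n in range(first, last + 1)]


def synthesize(partials, seconds=1.0):
    """Sum the partials as sine waves and normalize the peak to 0.8."""
    n_samples = int(SAMPLE_RATE * seconds)
    raw = [
        sum(a * math.sin(2 * math.pi * f * i / SAMPLE_RATE) for f, a in partials)
        for i in range(n_samples)
    ]
    peak = max(abs(s) for s in raw) or 1.0
    return [0.8 * s / peak for s in raw]


low_band = sawtooth_partials(FO, 1, 5)               # expected to sound pure
high_band = sawtooth_partials(FO, 5, TOP_HARMONIC)   # expected to sound buzzy

print("low band :", [round(f, 1) for f, _ in low_band])
print("high band:", round(high_band[0][0], 1), "Hz up to",
      round(high_band[-1][0], 1), "Hz")
```

The lists returned by `synthesize(low_band)` and `synthesize(high_band)` can be written to audio files with any WAV writer for listening. Changing `FO` moves every harmonic by the same ratio, so the intervals between neighboring harmonics stay the same.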

A loose definition of auditory roughness then suggests that, in terms of musical intervals, any two separately pitched tones a minor third or closer will generally fall within a critical band of hearing. This threshold is wider at lower frequencies,[7] and is slightly wider than an equally tempered minor third; however, for much of the singable range, the minor third is a useful, if simplified, rule.[8] This rule may be applied to harmonics as they appear on a spectrogram. Generally speaking, the percept of auditory roughness in a single voice will always be buzziness, never beats.
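
To see how the minor third rule lines up with the division at the fifth harmonic, one can compute the interval between each pair of neighboring harmonics. The sketch below is illustrative only: the function name is made up, and 300 cents (the equally tempered minor third) is used as a reference point, not as a measured critical bandwidth.

```python
import math

MINOR_THIRD_CENTS = 300.0  # equally tempered minor third, used only as a reference


def cents_between_harmonics(n):
    """Interval in cents between harmonic n and harmonic n + 1 of any fundamental."""
    return 1200.0 * math.log2((n + 1) / n)


for n in range(1, 11):
    cents = cents_between_harmonics(n)
    relation = "wider than" if cents > MINOR_THIRD_CENTS else "within"
    print(f"harmonics {n:2d} and {n + 1:2d}: {cents:7.1f} cents "
          f"({relation} an equally tempered minor third)")
```

The fundamental frequency cancels out of the ratio (n + 1)/n, which is consistent with the observation that the pitch of the sawtooth mostly does not matter. Harmonics 4 and 5 lie about 386 cents apart, while harmonics 5 and 6 lie about 316 cents apart, only slightly wider than the equally tempered minor third, and every pair above that is narrower than 300 cents.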


[This excerpt is from Ian Howell's forthcoming book, Perception of the Singing Voice: A Pedagogic Model.]

  1. "Voice Qualities". The National Center for Voice and Speech.
  2. This framing is attributed to Bozeman in lectures.
  3. Howard, David M., and Jamie Angus. Acoustics and Psychoacoustics. Routledge, 2017, 84.
  4. Heller, Eric (2013). Why You Hear What You Hear: An Experiential Approach to Sound, Music, and Psychoacoustics. Princeton: Princeton University Press. p. 508.
  5. Chaieb, Leila, Elke Caroline Wilpert, Thomas P. Reber, and Juergen Fell. “Auditory Beat Stimulation and Its Effects on Cognition and Mood States.” Frontiers in Psychiatry 6 (2015), 70. https://doi.org/10.3389/fpsyt.2015.00070
  6. Howard, David M., and Jamie Angus. Acoustics and Psychoacoustics. Routledge, 2017, 85.
  7. Vassilakis, Pantelis N. “Perceptual and Physical Properties of Amplitude Fluctuation and their Musical Significance” (PhD diss., University of California, Los Angeles, 2001), 194.
  8. Sundberg explores this idea in some detail in Sundberg, Johan. The Science of the Singing Voice (Dekalb: Northern Illinois University Press, 1987), 108-9, and Sundberg, Johan. “Perceptual Aspects of Singing.” Journal of Voice 8/2, (1994) 114.

Authored by: Ian Lauchlin Howell

© 1944- National Association of Teachers of Singing, Inc. Reproduction without explicit permission is prohibited. All Rights Reserved.