On Musical Selection in Research: 4. Pitch

Did you hear the one about how harmony is the key to music? That really struck a chord.

If one had to sum up existence in a single word, oscillation would do nicely. Feynman often used the word jiggling, because it’s funnier.

“The world is a dynamic mess of jiggling things if you look at it right. And if you magnify it, you can hardly see anything anymore, because everything is jiggling and they’re all in patterns and they’re all lots of little balls. It’s lucky that we have such a large scale view of everything, and we can see them as things – without having to worry about all these little atoms all the time.”

Richard Feynman, Fun to Imagine, “Rubber Bands,” 1983

You and I and everything we know are merely collections of oscillating particles and waves in various media of varying mystery. Our sensory systems detect these jigglings in myriad ways and with myriad sensitivities. Comparing, say, olfaction to proprioception would be rather nonsensical, but I think we can make a good case for comparing hearing to everyone’s favorite perceptual process, vision.

Light information is complex, and quite a lot of the brain dedicates itself to perceiving and decoding this data stream, eventually cobbling together a rough approximation of our visual surroundings. While visual information is remarkably important for typical humans, the complexity of this perceptual process means we’re not quite as time-aware of visual information as we’d like to believe. To illustrate this, we’re going to look at visual recreations (movies) vs. sound recreations to see the difference in resolution that tricks the eye into perceiving movement and what it takes to trick the ear into the same thing.

Frame rate, or frames per second (FPS), is a measure of how many still images are displayed per second to give the impression of a smoothly flowing moving scene. At about 10-12 FPS and slower, humans register each still as an individual image[1]. Above that threshold, the stills begin to read as seamless motion, though at first it looks quite choppy. A standard US film runs at 24 FPS to avoid the choppy, sped-up look of older films. The faster the frame rate, the smoother the illusion appears, as depicted here:

A great, simple demonstration of four different frame rates.
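The arithmetic behind these frame rates is easy to check for yourself. A minimal Python sketch (the FPS figures are just the ones mentioned above):

```python
# Each frame rate implies a per-frame display time; the slower the
# rate, the longer each still image lingers on screen.

def frame_duration_ms(fps: float) -> float:
    """How long a single frame is displayed, in milliseconds."""
    return 1000.0 / fps

for fps in (12, 24, 60):
    print(f"{fps:>3} FPS -> {frame_duration_ms(fps):.1f} ms per frame")
# At 12 FPS, each still lingers ~83 ms -- long enough to register
# individually; at 24 FPS (~41.7 ms), the stills fuse into motion.
```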

Perceptual sound information is objectively less complex, and this means we process this data stream more directly through the auditory system. This allows for a more rapid translation of external oscillations into auditory temporal neural signals. People who edit both film and audio deal with this in a practical way all the time. Visually, snapping edits to the nearest frame at 24 FPS is perfectly fine, but a single frame is a little over 40 ms, which is enough to sound out of sync to most listeners. To try this yourself, just get a drummer to play a drum machine with a 40 ms latency on the drum sounds. This demonstrates a concept which I have taken to calling:

Cognitive Temporal Resolution

Compare the frame rate video above to the below demonstration of three different audio sampling rates.

Examples of 44.1 kHz, 22.05 kHz, and 11.025 kHz digital audio sampling rates.

Because their oscillations are quicker, and thus closer together, higher frequency instruments like cymbals are a dead giveaway when it comes to sound quality. Sample rate is the number of samples taken per second in a digital audio recording. It looks like this:

Visualization of sampling rates. X-axis is time.

If you’re interested, you can learn more about sampling rate, bitrate, and bit depth here:

You’ll note that we discuss these audio resolutions in the tens of thousands, as opposed to the couple dozen or so required in temporally arranged visual information. Heck, humans can drum roll faster than film frames fly by, and it still sounds like individual drum hits, rather than tricking our brains like a fast visual frame rate does. Or for a bit of fun, take a listen to these accelerating snare hits, which begin at 16th notes on a lazy 60 BPM and steadily accelerate.

808 snare 16th note accelerando from 60 BPM to 999 BPM

First, a rising bass note creeps in, then, toward the end of the clip, the individual snare hits grow indistinguishable. At this point, we perceive them as an audible pitch.
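You can compute where that fusion happens. A sixteenth-note roll produces 4 hits per beat, and the conventional lower edge of pitch perception sits around 20 Hz (treat that exact figure as approximate):

```python
# Sixteenth notes subdivide each beat into 4, so a roll at a given
# tempo produces (bpm / 60) * 4 hits per second.

def sixteenth_rate_hz(bpm: float) -> float:
    return bpm / 60.0 * 4.0

print(sixteenth_rate_hz(60))    # 4 hits/sec: clearly separate events
print(sixteenth_rate_hz(999))   # ~66.6 hits/sec: heard as a low tone

# Tempo at which the roll crosses the ~20 Hz pitch floor:
crossover_bpm = 20.0 * 60.0 / 4.0
print(f"Hits begin fusing into a tone around {crossover_bpm:.0f} BPM")
```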


Frequency is a physical attribute, a measure of the number of oscillations within a medium over time. Pitch is the human perception and subsequent analysis of frequency. In English and most cultures, we think of this process as the ability to arrange stimuli along a spectrum of high and low tones.

The average range of a young person’s hearing is roughly 20 – 20,000 Hz. A fun fact: the standard sampling rate is 44,100 Hz because a sampling rate must be at least double the highest frequency it represents in order to capture one full wave cycle. So, the highest frequency a sampling rate of 44,100 samples per second can convey is 22,050 Hz, which, you’ll notice, is higher than the human range of hearing. You can read more about higher sampling rates and the perception of digital audio with virtual instruments for a bit more nuance on that story.
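That Nyquist relationship is simple enough to sketch (a minimal illustration; the sample rates are the three from the audio demonstration above):

```python
# A sampling rate of sr samples per second can faithfully represent
# frequencies only up to sr / 2 (the Nyquist limit): any higher and
# there aren't two samples per wave cycle.

def nyquist_limit_hz(sample_rate: int) -> float:
    return sample_rate / 2.0

for sr in (44_100, 22_050, 11_025):
    print(f"{sr:>6} samples/s -> top representable tone: {nyquist_limit_hz(sr):.1f} Hz")

# 44.1 kHz clears the ~20,000 Hz ceiling of young human hearing:
assert nyquist_limit_hz(44_100) > 20_000
```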

As you age, the top frequency of your hearing range lowers. You can test where you’re currently at here if you have even marginally decent speakers. Bass is actually pretty tough to test at home due to practical factors like speaker size and placement, standing waves, room tuning, etc., but testing higher frequencies is straightforward. I’m 36 years old, and my hearing cuts out somewhere between 16,000 and 16,500 Hz.
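If you'd rather generate your own test tones than rely on a website, here's a minimal pure-Python sketch. It only produces the raw samples; writing them to a WAV file or playing them back is left to whatever audio library you prefer:

```python
import math

def sine_tone(freq_hz: float, duration_s: float = 1.0, sr: int = 44_100):
    """Samples of a sine tone at the given frequency, floats in [-1, 1]."""
    n = int(duration_s * sr)
    return [math.sin(2 * math.pi * freq_hz * i / sr) for i in range(n)]

# Sweep upward in 500 Hz steps and note where the tone disappears for you.
test_frequencies = list(range(12_000, 20_001, 500))
tone = sine_tone(16_000, duration_s=0.5)
print(f"{len(tone)} samples generated, peak amplitude {max(tone):.3f}")
```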

Below, you’ll find a chart that shows average spectral ranges for standard Eurotraditional musical instruments. Take things like this with a grain of salt, but it’s a nice visualization nonetheless.

Go here for higher resolution, or here for an alternate interactive version

This is a mixing guide, hence the handy breakdown of ranges along the bottom. Notice also that after around 4 or 5 kHz, humans lose the ability to extract pitch information from sound [1]. So, everything we’ll be talking about for the remainder of this section will deal mostly with sounds that occur below that shelf.

Pitch Perception

I’m going to draw a lot of the following info from the book Pitch: Neural Coding and Perception, which includes not only a wealth of useful information, but also the following list of unanswered questions:

  1. How is phase-locked neural activity transformed into a rate-place representation of pitch?
  2. Where does this transformation take place, and what types of neurons perform the analysis?
  3. Are there separate pitch mechanisms for resolved and unresolved harmonics?
  4. How do the pitch mechanisms interact with the grouping mechanisms so that the output of one influences the processing of the other and vice versa?
  5. How and where is the information about pitch used in object and pattern identification?

A basic flowchart of auditory signal flow.

Pitch: Neural Coding and Perception, Plack, Oxenham, Fay, 2005

Notice all the question marks? This is just another example of how little we truly know about how pitch gets perceived.

That said, here are some things we do know. Frequency is related to loudness, which has been mapped using the equal-loudness contour chart. Low frequencies are processed differently than high frequencies, although the exact mechanism for how this works is still mysterious. This loudness contour is very much in line with our inherent auditory preference for vocals. Pitch perception involves top-down processing [1]. It’s probably both top-down and bottom-up to some extent. Top-down essentially means “cognitive,” while bottom-up means “perceptual,” although I’m sure plenty of scientists would argue with that semantic simplification. You might also consider them to mean “analytic” and “triggered” respectively.


You’ll recall from part one how amorphous and subjective music harmony is as a theoretical concept. This may be why so much research regarding it is rather amorphous and unspecific, with individual studies often at odds with one another. See examples of this here, here, and here.

The concept of harmonic structure (especially in Eurotraditional theory) is based on the idea of an interactive tension-resolution cycle between consonance and dissonance. I’m going to go with a pretty book on this phenomenon, Neurobiological Foundations for the Theory of Harmony in Western Tonal Music. Much of the below info is paraphrased or quoted from that text.

Consonance generally means a bunch of tone harmonics line up. This means neurons that are used to firing/vibrating/resonating in sync, do so. The brain initially likes this, because it matches what it often hears in nature.

Dissonance (or roughness) means a bunch of neurons either can’t figure out if they should resonate together or are, somehow, cognitively bumping into each other, so to speak. The brain initially gets annoyed by this.

However, if you play too many sequential harmonies considered consonant by the listener, it often starts to sound boring. As we’ve already covered, brain hate boring. To avoid this static state, we arrange harmonies of varying dissonance/roughness/tension for a while before resolving that tension with a consonant harmony. The most relieving voicing of that chord will usually have the key’s fundamental pitch (the tonic note) in the bass part. This is (probably) because we are cognitively expecting the fundamental due to learned real-world neural clusters. Thus, hearing this resolution engages the neurochemical expectation/reward system. Chord progression (along with volume changes and the entrance of a human voice) has been strongly tied to the chills response or frisson often studied in musical neuroscience, because it’s easy for test subjects to check a yes/no box about whether they have experienced it.

A demonstration of hearing a sounded vs. missing fundamental.

Harmony can be thought of as having both a vertical and a horizontal dimension, directly analogous to music arranged along a staff. Two or more pitches played at the same time form the vertical dimension; sounding pitches in succession over time forms the horizontal dimension. The horizontal dimension involves psychological priming, meaning perception of one stimulus affects how those following it are perceived, and strongly favors top-down processing. The time window over which sound information is integrated in the vertical dimension spans about a tenth of a second to a few seconds, i.e. from sixteenth notes to tied whole notes at 120 beats per minute. Thus, many minimalist pieces fail to register as true harmonic progressions because their chord changes fall outside of this perceptual window.
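Those window endpoints are straightforward to compute from the note values mentioned above, since a quarter note lasts 60/BPM seconds:

```python
# A quarter note lasts 60 / bpm seconds; other note values scale
# proportionally (a sixteenth is a quarter of a beat, a whole note is
# four beats, two tied whole notes are eight).

def note_duration_s(bpm: float, beats: float) -> float:
    return 60.0 / bpm * beats

bpm = 120
print(f"sixteenth note:       {note_duration_s(bpm, 0.25)} s")
print(f"whole note:           {note_duration_s(bpm, 4.0)} s")
print(f"two tied whole notes: {note_duration_s(bpm, 8.0)} s")
```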

If you’re interested, I really like the City University of Hong Kong’s online auditory neuroscience resource, which has an excellent list of articles regarding harmony and pitch as well as resources for other musical elements. This discussion handily leads us into one of the most interesting phenomena that arise from music perception.

The horizontal dimension deals with both successive chords/tone clusters (harmonic progressions) and individual tone lines (melodies) almost always in the topmost voicing, which leads us nicely into the next pitched musical element.


Melody is the combination of rhythm and pitch, a linear succession of tones perceived as a single entity. A melody as a single unit emerges following top-down processing. You can have a melodic bass line, but due to the aforementioned difference in processing low and high frequencies, a lot of this information is processed rhythmically instead of melodically. Extracting tonal info from purely sub-bass sounds is difficult for the average listener, as opposed to bass tones with lots of higher-frequency information such as would be achieved with bass effects pedals, for example.

The function of melody is closely tied to memory and expectation. You can imagine melodic expectation as a sort of cognitive temporal probability cloud constantly running during active listening. Melodic expectation is probably learned rather than innate and depends on cultural background and musical training. The process for expectancy of harmonic and melodic info is additive, i.e., we compute the rhythmic pitch line and harmonic info together to analyze what psychologists call – and I’m not making this up – a music scene.

Unfortunately, psychologists never specify which music scene.

The multitude of melodic perception/expectation models mostly describe different aspects of the same system. One, called melodic segmentation, is used in computational analyses and uses formulae to automatically separate melodies into small segments, much like the short repeating units in postminimalist music. The well-known researcher Jamshed Bharucha helped pioneer the fascinating concept of melodic anchoring, which describes tone stability and instability based on where a pitch falls in its harmonic context. Repetition helps us commit complete, pleasing melodic lines to memory, which is the basis for the concept of the hook used in folk and pop music. This also likely cements how we expect later melodies to progress, creating a sort of taste-feedback-loop.

Rhythmic pulses and harmony form the basis of music like Reich’s Electric Counterpoint mentioned in part three. Later in the same piece, melodic segmentation provides discrete repeating melodic note groups, approaching but not quite arriving at a traditional melody. Now, with the addition of true melodies, we are able to build the majority of music. All that’s left is determining the context and tradition of what we want our music to sound like. This involves choosing the remaining elements of the musical cake: timbre (instrumentation, production, loudness), overall structure/form, and in many cases lyrical content.


This word is loaded and the source of much contention [1][2][3][4]. It is a purely social and semantic method of organizing sound. Like many categorization methods, it is subjective and rife with overlap, gray areas, and fusion practices which then give rise to new definitions that further muddy the waters. But it is directly related to the context or tradition of how music is received and so shall be addressed.

Genre terminology is best used as shorthand for music discussion. “Alternative rock with funk elements” invokes alternative, rock, and funk to quickly communicate what to expect when listening to the Red Hot Chili Peppers. One could then sonically associate similar bands like Primus, Living Colour, Lenny Kravitz, etc., highlighting the communicative advantages of the system. Genre preference is often a form of sociocultural identity, making it an important aspect of human existence.

Genre preference involves an individual’s relationship to the familiar and the unfamiliar. Musical training/expertise strongly affects a listener’s reaction to unfamiliar music. The more expertise an individual has, the more likely they are to enjoy unfamiliar music, probably due to better-rehearsed top-down processing. Music novices, meanwhile, are better able to perceive elements of familiar music, which adds to their enjoyment of it.

Many publications, critics, theorists, and analysts now decry the obsession with genre in popular music award shows. The Music Genome Project is an initiative by the founders of Pandora Radio that automates playlist generation by way of seed association. This means the listener chooses a song, album, artist, or genre, which is then examined for attribute keywords that the project considers genre-definers. This is achieved with the help of a team of analysts assigning attributes to individual songs. While the exact list is a trade secret, you can view an approximation of the 450 [1] attributes listed here alphabetically, and here by type. For the purposes of this article, one thing stands out:

The vast majority of attributes in the Music Genome Project describe melodic elements.

Regardless of how much we talk about the importance of form and rhythm in music, what really defines how a person receives and associates with music is its pitched content. While rhythmic cadences are vital in genre definition, attributes relating to vocals, lyrics, pace/complexity of harmonic progression, instrumentation, timbre of instrumentation (including things like guitar distortion levels), and so on dominate the list.

The direct mechanical effects of rhythmic auditory stimuli (such as on the motor cortex or cardiovascular system) do not seem to exist with regards to most pitched and harmonic information, with the exception of low frequencies that register in part as perceptually rhythmic. To sum up, melodic/harmonic pitch perception:

  1. Takes longer to develop in humans
  2. Relies more on learned environmental patterns
  3. Displays greater diversity across cultures
  4. Promotes greater subjectivity in the listener

How and why these aspects all tie in together is grounds for exciting research. It also suggests an explanation for why the pentatonic scale in particular is so universal. A fundamental with harmonic overtones matching this scale appears in naturally occurring tones more often than other spectral architectures, which means that these expectant neuron clusters will form in humans regardless of background. Whether there is a deeper or more innate framework for this beyond world experience growing up is not yet known.


Musical structure is memory. A list of musical movements/genres is also largely a list of popular structures throughout history. Song structure lends itself quite easily to analysis, which is why I’m not going to rehash it too closely here. The most important thing to know is that composers use regular returns to musical events to ground listeners in familiar territory before deviating again into new content.

Most of the following is paraphrased or directly quoted from Bob Snyder’s excellent book, Music & Memory: An Introduction.

Memory is the ability of neurons to alter the strength and number of their connections to each other in ways that extend over time. Memory consists of three processes: echoic memory and early processing; short-term memory; and long-term memory. This modern hierarchical concept of memory aligns with Stockhausen’s three-level model of timefields. Each of these three memory processes functions on a different time scale, which Snyder then relates to listening as “three levels of musical experience.”

  1. Event fusion level (echoic memory/early processing)
  2. Melodic and rhythmic level (short-term memory)
  3. Formal structure level (long-term memory)

The initial echoic memory sensations decay in less than a second. This experience is not analyzed at this level, but rather exists as a raw, continuous stream of sensory data. Our friend Dr. Bharucha helped define the specialized groups of neurons that extract some acoustic data from this continuous stream, a process called feature extraction. Such features include pitch, overtone structure, and the presence of frequency slides. These features are then bound together as coherent auditory events. Unlike the continuous barrage of echoic memory, this event information is greatly reduced in volume. Together, feature extraction and perceptual binding constitute perceptual categorization.

Snyder’s memory model of auditory perception.

After extracted features are bound into events, the information is organized into groupings based on feature similarity and temporal proximity. These can activate long-term memories called conceptual categories. Such memories consist of content not usually in conscious awareness, which must be retrieved from the unconscious. This can take place either in a spontaneous way (recognizing and reminding) or as the result of a conscious effort (active recollecting). However, even when recalled, these memories often remain unconscious and instead form a context for a listener’s current awareness state. This is called semiactivated, meaning they’re neurologically active and can affect consciousness (emotional state, expectation, decision-making, etc.) but are not actually the present focus of cognitive awareness.

If information from a long-term memory becomes fully activated, it becomes the focus of conscious awareness, allowing it to persist as a current short-term memory. If not displaced by new information, these can be held for an average of 3-5 seconds in typical humans. After this time window, it must be repeated or rehearsed internally, i.e. consciously kept/brought back into focus, or it will decay back to long-term memory. The more striking the information in question, the more likely it is to more permanently affect this system by creating new long-term memory information.

There is a constant functional interchange between long- and short- term memory. This is the basis of formal structure in music.

Pitch information is extracted in auditory events that take place in less than 50 milliseconds. This races by as part of the data stream which is not processed consciously.

Events spaced farther apart than 63 milliseconds (16 events per second) constitute the aforementioned melodic and rhythmic level of musical experience. Since these occur within the 3-5 second window of short-term memory, we consider separate events on this timescale as a grouped unit occurring in the present. This time window is essentially a snapshot of consciously perceived time. We parse this musical perception level in two dimensions: melodic grouping according to range similarity, rising/falling motion, and reversals of that motion; and rhythmic grouping according to timing and intensity. Perceptual events received within this window are treated by the brain as available all at once.

Event intervals lasting longer than 5 seconds (roughly, depending on the individual and their expertise/training) fall into the category of formal structure. Here, our expectations are manipulated to allow auditory events to fall into unconscious long-term memory. This manipulation activates our limbic reward system, and that feeling is stronger in music we find familiar.
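These three levels can be expressed as a simple classifier over the interval between successive auditory events. The thresholds below are the approximate figures quoted above, not hard neurological constants:

```python
def musical_level(interval_s: float) -> str:
    """Which of the three levels of musical experience an
    inter-event interval falls into."""
    if interval_s < 0.063:      # faster than ~16 events per second
        return "event fusion (echoic memory / early processing)"
    if interval_s <= 5.0:       # inside the short-term memory window
        return "melodic and rhythmic (short-term memory)"
    return "formal structure (long-term memory)"

print(musical_level(0.01))   # cycles of a 100 Hz tone fuse into pitch
print(musical_level(0.5))    # quarter notes at 120 BPM
print(musical_level(30.0))   # a verse-to-chorus transition
```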

This is how musical structures rely upon the three levels of memory and traditional genre expectations to manipulate the dopaminergic cycle of expectation and reward. Genres/styles/traditions achieve this goal by techniques such as symmetric bar structures, removal and return of repetitive catchy (often sung) melodies/hooks/themes/ostinati, and drastic changes in the volume and presence of vocals and instruments.


Obviously, pitched information is vast and malleable and cannot be truly summarized in a series like this. Many, many books have been written on it, though in terms of cognitive structures much of it remains a mystery. I hope this has at least piqued your interest to some extent. Next will come the final section of this series.

Thanks for reading!

On Musical Selection in Research: 3. Rhythm

Did you hear that existentialist percussion piece? It was highly cymballic.

We have previously discussed arrhythmic music in reference to ambient and minimalist compositions. There are other types of music without a pulse as well, such as noise and drone music. These fall either into the realm of outliers or serve the same purpose as minimalism, and can be analyzed similarly to the previously mentioned works.

So, we come to a fruitful realm of neurological study, namely how the brain responds to rhythm. Life itself happens rhythmically, and some sort of rhythmic action or resonance can be found within the basic functions of all life on earth. Advanced beat perception is a direct indicator of higher brain function, in particular how a nervous system relates auditory rhythmic stimuli with locomotion. In other words, very few species can dance.

In scientific parlance, synchronizing motor processes with auditory cues is called entrainment. When you tap your hand along with a metronome, for example, you have entrained to a repetitive auditory stimulus. You can picture the word as literally stepping onto a train from a stationary platform. To remember its meaning initially, I imagined the train cars as a beat clacking by, and stepping onto the train as a way to synchronize with that steady movement.
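One common way researchers model entrainment is linear phase correction: each tap cancels a fraction of the previous tap's timing error relative to the metronome. A toy sketch of that idea (the parameter names and values here are illustrative, not drawn from any particular study):

```python
import random

def simulate_tapping(alpha=0.5, n_taps=50, start_offset=0.2,
                     noise=0.0, seed=0):
    """Asynchronies (tap time minus click time, in seconds) for a
    tapper that corrects a fraction `alpha` of each timing error,
    with optional Gaussian motor noise."""
    rng = random.Random(seed)
    asynchrony = start_offset
    history = [asynchrony]
    for _ in range(n_taps - 1):
        asynchrony = (1 - alpha) * asynchrony + rng.gauss(0, noise)
        history.append(asynchrony)
    return history

taps = simulate_tapping()
print(f"first asynchrony: {taps[0]:.3f} s, final: {abs(taps[-1]):.2e} s")
```

With any 0 < alpha < 1 the simulated tapper homes in on the beat; an interval-timing strategy, by contrast, would keep reproducing the last inter-click duration and lag behind every click.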

When musical entrainment goes too far.

Entrainment with a beat is difficult within the animal kingdom, for reasons still not entirely understood – but there are some well-researched clues. In this excellent and succinct presentation by Drs. Aniruddh Patel and Ola Ozernov-Palchik, they postulate that vocal mimicry is the key to whether an organism is capable of the kind of temporal perception/anticipation necessary for accurate rhythmic entrainment. Their research shows a strong connection between aptitude at reading, speaking, and rhythmic discrimination in children. After all, language itself is temporal, and processing the time-information of speech is inherent to understanding what has been said.

Researchers in the last few years were a little surprised to discover that non-human primates are pretty bad at rhythmic entrainment to music. Even those that are capable of it take a long time to learn the skill, and tend to always be a little late. We think this is due to a lack of neural coupling between motor and auditory regions, meaning these animals can only judge a beat by determining the interval between pulses, instead of predicting/anticipating like humans do.

Also, accurate beat processing may require a vocal mimicking or learning ability. This means that animals born with only innate vocalizations, which do not acquire more throughout their lives by mimicking what they hear, will be unable to accurately entrain movement to music.

The most prominent non-human vocal learners in nature are songbirds, which have quite adorably been proven to be able to spontaneously dance to unfamiliar music at varying beats. An example is Snowball the Dancing Cockatoo, who deservedly went viral a few years ago dancing to the Backstreet Boys. Snowball caught the attention of Dr. Patel, thus beginning a lively ongoing discussion as to what animals can and can’t perform this feat, and why or why not.

When they said “Everybody,” they really meant it.

The list of dancing animals is currently humans, songbirds, elephants, and most recently sea lions. Bonobos are an interesting soft exception, which you can read more about in this article, Beasts That Keep the Beat. There’s also a little more here regarding the relationship between motor anticipation, motor learning, temporal perception, and social engagement, and how these factors relate to beat processing skill in various species.

These points are important, because they represent direct and testable connections showing how rhythmicity and beat processing are achieved in the mammalian brain. The emerging story is that, while musicality has long been associated with neurologic linguistic processes [1][2][3][4], that is by no means the full story. Music activates very deep brain structures that are tied to language only in part.

For now, we can summarize human rhythm perception as relating to vocally-motivated temporal discrimination, motor planning and execution, and social engagement. Each of these elements can exist separately from the others in non-human animal species, but they exist intertwined in a, dare I say it, cognitive dance in the human brain.

Social Drumming

As was mentioned previously in this series, drum circles have a long and well-studied history of helping foster positive communal feelings, and have been shown to help with such issues as at-risk behaviors, depression, anxiety, and addiction. Group drumming and dancing touch virtually every culture on Earth, long predating modern psychology, and were often considered to be explicitly therapeutic or cleansing in nature.

As I’ve also mentioned, while we know communal dancing and hand drumming have therapeutic effects for many individuals and communities, we don’t know how or why that particular ritual differs from, say, group discussion, eating, or other communal events. What is it about the act of rhythmic communal cooperation that works so effectively? I’ve not been able to find a single rigorous controlled study that distinguishes why (or if!) the rhythmic nature of such communal therapies is necessary for why they work. But we can infer some of that relationship by starting with what’s called the auditory brainstem response. This is a phenomenon in which we can directly correlate auditory stimulus with corresponding electric signals in the brainstem. The relationship is electrically direct, and, in my opinion, totally awesome.

Examples of the auditory brainstem response. More examples.

Pay particular attention to the example labeled Piano Melody. Notice that the electric impulses in the brainstem line up perfectly with the steady pulse of the piano notes. Such synchronized impulses mark the beginning of a vast connection to many centers of the brain, as laid out in this paper by our trusty friend, Dr. Thaut.

I can in no way completely map the path of a stimulus from the moment it travels through the cochlear mechanism into the brainstem in a humble blog post such as this. I will, however, mention some key points about the process.

The Auditory Brainstem Response

After synchronized pulses to auditory stimuli occur in the brainstem, electric signals travel on to regions such as the spinal cord, the subcortex and cortex, strongly interacting with the motor system. The brainstem itself regulates most of the basic cyclical body functions, such as cardiac and respiratory processes. It’s also pivotal in maintaining consciousness and regulating the sleep cycle [1].

The Chanda/Levitin paper lists several ways we know the tempo of music affects our physiology. Like so many other studies, they could find no direct neurochemical relationship to “relaxing music,” but they did find a strong correlation between pulse tempo and rhythmic bodily processes.

“These effects are largely mediated by tempo: slow music and musical pauses are associated with a decrease in heart rate, respiration and blood pressure, and faster music with increases in these parameters. This follows given that brainstem neurons tend to fire synchronously with tempo.”

The Neurochemistry of Music, Chanda, Levitin, 2013

Heart rate and breathing are directly related to our emotional state at a given time, in both speed and regularity. Unfortunately, much of the literature related to this is pseudoscience, particularly with regard to treatment, which coincidentally often involves making money. However, some highly regarded researchers including Bessel van der Kolk have observed that PTSD is strongly correlated with heart rate and breathing variability, resilience, and coherence. The Chanda/Levitin paper demonstrates that these metrics are directly regulated by rhythmic musical stimulus. In other words, cultures have been utilizing rhythmic music to regulate physiological, neurological, and neurochemical stress responses for many, many years via group drumming as a form of therapy.

Also worth mentioning here is a type of rhythmic therapy called EMDR, which uses eye movements or knee/thigh tapping to impose repetitive bilateral distraction on neural processes. Again, as is common with such things, the concept has been appropriated by pseudoscience, quackery, and exploitative monetization. But there is a growing body of sound evidence showing its efficacy, although how and why it works is still not fully understood [1][2][3].

EMDR stands for “Eye Movement Desensitization and Reprocessing.” In it, the therapist moves their fingers back and forth in a steady rhythm while the patient follows this with their eyes. This has been shown to reduce the amount of stress a person feels while recounting traumatic memories. Tapping refers to repetitively tapping on the body, usually the knees or thighs, to achieve a similar goal. I spent some time watching various videos of EMDR and tapping sessions, such as this EMDR one from Dr. Jamie Marich and this resource tapping one from Dr. Laurel Parnell (a major proponent of tapping-style EMDR).

I paid a lot of attention to the tempi used in these demonstrations, clocking the rates as falling within a narrow range, usually hovering around 85-95 BPM. Dr. Marich completes one cycle (back-and-forth) at about 88-90 BPM, while Dr. Parnell is slightly faster, at about 90-94 BPM. This makes sense, since using two hands to tap is more efficient than using one oscillating hand. The slowest tempo I could find was about 70 BPM, and none went faster than 100 BPM. So the tempo is always slow, which as we know can help keep things like heart rate and breathing down. I can only imagine it would be stressful if someone started hammering on your knees at 160 BPM and asked you to recall particular memories. Woefully, I can’t find much rigorous literature discussing the different effects various speeds have on this technique.
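For anyone curious how I clocked those rates, the arithmetic is trivially simple. Here’s a minimal sketch, with an illustrative (not measured) cycle duration, treating one full cycle – a left-right-left eye sweep, or a left-plus-right tap pair – as one beat:

```python
# Hypothetical sketch: converting an observed cycle duration to BPM.
# Counting one full back-and-forth cycle as a single beat:
# BPM = 60 / seconds per cycle.

def cycle_to_bpm(cycle_seconds: float) -> float:
    return 60.0 / cycle_seconds

# A cycle of roughly 0.66 s lands around 91 BPM, inside the
# 85-95 BPM range observed in the demonstration videos.
print(round(cycle_to_bpm(0.66)))  # → 91
```

In practice you’d time ten or twenty cycles with a stopwatch and divide, which averages out the timing error of any single cycle.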

The crux of this technique and how it functions is called bilateral stimulation. This means continually activating both hemispheres of the brain using a left-right alternating pattern. Eye movements are a very easy way to activate different regions of the brain, which is (partly/mostly/largely) why they happen as we dream in REM sleep [1][2].

Bilateral tactile sensations have a similar effect to moving the eyes back and forth. If you tap on your left leg with your left hand, and your right leg with your right, alternating at a steady pace, it will also alternate hemisphere activation. There is some evidence that this can effectively reduce the stress response as well. All of which is summarized in this quote from Dr. Robert Stickgold:

“We propose that the repetitive redirecting of attention in EMDR induces a neurobiological state, similar to that of REM sleep, which is optimally configured to support the cortical integration of traumatic memories into general semantic networks.”

EMDR: A putative neurobiological mechanism of action, Stickgold, 2001

The repetitive distraction induces a dreamlike state, is what he’s saying. Remember all that talk about attention earlier in this blog series? Contemporary psychotherapists are developing a verified method of neural engagement that denies the brain the resources it needs to drum up (har har) fearful or anxious thoughts. To reference the loudness paper again, this engagement takes up the neural space that might otherwise promote negative arousal.

There are many, many studies on the ability of repetitive drumming, whether listening, dancing, or playing, to induce a trance-like state [1][2][3] that is similar to hypnosis, all of which falls under temporary, non-pathological dissociation as a form of self-regulation. Which is a hell of a sentence.

“Music emerges as a particularly versatile facilitator of dissociative experience because of its semantic ambiguity, portability, and the variety of ways in which it may mediate perception, so facilitating an altered relationship to self and environment.”

An empirical study of normative dissociation in musical and non-musical everyday life experiences, Herbert, 2011

Now, imagine hand drumming in a circle of friends, colleagues, similarly trained musicians, communally associated peers, or whatever. Hand drumming, by definition, is a self-controlled rhythmic bilateral stimulation in which the player’s hands tactilely engage in a steady alternating pattern. This regulates the player’s nervous system by entrainment, which is also synchronous with the surrounding players and audience. One can visually confirm this by observing head bobbing, side-to-side swaying, foot-tapping, etc. in surrounding participants, all of which are also bilateral movements. I can’t find any good papers specifically about how this confirmation of social synchronization affects the brain, which is surprising. Human social behavior is deeply rooted in our brain [1][2][3], so it seems like such a study would be a… er… no-brainer. But one can at least guess that this process involves temporal anticipation/prediction/discrimination, as well as motor planning and execution, all confirmed visually, aurally, and thus socially, which cannot be achieved by similar but arrhythmic activities.

Bilateral entrainment, Feynman-style.

The point is that we are already using hidden versions of engaging and distracting rhythms during psychotherapy as a way to deal with overwhelming stress and other negative emotions, memories, and tendencies – a contemporary refinement of what we were already doing for millennia. I personally wonder if nervous rhythmic actions that produce sound, like idle/anxious knee or desk tapping for example, fits into this picture somewhere.

All this suffices to say that rhythmicity in music arises from a deeply innate human architecture, the so-called glue that binds together the vast majority of human-organized sound.

Types of Rhythm

I’ll now switch track to discuss an infinitesimally small pool of examples showing the various ways rhythm arises within music. Rhythm in a sonic work can be explicitly percussive – i.e. in drum ensembles – such as in an Ewe, samba, or taiko ensemble. You’ll also find it in solo pieces like this one for Korean jang-go, as well as this contemporary percussion composition in the raga style, Piru Bole. However, rhythm exists in much more than percussion music. Rather, it’s a necessary element in anything with a meter or pulse, whether or not it’s actually percussive in timbre. This makes the effect of rhythm on the brain extremely diverse. Let’s try and characterize some examples using the cake method.

In the above example, we hear a rhythm played via rock drum kit solo. It possesses no pitched or harmonic content. The tempo is very fast, accelerating to around 150 BPM, and contains rapid subdivision and snare rolls. The timbre is obviously percussive, utilizing the full range of the kit, creating a wide spectral texture from the bass drum to the cymbal crashes.

Something rather surprising is that no existing scientific study I can find relates brain activity to drum solo listening. The reason is probably that a drum kit is actually a collection of instruments played simultaneously, making it too difficult to extrapolate meaningful data. Instead, researchers tend to look at rhythmic sounds one at a time. This presents the problem of deciding which texture to use when studying rhythm. There’s probably a big difference between how the brain receives the same beat pattern depending on whether it’s middle C on a piano, a sine tone, a metronome click, a bass drum, or a crash cymbal. Indeed, simply the volume of a kick drum has an effect on bodily entrainment. This is why Garth’s solo up there, as it currently stands, is completely outside the realm of existing research.

Rhythmic content is fast and complex, accelerando to ~150 BPM. No pitched, melodic, tonal, or harmonic content. Standard drumkit timbres with full spectral range. Quite loud. Structure is through-composed freestyle solo, short in overall length. Live improvisatory rock context.

The first ~1:40 of this track is an arpeggiated melodic line generated with a Roland TB-303. Despite having no sounds that are traditionally percussive, the rapid pace, fast-attack wave shapes, and jumps between low and high pitches create a highly rhythmic, dancey feel without the need for synthesized drums. Which, of course, makes the composition all the more effective when the drums do come in, then go out, then come in again several times.

Rhythmic content is a rapid 4-to-the-floor dance beat. Melody is repetitive, fast, and arpeggiated, containing large leaps, making it non- or nearly non-singable. Harmonic content is slow, soft enharmonic pads. All timbres are computer generated/synthesized in the style of acid house techno. Dynamics alternate between quieter sections and loud beat sections, which play out over an unusually long runtime of 16 minutes. Club/raver/electronic dance tradition.

The second movement of Ravel’s sole string quartet composition is a really great example of why music selection is so important when conducting research. The beginning section is highly rhythmic and exciting, featuring constant pizzicato techniques and arpeggiation, just like the previous acid techno track. It also counts as a classical composition and employs so called “relaxing” timbres, namely, strings. When a research paper says it used classical music as a relaxing control, what if this piece snuck its way in? Compare this to Adagio for Strings by Barber. If we don’t know the tempo, timbre, or general feel of each individual piece, both might be characterized as similar in a research environment, although I can guarantee both would activate quite differently in the brain. Indeed, this is a perfect example of why each aspect of a musical selection must be indicated in research, because, generally, this track shares major similarities in some ways with the acid techno track, and other major similarities with Adagio for Strings, but all three pieces would be received quite differently by a listener.

Focusing only on the second movement: Highly rhythmic, achieved via constant pizzicato eighth, sixteenth, and 32nd notes. Highly melodic, with a recurring singable theme. Enharmonic key structure with Euroclassical modulations achieved mostly with arpeggiations between players. Timbres are full-spectrum pizzicato and arco strings as in a traditional quartet. Structure, dynamics, and context are variable, as is consistent with the Impressionist Eurotraditional artistic epoch.

Percussive melodic instruments such as the marimba demonstrated above, as well as the xylophone, gamelan, or vibraphone (to name only a few), are very pure combinations of rhythm and pitch. They function by producing a tone with a relatively pure timbre (i.e. closer to a sine wave) and seem to be used about as often as metronome clicks when studying beat processing. I can’t find a study that looks at the difference between pitched and unpitched perception along this axis. Which is weird.

Rhythmic content is fast and steady, highly melodic, enharmonic Eurotraditional key structure, wooden malletophonic timbre and spectral range, relatively consistent medium volume, Baroque tradition.

And so, with Steve Reich’s Clapping Music, we finally venture into the realm of postminimalism and phasing music. We’ll get slightly more into what that means in a second. But I want to focus on this particular piece because it’s another really good example of the difficulty of music selection without being very specific. First of all, this is a highly rhythmic piece with no vocals or pitched content. However, it also very clearly has a human element. The brain may react completely differently when it can tell a percussive sound comes from a human rather than a manufactured instrument, in the same way it responds to vocal sounds compared to otherwise similar sound content.
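The piece itself is almost algorithmic, which makes it easy to sketch in code. The 12-pulse pattern below is the one commonly cited for Clapping Music (consult a published score before treating it as authoritative); the phasing process, however it’s notated, amounts to one performer rotating the pattern one pulse at a time:

```python
# A sketch of the phasing process underlying Steve Reich's Clapping Music.
# 1 = clap, 0 = rest. The pattern here is the commonly cited 12-pulse
# version, included as an illustration rather than a definitive score.

PATTERN = [1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0]

def rotated(pattern, n):
    """Performer 2's part: the pattern rotated left by n pulses."""
    n %= len(pattern)
    return pattern[n:] + pattern[:n]

# Performer 1 repeats the original pattern throughout; performer 2
# shifts by one pulse at a time until the parts realign after 12 shifts.
for shift in range(13):
    line = "".join("x" if beat else "." for beat in rotated(PATTERN, shift))
    print(f"shift {shift:2d}: {line}")
```

Printing every rotation this way makes the structure of the piece visible at a glance: each line clashes against the fixed part differently, and the last line is identical to the first.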

Instead of breaking it down, let’s instead take a look at how different this piece sounds based on the recording, like this fast-paced ensemble version of Clapping Music. It sounds completely different from this version. Or look at Evelyn Glennie performing it on woodblocks, or these super cool jugglers playing it slowly with bouncy balls, or, my favorite, actress Angie Dickinson performing the piece in the 1967 film Point Blank. They’re all the same piece, but some feature extremely different tempi and timbres, likely provoking strong variations in the neurological listening experience. So, even if a researcher actually mentions the name of the composition (which they don’t always do, at all), unless we know which specific recording of that piece is used, we still can’t rely on any data presented.


We began with Cage’s silent music, then added arrhythmic, meditative harmonies (with some naturalistic textures) to build a minimalist composition. Now, by adding a rhythmic element to minimalist practices, we create the genre (or subcategory) known as postminimalism, also sometimes called phasing music.

In most minimalist music, just like the Pisaro piece or any of the ambient works, slow harmonies progress over a long period of time. In postminimalism, the same process occurs but with an added pulse, and usually short snippets of interlacing melodies that phase in and out. The intro section of Steve Reich‘s Electric Counterpoint is a really perfect example of how this compositional technique arises directly from a true minimalist approach.

Performed beautifully here by Mats Bergström.

The introductory choppy chords eventually move into melodic snippets that fade (or phase) in and out, giving a sensation of movement while retaining the slow-moving harmonic structure valued by minimalism. However, by avoiding any longform, easily singable melodies (short, quick notes with large leaps), we avoid true ostinati/thematic content like in other genres.

As a note, if you read descriptions in the links, you’ll notice that the term postminimalism is sometimes lumped in with minimalism, though in contemporary music practice this is considered inaccurate, or perhaps just plain old lazy.

I am very personally interested in the difference between rhythm as achieved by something like Electric Counterpoint vs. a West African Ewe ensemble or Garth’s drum solo up there. All contain strong rhythmicity, but in the Reich it comes only from pitched/chordal content and the pulse is very clear, i.e. comprised entirely of eighth notes and eighth rests. The others are unpitched and feature far more complexly interlaced syncopation. Would there be some marriage of stimulus in the Reich piece that we wouldn’t see in the unpitched ones? Is the strength of response different by the nature of how rhythmicity is achieved in one or the other? Or the spectral range or waveshape of either? Does the harmonic content neurally allow one to be more repetitive than the other before boredom sets in?

Absolutely none of these questions have been addressed in any research I can find. I think it has a lot to do with the fact that the above questions don’t directly relate to rehabilitation of disease/disorder, and are rather purely theoretical. But still, I can’t imagine that kind of information wouldn’t at least tangentially inform therapeutic musical neuroscience research. Anyway.

Every time we add a new element to our cake, it begins to look more like a cake. Cakeness is subjective, but few would look at Electric Counterpoint and deny that it is a piece of music, which some actually might about the Pisaro piece, and many still do with 4’33”.

Now that the overall point of this exercise is becoming clearer, and just because I love postminimalism, I’ll share some examples of postminimalist recordings below, all of which have similar properties. If you were to characterize each using the elements of music previously described, how would the descriptions differ from work to work? What would be similarly or identically described?

The Chairman Dances, originally from John Adams‘ opera Nixon in China.
The popular Koyaanisqatsi, a visual tone poem collaboration between composer Philip Glass and filmmaker Godfrey Reggio.
Cruel Sister, by Bang on a Can composer Julia Wolfe.


If lazily described, much of the above music might sound identical to one another – “percussive music,” “rhythmic music,” “fast music,” “electronic music,” or “classical music” as examples. However, I hope it’s growing clear just how variable a listener’s experience of each recording might be. Koyaanisqatsi, for example, is often called minimalist because its tempo is so much slower than the other two examples. But this piece contains more musical elements than the Pisaro. It has a strong rhythmic pulse, a wide-ranging spectral character, and far more timbral information, including the game-changing element of human voices. Even without taking individual bias and tastes into consideration (like if someone happened to know and like the film, for example), the two pieces will objectively have a quite different neurological effect on the listener despite a casual description running the risk of lumping them together as the same.

In the next section, we will discuss pitch, melody, and harmony in greater depth, the latter of which will lead to the discussion of structure in music. As this will lead us squarely into the territory of folk and pop music, instead of a parting image, I leave you with a really chill remix of Reich’s Electric Counterpoint by RJD2 from the album Deadringer. Also feel free to enjoy Jonny Greenwood of Radiohead performing another version of the same piece.

Thanks for reading!

On Musical Selection in Research: 2. The Control

Did you hear that minimalist joke? It goes, “Two Irishmen walk past a bar.”

In October 1965, an art journalist named Barbara Rose published an article entitled “ABC Art” in the influential Art In America magazine. In February 2019, I won the award for opening sentence with most instances of the word “art” in it. These two facts are probably unrelated. Probably.

Her title referred to a term, predating minimalism, which described emerging post-WWII American tendencies to go “back to basics” in creative movements. How artistic movements emerge is necessarily vague, but we can trace the roots of this particular tendency as a response to abstract expressionism (Pollock, Rothko, de Kooning, Kline), as the popularization of appropriating a Japanese zen aesthetic, and as an extension of modernist reductivism. It has the distinction of being the first creative movement fueled mostly by American creators, and it influenced not only painting, but film, architecture, poetry, prose, and, of course, music.

More is less, as the saying almost goes, and artists in the fifties and sixties started to formally wonder what they could remove from their art form and still call it their art form – an increase in negative space, so to speak. This makes it a particularly good place to start for our purpose, because they were asking exactly the same question I did in part one: What ingredients can we remove from our cake recipe and still end up baking a cake?

As a disclaimer, chronological borders of musical epochs are fuzzy, and exceptions exist in all things. Examples I give in this series, as links or otherwise, are not meant to be comprehensive. They merely provide a starting reference point for those who might be interested in diving a little deeper.

Lastly, we will not be working through movements in chronological order. That’s been done to death, and it’s too hard to do unless you do it by global region one at a time. European movements will arise organically throughout the discussion, e.g. Baroque, Classical, Romantic, etc. But, mostly, we’ll focus on the last century or so, the first half of which was kind of a genre Wild West which you can read all about on Wikipedia if you’d like. To begin it all, we’ll start with a revolutionary response to the zaniness of the 1900s. It’s the silent elephant in the room, the (arguably) most well-known early experimental musical composition in history: John Cage’s 4’33”.

Silence and Minimalism

Full disclosure: my MFA is from CalArts, a school closely associated with John Cage. Further disclosure: I focused heavily on his works during my first six months of study there, culminating in a celebration of his centenary where I performed some part of all 90 Song Books in a single performance. To my knowledge, I’m the only person in history to have done this, and I mention it only to show how interesting I find his works.

The Wikipedia article about 4’33” is quite thorough if you’d like a more detailed history of it. In this context of minimalism as an actual movement, it’s important to note that, in 1947, when 4’33” was first conceived, the term minimalist composition didn’t exist. Neither did it in 1952, when the composition and first performance by David Tudor actually happened. So, some might question the legitimacy of referring to this piece as minimalist, and that would be a legitimate concern. However, I’d argue that, like the poet Ezra Pound, John Cage happened upon the minimalist approach through his independent study of naturalism and Zen Buddhism, establishing himself as a trendsetter in what was to come. Still, even in 2019, you’ll come across spirited debates arguing whether or not 4’33” counts as a musical composition at all, as opposed to a form of experimental theater.

In all honesty, most music majors are probably rolling their eyes right now, because everything about this piece has already been debated to death. Regardless, I believe it’s an excellent place to start putting the cake classification system to work. So let’s look again at the elements of organized sound as previously stated.

  • Rhythm
  • Pitch
  • Harmony
  • Timbre
  • Loudness
  • Structure
  • Context

The composition for 4’33” has none of these, right? Objectively, that’s incorrect. For starters, many don’t realize the score does indeed have a structure, which looks like this:

John Cage’s 4’33” (In Proportional Notation). Full recreation here.

The lines can be interpreted to mean, “do something here,” and represent a proportion of time within the overall length. Such actions might include flourishing a hand, turning a page, opening or closing the lid, standing up, etc. The total length was a direct reference to the average length of popular radio songs at the time, a genre Cage admittedly detested. He came up with the length of each section using chance procedures, probably something like reading yarrow stalks from the I Ching.
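To give a flavor of what a chance procedure for section lengths might look like – and only the flavor; this toy is emphatically not Cage’s actual I Ching method, and the resulting durations are not those of any real performance – one could pick movement lengths at random under the single constraint that they sum to 4’33”:

```python
# Toy illustration, NOT Cage's actual chance procedure: choose three
# movement lengths at random so that they total 4'33" (273 seconds).
import random

random.seed(433)       # arbitrary seed, for reproducibility only
TOTAL_SECONDS = 273    # 4 minutes 33 seconds

# Two random cut points partition the total into three movements.
cuts = sorted(random.sample(range(1, TOTAL_SECONDS), 2))
movements = [cuts[0], cuts[1] - cuts[0], TOTAL_SECONDS - cuts[1]]
print(movements, "sum =", sum(movements))
```

The point of any such procedure is that the composer fixes the constraints (three sections, a total of 273 seconds) and then surrenders the particulars to chance, which is the spirit, if not the letter, of Cage’s approach.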

This means a structure certainly exists, and the piece obviously has context – a very deliberate one. It was composed with a direct, emphatic championing of Buddhist ideals, which Cage interpreted as giving more credence to quiet contemplation and stillness than did much music at the time. He also didn’t consider the piece to be silent; indeed, its purpose was to show that true silence was impossible, and that naturally occurring sounds provide their own sort of beauty:

In 1951, Cage visited the anechoic chamber at Harvard University. An anechoic chamber is a room designed in such a way that the walls, ceiling and floor absorb all sounds made in the room, rather than reflecting them as echoes. Such a chamber is also externally sound-proofed. Cage entered the chamber expecting to hear silence, but he wrote later, “I heard two sounds, one high and one low. When I described them to the engineer in charge, he informed me that the high one was my nervous system in operation, the low one my blood in circulation.” Cage had gone to a place where he expected total silence, and yet heard sound. “Until I die there will be sounds. And they will continue following my death. One need not fear about the future of music.” The realization as he saw it of the impossibility of silence led to the composition of 4′33″.

John Cage on 4’33”, source.

The piece asks the listener to impose order, via their attention, upon the random ambient sounds of the performance space. It’s a call to perceive the world more closely, a meditative sentiment that many artists found lacking in post-war America. 4’33” became known as a strangely hopeful act of rebellion by the mere audacity of calling itself a musical work. So, to deny its validity as a musical composition is denying the importance of context, a concept few musicians would support.

And, with a bit of a smirk, we can’t ignore that the score technically includes the element of loudness, namely, that it should have none.

The following analysis of David Tudor’s 1952 performance summarizes our conclusion:

“Cage’s piece comprised of four minutes and thirty three seconds of silence by the pianist and his instrument requires engagement of the listener’s perceptual capacities in order to recognize the work as a formal musical composition evolving in real time. Experiencing the piece in silence, the listener’s attention is focused upon the perception of all ambient sounds. This process of attending to the external environmental sonic landscape occurring within a specific time and space yields within the listener a heightened consciousness of perception per se, and in turn, a consciousness of the self as perceiver.”

Biofeedback and the arts: listening as experimental practice, Valdes, Thurtle, 2005

With a notated form and length, and perhaps one of the most influential contextual statements in the last century, we must consider the piece a musical composition. It is not just the simple score that makes the piece. It is the idea, the man who had it, and the time period in which it was conceived and performed that provides staying power. This is essential to note, because 4’33” was by no means the first or only “silent” musical composition. Thus, in this one particular instance, the power of its context overtakes the lack of other musical elements shared by other silent works. Examples of such contextual importance are actually quite common in the art world, everything from Stravinsky’s Rite of Spring, to the punk movement, to Leonardo’s Mona Lisa, to Duchamp’s Fountain.

Or Duchamp’s Mona Lisa, while we’re at it.

I went into this in such detail because 4’33” is one of the best known examples of removing almost everything about sound organization, while still resulting in a viable musical composition. All of the above has been stated before, but my goal is to talk about it with regards to the brain. While it might seem that ambient-sound-focused meditation imposed by a chance-procedures composer is a terrible place to start for neurological music research, it’s actually perfect for a kind of hilarious equivalent. It’s a great conversation about context in music, and it also, quite literally, points out an issue plaguing music research today.

Silence and the Brain

The scientific control is designed to minimize all other variables in an experiment except for the one under scrutiny. For example, the placebo group for a study on new medication is the control group. An empty platter is the control state for a study that asks the question, “What happens when I combine certain ratios of ingredients in a certain way at a certain heat for a certain amount of time?” If a cake ends up on the platter, we’ve got a great and hopefully delicious result.

Science cake. Hooray!

The control in a sound-attentive experiment presents difficulties for the same phenomenon upon which Cage’s 4’33” shines a light. It is ridiculously difficult to separate sound experience from any experience at all.

“Human audition involves the perception of hundreds of thousands of bits of information received each second. Since one doesn’t have ear-lids one continues to hear sound even when asleep.”

Ferrington 1994, Schwartz 1973

We have no eyelid equivalent for our ears. Even as Cage noted in the anechoic chamber, perfect silence is impossible for non-deaf humans. Because we are always hearing, we run into the issue of auditory attention. Take this refutation of the so-called and oft-derided “Mozart effect” by the Chanda/Levitin paper referenced in part one:

“We also note that the studies reviewed here nearly always lack a suitable control for the music condition to match levels of arousal, attentional engagement, mood state modification, or emotional qualities. In other words, a parsimonious null hypothesis would be that any observed effects are not unique to music, but would be obtained with any number of stimuli that provide stimulation along these axes. Indeed, this was found to be the case with the so-called Mozart effect, which purported to show that intelligence increases after listening to music. The ‘control’ condition in the original study was for subjects to do absolutely nothing. The Mozart effect disappears, however, when control participants are given something to do, virtually anything at all.”

The Neurochemistry of Music, Chanda, Levitin, 2013

Another way of describing this effect: Bored brain bad, stimulated brain good. Boredom here is defined as a state of relatively low arousal and dissatisfaction, which is attributed to an inadequately stimulating situation [1].

A dishearteningly large number of studies involving music could probably achieve similar results with any number of nonmusical stimuli – reading a magazine, watching a movie scene with only dialogue, talking with friends, even mindful meditation. The reason for this is that all forms of stimulus/nonstimulus experiences fall within some spectrum of attention. And that relates to a wide-ranging system in the human brain, known to absolutely every conscious person, even children – the cycle of anticipation and reward.

The full fancy term for this is the mesolimbic dopamine reward cycle. Dopamine itself isn’t a “pleasure” neurochemical; rather, it regulates the perpetual release and reuptake of natural serotonin and opioids in the body in a process called dopaminergic modulation. Lots of fascinating research exists about this, which I’ll mostly skip due to length constraints. But here’s what it boils down to:

The disgust humans feel in response to boredom is universal, as is our appetite for relief from it.

I don’t use the term “disgust” lightly. Take this quote from Peter Toohey’s book Boredom: A Lively History:

“Robert Plutchik, writing before this study, maintained that an emotion such as boredom emerges as a derivative or adaptation of this primary emotion of [disease-related] disgust. It serves, in his view, the same adaptive function, though in a milder or more inward-turning manner, as disgust. If disgust protects humans from infection, boredom may protect them from ‘infectious’ social situations: those that are confined, predictable, too samey for one’s sanity. If all of this is true, then it might follow that boredom, like disgust, is good for you – I mean good for your health. Both emotions are evolved responses that protect from ‘disease or harm’.”

Peter Toohey, Boredom: A Lively History, 2011

“Appetite” is also used purposefully. It refers to the technical term describing one part of the human reward system. The appetitive state is the anticipation of a learned pleasing stimulus, leading to future reinforcement of the pleasure response, leading to increased goal-oriented behavior. This is well-studied, with particular regard to the association between anticipation in our most advanced brain structure, the cortex, and our deepest, most primal, ancient, goal-oriented regions, as thoughtfully explained by Dr. Robert Zatorre.

Because of this highly integrated neural mechanism, music is excellent at provoking a strong reward response.

The learned aspect of this system leads to further complications in music research, meaning unfamiliar music tends to provoke weaker responses, while familiar or nostalgic music and genres provoke heightened positive arousal.

Consider that, without proper controls, the concept of stimulus itself is problematic. Imagine a bored test subject who suddenly hears a bit of music. How would the researcher know for certain if a study’s resulting data relates to the type of distraction involved, or simply to the mechanism of distraction itself? And again, if the music is relaxing, would the results relate to the effects of relaxing music… or any type of relaxing activity? The research is shockingly inept at making this seemingly simple distinction.

John Cage’s silent piece calls for the audience to perform an act synonymous with sound-oriented mindful meditation, often used as a method for relaxation. Meditation activates the autonomic nervous system centers for attention and control. It can be used to relieve negative feelings such as stress and anxiety by guided attention, whether that attentiveness is directed at one’s breathing, some outside stimuli, or an abstract concept. I point this out to show that, in a purely contextual composition, with nothing but a bit of visual theatrics and zero notated pitches, rhythms, or harmonies, Cage’s piece might produce a result that looks the same as so-called studies regarding the therapeutic properties of “relaxing music.”

Your next logical question would be, “How much evidence is there to show that relaxing music, in a controlled environment, has a relaxing effect on the human brain?”

The answer may shock you! Click here to learn more!

Just kidding.

The answer is none. Science has never once produced a reliable study that convincingly proves relaxing music, in and of and by itself, reduces stress.

On “Relaxing” Music

In all my reading for this series, one musician-turned-researcher named Michael Thaut consistently stands out with well-considered and controlled studies. In his study examining stress levels of students under experimenter-selected music, subject-selected music, and the absence of music, the resulting data showed that test subjects achieved relaxation responses in all three categories, even during the silence control session:

“Results indicated that… significant relaxation responses were achieved by the subjects in all three experimental conditions. Neither the presence/absence of music nor the choice of music appeared to make a difference in the relaxation response. The MAACL revealed that depression scores did not change under any of the three conditions, while all subjects reduced their hostility scores regardless of condition.”

The Influence of Subject-Selected versus Experimenter-Chosen Music on Affect, Anxiety, and Relaxation, Thaut, Davis, 1993

You’ll find a similar result in this 1997 study as well, and this one from 2000, and so on. This has several troubling implications, including that, if a study uses only silence as its control when discussing music, it might accidentally be studying boredom, attention, and distractibility instead of the effects of music itself.

Which puts silent meditation, and thus Cage’s 4’33”, in a bit of a “more music than music” situation, at least when it comes to relaxation. Soft, quiet, low-key music can serve to fill in that last bit of background neural ambience without actually being distracting itself – sort of one step removed from leaving the fan on to help with sleep. We can refer to Barry Blesser’s loudness article from part one for more about how this works. We can close our eyes to filter visual cues, but we must create some sort of low-level stimulus to filter out aural cues from our environment. Blesser’s concept of “neural spaces” works whether in the extreme or not; i.e., we don’t have to be at a rock concert to observe that white noise effectively helps tune out distracting noises.

We are quite good at tuning various neural spaces to our tastes. While death metal yoga sessions do exist, the concept is considered humorous at best, proving the point by deliberate contrast.

It does look pretty fun, though.

Important: I am in no way asserting that quiet, minimal, or ambient music is identical to white noise or “lesser” music. Indeed, it takes great skill to organize sound in such a minimalistic way while still achieving its intention. I am, however, saying that the relaxing properties of such music exist in the same wheelhouse as randomized background noise, exactly as Cage asserted.

Contemporary minimal music is still meditative and often theatrical in presentation, like this lovely piece by Michael Pisaro:

Asleep, Desert, Choir, Agnes, 2016, by Michael Pisaro with Dog Star Orchestra

This particular work is highly textural. It depends largely on presenting interesting timbres in a slow, steady progression. It has neither clear melodies nor any discernible pulse. Its visual presentation lends itself to the experience, both in the arrangement of the performers and in the interesting amplified items – many of which are naturalistic. The paper score is visual: it has no staff, key, or time signature, and little to no traditional musical notation. Rather, the performers use timepieces to cue certain actions. A vast number of contemporary compositions are written like this. This work does contain a slow harmonic progression, provided mostly by Pisaro’s electric guitar over the course of 30 minutes.

Taking all this information, we can rearrange it more formally using our musical elements:

Rhythm. Arrhythmic, pulseless, slow.

Melody. Nonmelodic, no individual pitch content.

Harmony. Harmony is present in clean electric guitar sound. Moves very slowly. Other sounds occur without relation to an overall harmonic structure.

Timbre. Richly present. While traditional instruments are used, many employ extended techniques or are not played traditionally. Electronics are used only for amplification. No human voice. No computer-generated sounds.

Dynamic. Extremely quiet.

Structure. Present but vague, slow, indiscernible. The recording is over 30 minutes long.

Context. Live acoustic performance. Naturalistic, contemplative, meditative, esoteric, academic.

The above exercise is in no way meant to summarize the experience of listening to the work, nor should it become some nightmarish form of the Pritchard Scale. I merely suggest a discrete method for classifying a musical work in a research context. Such rigor is, in my opinion, severely lacking in modern research publications.
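To make the idea concrete, here is one possible way a researcher might record such a classification as structured data. This is a minimal sketch of my own; the field names and the short descriptions (drawn from the element list above) are illustrative, not an established research schema:

```python
from dataclasses import dataclass

@dataclass
class MusicalElements:
    """One row of element-by-element classification for a single work."""
    rhythm: str
    melody: str
    harmony: str
    timbre: str
    dynamic: str
    structure: str
    context: str

# The Pisaro work, transcribed from the list above.
pisaro = MusicalElements(
    rhythm="arrhythmic, pulseless, slow",
    melody="nonmelodic, no individual pitch content",
    harmony="slow-moving, clean electric guitar",
    timbre="rich; extended techniques, no voice, no synthesis",
    dynamic="extremely quiet",
    structure="present but vague; over 30 minutes",
    context="live, naturalistic, contemplative",
)
```

Stored this way, classifications from many studies could be compared or filtered consistently, which is the whole point of the exercise.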

Whether you’re an experienced musician or an amusic, I’m not trying to say this is how you should listen to music. But if you’re a researcher, I hope you’ll consider this as a way to quickly categorize and classify music used in research environments. It can help achieve rigorously controlled results that are more useful and consistent for later studies. And if you’re not a researcher but enjoy this exercise anyway, I hope it helps enrich your listening experience in some way, just like the previously referenced Feynman’s flower.

Having described the work by Michael Pisaro using the above musical elements, consider the implications of including such a piece in a research environment. A possible effective control might be listening to ambient forest sounds or soft rain. An interesting question is whether it might, for example, make extended meditation easier or more palatable than more generic synth pads, sitar music, etc., especially relative to the tastes and experience of individual test subjects.

In Summary

This won’t be the last time I mention minimalism in this series! The Cage piece is simply used to present a sort of analytical control group, the absolute default, the literal minimum by which we can define the term “music.” I also needed to establish an example of contemporary minimalism with the Pisaro piece, because the next part of this series will introduce rhythm into our musical cake. And that means, among other things, post-minimalism. Hooray!

Minimalist rock. Get it?
Thanks for reading!

On Musical Selection in Research: 1. Baking a Cake

Did you hear about the concert pianist who ate her own sheet music? She thought it was a piece of cake.

I was at a dance club in Santa Monica with my friend Lauren, who is not a musician. She had just sat in on an electronic music production session, which had left a sour taste in her mouth.

“Why didn’t you like it?” I asked, probably between sips of Guinness because this was ten years ago.

“They were making this track that sounded great,” she answered, “but then they got stuck because they ‘couldn’t find the hook’.”

“And that bothered you?”

“Yeah,” she said, frowning. “I don’t like thinking about music like that. It’s so formulaic. You know what I mean?”

A part of me will always agree with her sentiment, no matter how much school I attend. But the producers she was with weren’t in the wrong, either. Making music is easy, except for when it’s hard, and you can never predict what sort of day it’s going to be. Knowing the formula is important for days when the inspiration juice runs low. Lucky for the conversation, our mutual friend DJ Maggie had just started as the opening act, and she had selected an ambient set to get us warmed up for the impending bigger, sicker beats.

“I do know what you mean,” I said to Lauren. “But also… Take this set Maggie’s playing. It feels free and floaty, right? Don’t you think it makes for a great opening set?”

Lauren warmly agreed.

“This music doesn’t have a hook, either,” I said, exactly as clearly and succinctly as I’ve recalled it now. “I don’t believe music always needs a hook. But if you don’t have something repetitive and catchy, it serves a different purpose.”

Lauren frowned again. “But there’s nothing wrong with that.”

“Absolutely not. You’re just making a different type of music. It’s like baking a cake,” I said, coming up with a metaphor I’ve used in countless conversations ever since. “The standard cake recipe is butter, sugar, eggs, and flour. Thing is, you can bake a cake out of all sorts of things that aren’t those four ingredients, but the more you subtract and substitute and add, the less the final product looks and tastes like a cake.”

I call it, “A Waltz in Pound Cake.”

Because, just like cake, there are four standard ingredients of music: rhythm, pitch, harmony, and timbre. There are a few secondary ones as well, such as loudness, context, and structure. In my opinion, composers and songwriters struggle most with structure after learning the basics, which is probably why form tends to be so rigid in a given era. Learning and following such preordained structures bothers a lot of people and composers, for a lot of reasons. In Lauren’s case, it was the worry that reducing music to its elements necessarily takes away the magic, that it will subtract from the experience of music.

Here’s a Richard Feynman quote about that.

“I have a friend who’s an artist and has sometimes taken a view which I don’t agree with very well. You hold up a flower and say, “Look how beautiful it is!” And I’ll agree. And he says, “You see, I as an artist, can see how beautiful this is, but you as a scientist take this all apart and it becomes this dull thing.” And I think he’s kind of nutty… Science knowledge only adds to the excitement, the mystery, and the awe of a flower. It only adds! I don’t understand how it subtracts.”

Feynman, 1981 interview for his book, The Pleasure of Finding Things Out.

Your tastes change. Do bakers enjoy cake less the more they learn? Based on the greatest show of all time, The Great British Bake-Off, I’m going to say absolutely not. Do comedians still laugh at jokes? Yes, they just laugh at fewer of them. Learning the formula changes your tastes, but it only enriches enjoyment. Case in point: Lauren gleefully exclaimed that she got it, that the cake thing helped a lot, and then we danced our butts off, because that’s what friends do at dance clubs in Santa Monica.

In this series of posts, I will first define the building blocks of human-organized sound. This has been done many times before, but I hope to distinguish myself by taking a neurological, evidence-based approach to the subject. With modern advances in EEG, PET, and fMRI scanning, scientists are making great strides in mapping what regions of the brain deal with perceiving and performing various musical tasks. Then, we’ll look at past time periods and their associated musical epochs, and hopefully show which part of the brain composers were targeting during these movements. Some of the styles will include serialism, impressionism, minimalism, ambient, noise, drone, algorithmic, electronic beat, and postminimalism. Maybe also a bit of the ol’ Baroque, Classic, Romantic stuff too, though probably not as much, since that’s been worked to death already.

The purpose of this is mostly to piss off some really close friends who hate analysis, partly for my own sake as I consider the possibility of doctoral studies, and also to determine possible paths toward reproducible results for various neurological music therapy applications.

An inexhaustive list of words often used to define the elements of music:

  1. Pulse, feel, meter, rhythm
  2. Frequency, tone, pitch, melody
  3. Chord, scale, key, harmony
  4. Timbre, texture, color
  5. Miscellany – Context, Loudness, Structure

Because you have to know the rules before you break them! Actually, you can break all sorts of rules without knowing them. It’s the easiest way to break a rule, to be honest. As a final note, I’ll be using Eurotraditional to refer to the standard classical and chamber music repertoire from Europe, as opposed to Western, with an emphatic shoutout to the brilliant Natalie Wynn for her insightful, entertaining, mildly NSFWish ContraPoints video presentation regarding the term.

1. Pulse, Feel, Meter, Rhythm

“It’s interesting. I’ve known quite a few good athletes that can’t begin to play a beat on the drum set. Most team sport is about the smooth fluidity of hand-eye coordination and physical grace, where drumming is much more about splitting all those things up.”

Neil Peart, author and former drummer of Rush

In musical parlance, beat has many definitions. If you ask, “What’s the beat of this song?” you’re asking about the tempo, a.k.a. Beats Per Minute (BPM). If you say, “I love this beat,” you’re referring to the rhythm of around 1 to 4 bars of a song, which will generally include 4-16 individual beats. If you say, “I love Beat It,” you’re referring to a popular Michael Jackson song. See? It’s complicated.

To avoid this, the basic unit of a tempo will be called a pulse in this post. For example, the song “Beat It” has a pretty fast pulse at 138 BPM.

Feel is the basic unit of rhythm, composed entirely of two simple numbers: 2 and 3, a.k.a. duple and triple meters. “Beat It” is in duple, because the strong-weak cycle lasts two eighth notes instead of three. Making matters worse, and by that I mean way better, you can play both at the same time. Every hip hop song on the planet seems to do this right now, using vocal triplet structures to add rhythmic syncopation over the classic 4/4 meter. OutKast uses triple meters in their beats fairly often, in the decades-long tradition of Southern Swing, which is in 6/8 or 12/8 meter.

And it sounds GREAT.

Meter is the hierarchical grouping of pulses based on feel, such as the most common one in Eurotraditional music, the trusty, endlessly symmetric duple 4/4. That’s four quarter note downbeats to a measure, or bar. This definition also covers Indonesian last-beats, but doesn’t quite cover West African bell patterns. The word pattern itself best describes that music, which will have a pulse but little or no true meter.

A bar or measure can best be defined as full units of beat patterns, for example a couple bars that go something like, kick SNARE kick CLAP kick SNARE hat hat kickCLAP. That was “Beat It,” in case that wasn’t clear.

Finally, for defining the umbrella term rhythm, I will use that of CalArts composer, conductor, and music theory professor Marc Lowenstein:

“Rhythm is best described as something like ‘the relative times between changes in sound.’ It might sound dorky, but the precision is important. It is not pulse dependent and is not hierarchically important. All three of those terms do inform each other, but they are distinct.”

Marc Lowenstein

In other words, one of the ways humans organize sound is via time, and rhythm is the perception of those changes relative to each other. Thus, rhythmicity separates music from most not-music for human beings. There are no records whatsoever of folk music without a pulse arranged by regular strong and weak beats. Even modern attempts at pulseless music generally still have an implied one somewhere, though whether it’s perceived as one depends on a lot of things.

So it comes as no surprise that playing, listening to, and especially dancing to drums lights up more areas of the brain than any of the other three major music ingredients. In a rock recording studio, the drums get more attention and microphones than any other instrument, and roughly half of the available decibels in a standard mix go toward the drums, while the rest of the band gets shoehorned into the remaining half.

Our brains don’t just like rhythm. They are rhythmic. Brainwaves are real, they can be trained, and they are constantly manipulated by all sorts of practices, from the new age to the mundane. Most of our motor functions happen in time to rhythmic brain pulses, an incredible phenomenon called motor resonance. Get this: an adult human walks on average at a pace of about 2 hertz, or two steps per second. Multiply 2 Hz by 60 seconds and you get a pulse of 120 beats per minute, which is such a common musical tempo that most audio editing software has it set as the default (Ableton Live, for example). It’s also the most common march tempo. You’ll find some ratio of 2 hertz in impromptu finger tapping and a long list of routine motor actions (Leman 2016, shoutouts to JS Matthis 2013). Neurologically, it’s the default tempo because it’s roughly in the middle of the spectrum that humans discern as rhythmic vs. arrhythmic, i.e. too fast or too slow to perceive as a beat. Too slow and it becomes a bar, too fast and it becomes a tone.
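The 2 Hz-to-120 BPM conversion above is just unit arithmetic, which can be sanity-checked in a couple of lines of Python:

```python
# A 2 Hz walking pace, i.e. two steps per second, corresponds
# to 120 beats per minute: multiply beats-per-second by 60.
step_rate_hz = 2.0
default_tempo_bpm = step_rate_hz * 60  # per second -> per minute
print(default_tempo_bpm)  # 120.0
```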

For a long time, humans couldn’t imagine music without pulse that had a duple or triple feel arranged in a predictable pattern. It’s still pretty hard to pull off, for the sole reason that humans aggressively impose patterns on literally everything. But contemporary composers will attempt works that remove rhythm or make it unpredictable, with varying levels of success.

The reason it’s so hard to do away with rhythm is movement. When the average person’s brain hears a strong backbeat, such as in “Beat It,” the motor control regions of the brain are activated. It feels like moving even when the listener remains still. When listening to steadily pulsing rhythms, neurons in the brainstem (and elsewhere) start firing in time to the beat (Thaut, 2006, 2014). In short, the brain dances, and synchronizing our bodies with it is a very pleasant practice.

The human brain, constantly.

A few examples of music that might ignore or de-emphasize rhythmicity are drone, ambient, noise, and minimalism. You’ll also find music that has rhythmic elements but no meter, essentially changing at a pace that prevents the perception of an established pattern or pulse. That will often involve electronics, especially algorithmic music or modular synthesis, though improvisational styles attempt to achieve this as well. As previously stated, West African drumming often has multilayered patterns, but no true meter – only a feel in duple or triple time. There are endless examples of solo music in which the performer employs wildly variable tempi to manipulate the tension-release cycle. This is often called expressive or subjective timing, examples of which are found in this expertly visualized score scroller for Bach’s Partita No. 2 for Solo Violin, in this incredible oud solo from Farid El-Atrache, this performance by Emmanuel Pahud of Debussy’s Syrinx, or this performance by Tom Jenkinson on solo bass.

Later, we’ll go more in depth regarding exceptions and substitutions, but first, we’ll need to define the remainder of our terms. In other words, we need to learn how to bake a standard cake before we venture into vegan gluten-free low-carb quantum cake territory.

2. Frequency, Tone, Pitch, Melody

“Music creates order out of chaos: for rhythm imposes unanimity upon the divergent, melody imposes continuity upon the disjointed, and harmony imposes compatibility upon the incongruous.”

Yehudi Menuhin, violinist and conductor, Symboles Dans la Vie Et Dans L’art, Leit, Whalley, 1987

In the beginning, there was a sine wave, and it looked like this:

*Waves* Hiiii!

All you need to know is this: When the bump in the red line is up, the particles in a medium are dense, all smushed together. When the bump points down, they’re sparse and spread apart. Do this enough times, and you build a universe.

The sine wave is simple, beautiful, perfect math. Rotate it and you can draw circles or spirals. It looks cool in 3D, too. Pass it through the electromagnetic field, and you get pretty colors and X-rays and stuff, also known as light. Stack them on top of each other and you get neat shapes like triangles, sawtooths, and squares!

Maybe the coolest GIF about Fourier Transforms ever.
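The “stacking” idea is easy to demonstrate numerically. A square wave, for instance, is the sum of the odd-numbered sine harmonics at amplitudes 1/n (its Fourier series). A minimal toy sketch, not taken from the GIF:

```python
import math

def square_approx(t, n_harmonics=50):
    """Approximate a 1 Hz square wave by summing odd sine harmonics.

    Each odd harmonic (2k - 1) contributes at amplitude 1/(2k - 1);
    the 4/pi factor scales the wave to swing between -1 and +1.
    """
    return (4 / math.pi) * sum(
        math.sin(2 * math.pi * (2 * k - 1) * t) / (2 * k - 1)
        for k in range(1, n_harmonics + 1)
    )

# At the top of the square (t = 0.25 s) the sum hovers near +1;
# adding more harmonics sharpens the corners of the shape.
print(square_approx(0.25, n_harmonics=500))
```

The same trick with every harmonic (odd and even) at amplitude 1/n yields a sawtooth instead.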

And if you pass some sine waves through air or wood or brass or taut animal skins or whatever, you get sounds.

Just like we can’t see X-rays or radio waves, we can’t hear every frequency. Which ones you can hear (if any) depends on genetics, hearing damage, and age, since as you get older, you hear fewer high-pitched tones. But, generally, you start out hearing from around 20 Hz to 20,000 Hz, which, completely coincidentally, corresponds to wavelengths of about 17 meters down to 17 millimeters. All of which leads us to the question, “If a tree falls in the woods, and it only emits frequencies below 15 Hz, does it make a sound?”

The trees have answered.
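Those wavelength figures come straight from wavelength = speed of sound / frequency, assuming sound travels at roughly 343 m/s in room-temperature air (the exact speed varies with temperature):

```python
# Wavelengths of the limits of human hearing, assuming ~343 m/s
# for the speed of sound in air at room temperature.
SPEED_OF_SOUND_M_S = 343.0

def wavelength_m(freq_hz: float) -> float:
    return SPEED_OF_SOUND_M_S / freq_hz

print(wavelength_m(20))      # ~17 meters
print(wavelength_m(20_000))  # ~0.017 meters, i.e. about 17 millimeters
```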

To put it simply, sine waves are everything. Detecting them is how we perceive the world around us, and also how we respond to said stimuli. This exchange is at the heart of musical interaction. We essentially turned the framework of existence into a really cute game by generating tones in rapid succession at mathematically related frequencies. Usually, though, we just call them melodies.

We don’t know what came first, singing or talking, but we know they went hand in hand during hominid evolution. The first instruments were probably drums or xylophones lost to time, but the earliest evidence of artistic expression is flutes made from bones and horns. Flutes, incidentally, make the closest approximation to a clean sine wave of all the symphonic instruments. Such instruments predate any known early visual art by thousands of years (Wallin et al. 2001).

Getting back to what a tone, or note, is, below you’ll find a picture of the Overtone Series (or Harmonic Series). You can imagine this picture as a series of combined mathematic concepts, but you can also imagine it as a series of pictures of vibrating guitar strings, violin strings, etc.

The Harmonic Overtone Series

The 0-1 is the fundamental tone, i.e. the lowest pitched tone, i.e. a representation of a perfectly vibrating string without any nodes held down. The successive ratios refer to how the soundwave is allowed to vibrate, in other words, math, which I won’t get into here. But if you’d like to know the basics, I cannot recommend the following 10-minute video enough:

The harmonic series does a lot more than help us arrive at pitch, melody, and scale; it’s also a model for how we perceive harmonies and timbres. In a massively oversimplified way: the more complex the ratio of a partial in the series, the more dissonant we perceive the corresponding tone in a scale or chord.
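Numerically, the overtone series is nothing more than integer multiples of a fundamental frequency. A quick sketch (the 110 Hz fundamental, the A below middle C’s A, is my own choice for illustration):

```python
# The first eight partials of the overtone series: each partial n
# vibrates at n times the fundamental frequency.
fundamental_hz = 110.0
partials = [fundamental_hz * n for n in range(1, 9)]
print(partials)
# [110.0, 220.0, 330.0, 440.0, 550.0, 660.0, 770.0, 880.0]
```

Note the simple ratios hiding in the list: partial 2 is an octave (2:1), partial 3 an octave plus a fifth (3:2 above partial 2), and so on, which is where the consonance-to-dissonance gradient comes from.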

The simplest partials, i.e. the first few in the series, make up the pentatonic, or five-note, scale, which is practically second nature to human musical perception. If an untrained musician is asked to improvise a melody, it will usually be pentatonic if it’s in tune. Almost all humans internalize pentatonic melodies quite easily.

We’re coded, at least as infants, to prefer higher-register melodies (Trainor 1998). Voices perceived as a different gender from the listener’s garner more attention (Junger 2013). There’s not a lot of existing research about voice and gender and register, sadly/weirdly. But what we have shows that melodies must generally exist in the human vocal range, thus helping humanize a passage of music, essentially tying together all the other elements.

Singers, especially untrained ones, often inadvertently damage their voices over time (or sometimes on purpose) just to achieve a signature “sound.” This is because vocal timbre is so important to us on many levels of social interaction. Song creators capitalize on melodic attention by writing the catchiest melody they can think of and playing it repeatedly throughout a song. Quite often, they’ll also try to get a woman to sing their melody, often in a sexually enticing way to further provoke a memorable reaction in the listener. These methods of neural manipulation can be quite lucrative, which is part of the formula against which my friend Lauren initially rebelled. Sexualization in popular music is an ongoing saga fraught with both empowerment and exploitation that is the subject of many books and papers, but Women and Popular Music by Sheila Whiteley may be a good place to start.

Ravel’s Boléro notwithstanding.

Likely because of its ubiquity, examples of music attempting to avoid folky melodies are vast – noise music, ambient drones or chords, and minimalism, for example. Serialist works almost always attempt to subvert melody by using a rigid compositional process that produces difficult-to-sing melodies, to the great frustration and/or excitement of contemporary voice students everywhere. The reality, though, is that most students can sing by their sophomore year what seasoned professionals used to call impossible. This is a kind of human progress that could be called something like expertise normalization, but apparently isn’t. If you know the technical term for this, please let me know, because as a concept it’s apparently very hard to Google.

Anyway, to sum up, humans are annoyingly adept at singing along to pretty much anything.

3. Chord, Scale, Key, Harmony

“How sweet the moonlight sleeps upon this bank!
Here we will sit, and let the sounds of music
Creep in our ears; soft stillness, and the night
Become the touches of sweet harmony.”

William Shakespeare, The Merchant of Venice

Harmony is, objectively, the hardest element to define out of these four ingredients, because… Well, here’s a pretty good attempt from William Malm’s book Music Cultures of the Pacific, the Near East, and Asia:

“Harmony considers the process by which the composition of individual sounds, or superpositions of sounds, is analyzed by hearing. Usually, this means simultaneously occurring frequencies, pitches (tones, notes), or chords.”

Almost anyone can recall a melody or make one up on the spot. If you ask someone to recreate their favorite multiple-bar drum solo, many will fail, but some, at least, will succeed. However, asking someone to recite their favorite harmonic resolution means they’ll actually need to sing a melody implying a harmony (unless you’re Anna-Maria Hefele singing with overtones, but that’s another story).

In other words, “harmony” can take on an almost mystical quality. We can help standardize the definition with a distinction: harmony is a perceived simultaneous combination of pitches, while an implied harmony is the type that arises from an unaccompanied melody or similar. Together, these terms summarize our perception or understanding of an arrangement of harmonic consonance and dissonance, i.e. the tension and release cycle of notes and chords.

Harmonics can refer to all sorts of things, including suppressing the fundamental note to get a higher-pitched tone from an instrument, or it can generally mean multiple waveforms/oscillations/whatever occurring simultaneously in a scientific context. The musical meaning predates the scientific one.

When defined as a basic musical element, harmony isn’t specifically the bassline or top notes, and it’s not the chords played on keys by the left hand or strummed on an acoustic guitar. That’s the accompaniment implying a harmonic progression. If I sing an unaccompanied melody, no one’s playing any chords, but the tune can still be analyzed harmonically. An organ fugue rarely contains big block chords, but it will still move through harmonic regions that arise from accidentals, the interlocking melodies, and pedal tones. So that’s why chords are not the same thing as harmony.

A key is a mode or scale with one pitch made hierarchically the most-stable by an asymmetrical, contextual process. In Eurotraditional music, it is a shorthand indicating what notes, accidentals, and chords to play. The “most-stable” pitch often refers to the tonic, such as the note C in a C major scale. Some alternate forms of organizing tonal content are ajnas, maqams, and dastgah from the Middle East, thaat in Indian raga, pathet from Indonesian gamelan, and so on. Scale, key, and mode are often used interchangeably in English, but this can get murky when you consider international interpretations of musical pitch organization or performance environments. That said, here are iterative definitions for each:

  • Mode – a limited collection of discrete pitches.
  • Scale – an ordered mode.
  • Key – a mode or scale with a hierarchically most-stable pitch.
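The three iterative definitions above can even be sketched as plain data. Representing pitches as pitch-class numbers (0 = C, 2 = D, and so on) is my own convenience here, not part of the definitions:

```python
# Mode: a limited collection of discrete pitches (here, C major's notes).
mode = {0, 2, 4, 5, 7, 9, 11}

# Scale: an ordered mode.
scale = sorted(mode)

# Key: a mode or scale with one hierarchically most-stable pitch,
# the tonic (0 = C, giving the key of C major).
key = {"scale": scale, "tonic": 0}
print(key)
```

The point of the sketch is the iteration itself: each definition adds exactly one property (ordering, then a privileged pitch) to the previous one.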

Plenty of compositions lack a traditional harmonic structure, but the cycle of tension and release is so inherent to humans that it’s quite difficult to avoid entirely. For example, many works don’t use easily analyzed chords, but tone clusters – like those created when mashing your hands on a piano – still contain harmonic information, even if it’s just that one hand’s cluster is generally higher pitched than the other. That might not sound so great, though, so composers have spent a lot of time in the last century creating complex organized frameworks that defy classic tonal analysis. It’s almost more common in academic music to reject key signatures, a practice referred to with the umbrella term atonality. But even those pieces often contain moments upon which the listener might impose a harmonically interpreted tension-release cycle.

In short, harmony – implied or not – just happens, whether you like it or not.

4. Timbre, texture, color

“I sometimes wish taste wasn’t ever an issue, and the sounds of instruments or synths could be judged solely on their colour and timbre. Judged by what it did to your ears, rather than what its historical use reminds you of.”

Jonny Greenwood, composer, guitarist for Radiohead

Firstly, it rhymes with amber. Secondly, the story of music performance is timbre. Thirdly, the story of recorded music is even more timbre.

Here were the available musical timbres for a few million years:

  • Blowing through a wooden tube
  • Blowing through a wooden tube over a wooden reed
  • Blowing through a metal tube
  • Sticking giant metal tubes in a wall and blowing through those
  • Blowing through a meat tube over a meat reed, also called singing
  • Plucking taut strings over some gourd-shaped cavity
  • Sawing taut strings over more taut strings over a gourd-shaped cavity
  • Making taut animal skins resonate over some gourd-shaped cavity
  • Hitting animal skins with sticks
  • Hitting sticks with other, more different sticks
  • And, of course, this thing:
I mean, there are always exceptions.

“Instruments possess different timbres” is a fancy way of saying a trumpet doesn’t sound like a kettle drum. Different resonator shape and material means different partials in the overtone series get emphasized. That’s largely where the signature sound of a class of musical instruments comes from.

Timbre isn’t the difference between high and low notes; that’s pitch. Timbre is the quality of the sound. If I sing “ah” with my mouth wide open and a lot of breath support, that’s a very round, smooth timbre. If I sing through my nose and do my best Munchkins of Oz impression, that’s a sharp, nasal timbre. The “ah” waveform is going to look more like a sine wave, while the sharper timbre is literally a sharper shape, as shown in the comparison of three instruments’ soundwaves linked above. Er, and also linked just now.

This is a good place to mention that, in the neural realm of sound, the human vocal timbre is quite special. We have specific regions and processes dedicated to identifying a speaker or singer, including their gender, age, emotional state, status, personality, arousal, and attractiveness [1][2][3][4][5][6][7]. This builds directly upon the discussion about melody and pitch perception. In other words, the timbre of the singer’s voice can greatly influence how the pitch content is received, providing context via color and personality.

When it comes to musical timbre, everything changed when the Fire Nation attacked, and by “the Fire Nation” I mean electricity. Electric amplification and recording allowed humans to arrange sound in ways never before possible. Can you guess what a standard electric current looks like? It starts in “S” and ends in “ine wave.”

Electricity travels via waveform, just like music. Marrying electricity and sound was so obvious, it’s a little weird it took us a few eons to figure it out. But it allowed us to invent multitudes of new sounds, not to mention effects such as amplification, distortion, compression, equalization, chorus, phasers, flangers, delay, and, eventually, synthesizers. This last example allows us to shape the electricity and see what sounds come out, which is the opposite of the other, SUPER BORING way of turning a tree into a tube and blowing through it and hoping it sounds nice.

Digital synthesis (as opposed to analog synthesis) came with the advent of computers, and is a pure version of sound modeling where things like voltage and the size of the vacuum tube don’t matter. You just punch in numbers and they come out in a very, very rapid stream, mimicking the shape that we already had in much higher fidelity in vinyl grooves. Just kidding, 320 kbps mp3s sound amazing, fight me.
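That “very, very rapid stream” of numbers is quite literal. Here’s a minimal sketch, assuming CD-standard 16-bit PCM at 44,100 samples per second; the helper name is my own invention:

```python
import math
import struct

SAMPLE_RATE = 44100  # samples per second, the CD standard

def sine_samples_16bit(freq, duration):
    """Yield a sine wave as a stream of signed 16-bit integers,
    the raw number stream a digital synth or WAV file is made of."""
    for n in range(int(SAMPLE_RATE * duration)):
        t = n / SAMPLE_RATE
        # Scale the [-1, 1] sine range to the 16-bit integer range.
        yield int(32767 * math.sin(2 * math.pi * freq * t))

# One second of A440 packed into raw little-endian PCM bytes.
pcm = struct.pack('<' + 'h' * SAMPLE_RATE,
                  *sine_samples_16bit(440.0, 1.0))
```

Prepend a standard WAV header to those bytes and you have a playable file: no voltage, no vacuum tubes, just numbers in a rapid stream.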

All this computing power, and we still can’t synthesize anywhere near the beauty of discarded old salmon traps in Alaska.

Timbre possibilities are now so numerous as to be near endless. While it’s difficult to make music without harmony, it can sort of be done, for example, this seminal work entitled ten hours of white static. That was sarcasm. Or was it?

It is semantically impossible to experience music without its timbre because we define it as an inherent property of sound. The closest you can come is constructing the most boring timbre possible, like the sine wave, or in a sense using just intonation techniques to prevent “harmonic beating”, although this simply produces a straight-toned timbre instead. When we (eventually, I promise) move on to Part Two, we will look more closely at new timbres that arose recently and how they affect the modern brain.
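The “harmonic beating” just mentioned has simple arithmetic behind it: two nearly coincident frequencies interfere at a rate equal to their difference. A minimal sketch of why just intonation avoids it, comparing an equal-tempered major third against the just 5:4 ratio:

```python
def beat_rate(f1, f2):
    """Beats per second heard between two simultaneous tones."""
    return abs(f1 - f2)

root = 220.0                        # A3
just_third = root * 5 / 4           # just-intonation major third, 275 Hz
equal_third = root * 2 ** (4 / 12)  # equal-tempered major third, ~277.18 Hz

# The 4th harmonic of the third meets the 5th harmonic of the root.
# In just intonation they coincide exactly, so there is no beating:
print(beat_rate(just_third * 4, root * 5))   # 0.0
# In equal temperament they miss, producing audible beating:
print(beat_rate(equal_third * 4, root * 5))  # roughly 8.7 Hz
```

Lining up those whole-number ratios is exactly what “preventing harmonic beating” means, and also why the result sounds so straight-toned.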

5. Context, Loudness, Structure

The context of music is something contemporary composers often obsess over. It refers to the associations experienced by the listener alongside the auditory content. Instrumentation, rhythms, harmonies, and song structure vary wildly by culture, meaning that if you didn’t grow up with a genre, it’s harder to predict, which can feel unpleasant. A listener’s background will strongly affect their perception of emotion in a musical stimulus. Music is an especially communal experience, making it strongly associated with the average person’s sense of identity. Finally, lyrics in one’s native language can relate to personal experience or emotion, with the added bonus that they’ll be in the human vocal range, making lyrics highly desirable neurological fodder.

Loudness, volume, or dynamic is one of the least understood neurological elements of music today. We have it mapped out pretty well, but still have no idea why humans like some loud noises, but not others. We don’t know why increasing a song’s volume sounds better until it doesn’t, causing pain instead. We don’t know why we can stand louder volumes at a rock concert than we can at home or in the car. We don’t know why swelling or subsiding volume affects us so strongly that it can induce chills (especially when coupled with textural changes and more surprising harmonies). There’s one study I know of that attempts to explain our taste for certain loud experiences, essentially using volume to distract from other stimuli the same way a psychotropic drug might.

A graph that goes from “Huh?” to “Ow.”

The reality is, volume is still a mysterious beast. Suffice it to say, we do tend to turn up music we like, even when doing so makes no difference in discerning the song’s details – or has any other practical effect besides making more neurons fire. We do this even if it hurts a little, or if we know we’re damaging our ears, because in the moment it feels good. The so-called “folk science” of it is popularly stated, but any evidence supporting a singular theory is sketchy or speculative at best. It’s also just tough to research, since testing loud volumes on your subjects tends to reduce the number of subjects who can hear your subsequent lab tests.

Song structure, or form, is determined by the combination of meter and harmonic progression, and to a lesser extent the regular return of thematic hooks, or ostinati when being fancy. Classical and chamber music has a wide variety of forms, while many contemporary works strive to be formless, so-called “through-composed.” At the same time, music that heavily rejects things like melody or rhythm might rely on structure to convey a sense of musicality. This practice of compensating for the de-emphasis of one ingredient by emphasizing another will be at the heart of subsequent installments of this series.

In Eurotraditional classical music, forms have names like rondo, sonata, concerto, aria, fugue, and so forth. Indian classical ragas have their own forms, just like Middle Eastern maqams, and so on. Plenty of international music is structureless without needing any meta-analysis, like some Ghanaian Ewe music or some chants from North American indigenous peoples. Structure is complicated, it grows fuzzy very quickly, and many composers struggle with it all their lives.

Pop and folk music have a rigid form known by everyone: Verse, chorus, repeat. Here are the actual building blocks of nearly every pop song in existence:

  • Intro
  • Verse
  • Pre-chorus
  • Chorus
  • (Refrain)
  • Bridge
  • (Breakdown)
  • Outro

Mix those around, repeat them however many times you want, skip some, skip most, it barely matters. As long as it has some sort of verse-chorus-verse structure, it’s probably going to get called a “song.” To be quite frank, a lot of orchestral or symphonic music uses these concepts, too, just better hidden, or in the case of arias not at all. Here’s a fun way to visualize various song structures.

Refrain is something I personally differentiate from chorus, which is based on how I learned to analyze folk music structure. The chorus can be something like a catchy short verse that repeats several times in the song. It can have a different rhyme structure from the rest of the song to help differentiate it, for example. A refrain, however, is a single repeated line or phrase. For example, the “Beat It” chorus contains: “No one wants to be defeated / Showin’ how funky and strong is your fight / It doesn’t matter who’s wrong or right,” while the repeated refrain is, “Just beat it, beat it” over and over again.

A breakdown almost always comes at the end of the bridge, but it’s common enough in current popular music structures that it gets a bit of special recognition here. It’s also related to breaks, like a drum solo between song sections. You’ll see such structural terms all over the lyric archive site, genius.com. These terms all relate to folk music, too. A pop song is a modern folk song, meaning its most important purpose is to achieve catchy memorability or, in modern parlance, “profit.”

Song length is a conscious choice dependent on how a piece is structurally arranged. It relies heavily on the medium. A machine or computer can perform a single piece for years, for example, while humans are constrained by such factors as physical limitations and how much they like to party. It’s also interesting to note that, as the album format dies and streaming takes over, pop song lengths are growing shorter, due mostly to Spotify’s monetization policy.

Through-composed music, in which no sections repeat, essentially doesn’t exist in pop. One could argue that if a piece is through-composed, it doesn’t fit the definition of a pop song, even if it happens to be by a composer generally considered a pop artist.

We’ll find that form is incredibly important, but exists on a larger scale than the brain generally pays attention to viscerally. Thus, it performs an interesting dance with the listener’s anticipation and reward cycle.

Science and the Musical Brain

Each element of music interacts with various regions of the brain, some in tandem, some separately. This in turn activates neurochemical processes closely tied to the anticipation and reward processes, regulated in large part by dopamine, opioids, and norepinephrine. We are learning that such neurochemicals serve different purposes depending on how, where, and in what amounts they are served during cognition (Chanda & Levitin, 2013). Such studies are still in their infancy, with new revelations occurring all the time. Mapping the interaction between the elements of music, the regions of the brain activated, and the neurochemicals involved will be the major focus of the remainder of this series.

There are quite a lot of known issues with modern music science. Music experience is highly subjective, and strongly dependent on the tastes, background, and training of the listener. The paper I cited above has this to say about it:

“The lack of standardized methods for musical stimulus selection is a common drawback in the studies we reviewed and a likely contributor to inconsistencies across studies. Music is multidimensional and researchers have categorized it by its arousal properties (relaxing/calming vs stimulating), emotional quality (happy, sad, peaceful) and structural features (e.g., tempo, tonality, pitch range, timbre, rhythmic structure). The vast majority of studies use music selected by individual experimenters based on subjective criteria, such as ‘relaxing’, ‘stimulating’, or ‘pleasant/unpleasant’, and methodological details are rarely provided regarding how such a determination was made. Furthermore, there is often a failure to establish whether the experimenters’ own subjective judgment of these musical stimuli are in accord with participants’ judgments.”

— “The Neurochemistry of Music,” Chanda & Levitin, 2013

As an example, one study might consider “relaxing music” to be New Age ambient music composed on a synthesizer. A different study might feature something atmospheric by Nusrat Fateh Ali Khan. Still another might use Samuel Barber’s Adagio for Strings. But if the listener dislikes New Age or Qawwali music, that will negatively affect results. Also, one “New Age” track or Khan track will sound quite different from the next, adding more uncertainty to results. A study may not even report whether its stimulus has no vocals, like that first New Age track, or makes extensive use of vocalizations, like the Gabriel/Khan piece. And, finally, a listener who has seen the film Platoon will have a wildly different (probably depressive) reaction to Adagio for Strings than someone who considers it just a pleasant classical work for strings.

Spoiler alert, people die in a war movie.

Unfortunately, more studies than I care to count will simply say, “relaxing music,” giving no indication of what that actually means, or they may describe the music too vaguely. Same goes for “stimulating music,” which could be anything from hair metal to dubstep. This all adds up to mean scanning technology such as PET, EEG, or fMRI will read very different brain activations from one study to the next, despite the studies claiming similar situations.

Take another example, the ever-popular drum circle. It’s used in many forms of music therapy with proven positive results [1][2][3][4]. However, no current study exists (that I could find) that looks at the differences between music therapy featuring group drumming and group theater therapy. Both have similar positive effects on stress levels, negative thought loops, and addictive or destructive behavior. But a classic drum therapy session has a leader, same as theater therapy, who is guiding everyone in their actions, cracking jokes, generally easing everyone’s mood and getting people working together. So, what’s the difference between drumming and theater? Is there any such difference? Or does it simply need to be any positive group social activity? As of right now, we simply don’t know.

Finally, according to Chanda and Levitin, most studies struggle to find a meaningful and/or exhaustive control for their experiment. For example, take a new form of therapy called Guided Imagery and Music (GIM), which reliably reduces stress chemicals. The main study regarding this shows that the treatment is more effective than when the images are shown in silence. However, the study fails to test with music alone, or with other forms of stimuli. What if their results were simply the difference between boredom and non-boredom? If subjects had sat in silence, but been asked to walk in a slow circle, would that have had similar results? What can music do for the brain that other forms of stimuli cannot?

This is where we’re at with music research in 2019, and I believe it comes largely from a lack of nuance when approaching musical stimuli, and from over-emphasizing what music can and can’t do. Perhaps this happens due to lack of training on the researchers’ part, and possibly due to the sort of pseudo-mysticism often used to describe the musical experience. That lack of specificity is what I intend to address in this series.

The Shoulders of Giant Steps

While many of the papers sourced in this and subsequent parts come from a variety of sources, special mention must go to Michael H. Thaut and Daniel Levitin, two major proponents for rigorous, evidence-based neurological music research. Without their efforts, this humble article series would not be possible. I also plan to share quite a lot of music, mostly through YouTube or Spotify. If you hear something you like, please consider purchasing their output or subscribing to their channels.

In Summary

How humans organize sound will inform our method of classification for use in a lab. We’ll see how, within historical epochs in musical (and other creative) movements, artists were targeting various brain regions, emphasizing or de-emphasizing them as a form of internal investigation. I did this first section in fairly rigorous detail, because it’s important to clearly define these ingredients before showing how composers manipulate, remove, or subvert them.

Thanks for reading! More cake to come. And the cake is not a lie.

Zaub’s Excuse Mui Soars

Zaub - Excuse Mui

It’s always hard to start these sorts of things out. Like, should I say Zaub is a band? An ensemble? A collective? I also want to avoid words like fusion and eclectic. Those are old terms that don’t really mean anything, as far as I can tell. All musicians combine past with present to find new ways to express their voice. It’s all music, period.

So, that said, I’m going to avoid trying to classify Zaub too finely. They bring a lot to the table. I’m also going to call them jazz rock, even though that kind of puts a sour taste in my mouth. Most concrete labels do. But the jazz here is so strong, and so is the rock, so that’s about as good as it’s going to get.

They have a very strong Middle Eastern flavor, brought of course to the forefront by frontman Toofun Golchin’s impressive compositions and masterful playing. His lead guitar is often answered by Yunus Iyriboz, both of whom come from a strong background of playing bluesy stuff in international modes and rhythms. Max Whipple plays a bass with a ton of strings, and Colin Kupka brings extremely strong jazz and solo chops into the mix. Finally, the infallible duo Dan Ogrodnik and Amir Oosman, who also play together in Rhein Percussion, round out the cast of Zaub’s third album, Excuse Mui.

The record begins with the sonic equivalent of grabbing someone by the collar and making them sit down next to a pair of speakers. Listen up, people, seriously! It’s a great attention-grabber, and really sets the tone for the rest of the 4-track album. You’re going to get some soft moments, but a lot of Excuse Mui really rocks. Most of the time you’ll be listening to various incarnations of Toofun’s secretly catchy themes and melodies. He’s really found common ground in both jazz and Middle Eastern styles by acknowledging the repetition of themes in rotating timbres used by both traditions. That, I think, is my favorite part of this record. Thanks in large part to smart playing by Dan, Amir, and Max in the rhythm section, the transitions between each tune’s sections/movements happen so smoothly, I guarantee you’ll be surprised at least once to find you’re in some new, soothing little musical realm with no earthly idea how you got there. This is music, after all, so who needs earthly ideas anyway? This music soars.

It’s a wonderfully consistent album in terms of quality, but you’ll hear that soaring quality particularly in the solos. Collisions features a great guitar solo, and Levitation opens with really lovely world percussion from Dan, plus it has an epic return to the A section that’s maybe one of my favorite moments of the album. You can hear this in the video linked below.

Ode to Ornette in particular has a great vibe to it, and while I wouldn’t really call it “free” jazz, it’s definitely reasonably liberated. This tune actually sounds a lot more like bebop to me, thanks in large part to Colin’s superb solo. This is one of those songs that always makes me wonder what the hell happened to epic sax solos in rock songs. A really well-done subtle outro takes us to the final track, which finishes up the album.

In the interest of musical combinatorics, Zaub successfully merges virtuosity with the rule of cool. Deceptively earworm-ish melodies and internationally inspired rhythm structures make Excuse Mui a fresh, satisfying sound. It’s well worth a listen. You can find more about Zaub in the links listed below:

iTunes: itunes.apple.com/us/album/excuse-mui-ep/id1092685596
Official Website: zaubnasty.com/#!audio/c1577
Facebook: facebook.com/ZaubNasty/
Soundcloud: soundcloud.com/zaubnasty
YouTube: youtube.com/user/ZaubNastyMusic

Thanks to Austin Antoine for You’re Very Welcome

The new album from renowned underground performer and Hodgepodge artist Austin Antoine does not disappoint. Produced, mixed and mastered by the talented Mister-he and sporting a gorgeous cover by Amy Lee, You’re Very Welcome opens with a sense of narrative foreshadowing, teasing the listener with things to come. Vocal processing, lush production, anything goes. You know we’re sonically referencing heroes here, just as the opening monologue suggests, but you’re not sure which ones. Is this Andre 3000 at his most spacey? Gambino at his most theatric? We’ll see.

Wasting no time, Antoine’s first full track is a classic banger, self-aware on several levels. We discuss the artist’s place in the scene; we’re assured the artist is young and strong, connected but humble. The rhymes flow, the chorus bangs, the beat bumps. This track knows what it is, and Antoine wants to make sure we know where we’re at.

With Streets of Broken Dreams, Antoine starts to throw the listener some curves. This beat swings pretty heavily, and one can almost imagine some old music-man with white gloves, a cane and skimmer hat, airing out biting sarcasm as if the suffering performer’s woes are just another part of the show.

As the album progresses, it seems like Antoine starts to introduce this crazy new idea where he’s more than “just” a rapper. He’s a performer, and he’s got the pipes to prove it. I love when an artist has put some real thought into their track order, and it’s no accident that the hints of Austin singing toward the end of the woo-me track Summer Days lead into the astoundingly soulful interlude, Kelsey. Contrapuntal, a cappella melodies sung entirely by Austin take us deeper down the rabbit hole, exploring what hip hop means besides some dude rapping over beats. With You’re Very Welcome, Antoine is taking the listener through a lesson in first impressions. Every artist has a journey that transcends genre, and few albums I’ve heard capture that concept as well as this one.

Got a problem with his singing? Unless you’re Nas (or even then, maybe) you better get over it. That’s the message in Rahzel/Aaliyah, a raw callback track that says, who gives a shit? Austin knows where he’s coming from, and he knows he can freestyle circles around anyone who steps to him. How many rappers out there can use the words “Guinness World Record” in their list of accolades? He’s been killing it for years, but this album is a new step for Antoine. He has accomplishments to back up his confidence. Listen to POWER!!, and tell me you’ve heard anything like this before. Just like with the intro, we know this style is coming from somewhere, from Austin’s heroes, but amalgamated into that dope freshness that speaks for itself. Hell yeah he likes video games, and hell yeah he’s watched Dragonball Z, and hell yeah he can rap like a beast.

You’re Very Welcome represents the new breed of artist. We don’t know if it’s hip hop. We don’t even know if it’s a record. It’s a work done by an individual who is navigating this strange new experience of becoming a performing adult with integrity amidst peers who don’t remember a time without email. Austin has really captured a moment here, and demonstrated tremendous personal growth in a truly relevant release.

You can follow Austin on his website, Soundcloud, Twitter, Facebook, and Instagram. You can find him on Hodgepodge Records, and you can pay what you want for You’re Very Welcome at his Bandcamp.

Austin Antoine

Come see my band Madapple play our debut show in Hollywood!

I am very excited to announce my amazing new band Madapple will be performing TWO SETS for our inaugural performance at the super cool Sassafras Saloon in Hollywood! The show is FREE and goes from 9 to midnight, and it’s in the part of Hollywood where parking isn’t completely impossible. Hope to see you there!

Expect more updates about Madapple as we build up to a tour some time this summer. And hey, consider adding us on Facebook or Instagram. Thanks, hope to see y’all there!

The new Boss RC-505 and my cover of Swimming Pools (Drank) by Kendrick Lamar

The Boss RC-505 Loop Station is my newest purchase, funded in part by my Patreon page, which is amazing. I also have a new SD card! That’s three milestones down because of my Patrons. Woohoo! Thanks!

The RC-505 is my first real looper, as in not a delay pedal like my old (also-Boss) Giga Delay. It has five channels, each with a mix fader, a stop/start/rec/overdub button, and an edit button related to post-FX. There’s another knob for pre-FX and three memory slots for those FX, which is awesome. It has a built-in customizable metronome and, best of all, PHANTOM POWER!! This means for the first time in my life I can loop using my nicer, more powerful microphones. Serious game changer, as you can see in the above video I’ve just released!

The ability to stop and start tracks is a huge deal, and really allowed me to delve into Kendrick’s song. It’s much easier now for me to cover full songs, meaning I won’t have to pull a Spy vs. Spy any more like in the BANKS video (as much fun as that was).

Swimming Pools (Drank) is really amazing. It turns the hip hop/club trope of having songs about alcohol on its head. Instead, the song is about the peer pressure to drink when you don’t want to and/or have already had too much. The production on the track is extremely lush, so you can see in the video how much I’m riding the Boss pedal’s reverb levels for different sections of the song. The track almost sounds like it’s underwater. Like in a swimming pool maybe. Do you see what they did there. DO YOU SEE WHAT THEY DID THERE.

Anyway, it’s an awesome track and the second verse is like one of my favorite things ever. Figuring out each one of these songs is always a really fun puzzle, and that’s why I love doing this. Can’t wait to see how my channel progresses now that I’m learning this new gear!

(Although, if they’re not foot pedals any more, what are we going to call them? “Hey, check out this new station I got” just sounds weird, am I right?)

My Parody and How To Avoid YouTube Content ID Detection

My most recent video “Parody” is about the frustrations of copyright trolls on YouTube. This video is now the subject of an ongoing battle with copyright trolls on YouTube. It’d be funny if it wasn’t so annoying.

Based on Pharrell’s “Happy”, I thought it would be protected as fair use, as a form of commentary on the legal hurdles of the song and any other major label tune. It turns out that’s not the case. Even though I’m complaining about the copyright of the song, monetizing the track means they can still claim the melody (I guess?) and they can take it down if they want to.

One of the most frustrating things about Content ID, which is YouTube’s content and metatag crawler, is that the appeal and dispute process is set in increments of 30 days: 30 days to acknowledge, 30 days to appeal, 30 days to wait for the claimant’s response to your appeal. Of course they always wait as long as they can to respond, or if they know they can’t refute the dispute, they don’t ever say that. They just let it lapse, which takes, you guessed it, 30 days.

In the above video, since I’m not commenting on the song directly (I’m making fun of the dumb lines and non sequitur metaphors, but I’m not specifically stating they’re bad) and I’m not specifically refuting the heart of the work (my song’s not called “Crappy”, which would be a direct response to the theme of “Happy”), I’m probably just out of luck.

Three things you can do to avoid Content ID on YouTube

  1. Record every part yourself. If you re-record every part yourself, especially if each track is distinctly different (as they are here since I do them all vocally instead of using the same instruments as the original), Content ID will have a difficult time recognizing the similarity to the original.
  2. Don’t tag your video with identifying content. If you put the name of the artist and/or song in your video tags, Content ID will flag you even if you don’t have any copyrighted content in your video. What’s hilarious about this is that, in your appeal for an erroneous identification, there is no option that simply states “This video was mistakenly flagged”. Seriously. Their system makes constant mistakes, but they refuse to admit it. Weird, huh?
  3. Don’t put the name of the artist and/or song in your video title. Same as with the tags. Since Content ID isn’t good at identifying your song if you re-record all the parts, it will use your video title to try and see if you’re using copyrighted content, even if you aren’t.

You’ll notice these tips will negatively affect your search results in a big way. This is certainly not accidental. And if you’re doing a straight cover, odds are it will still catch you on word recognition in the lyrics. But I believe the above is correct because, after watching this helpful video directed at videogame reviewers, I avoided the above three steps and had no problematic claims on my video. They want you to think their crawler is some uber-accurate identigod, but mostly it just reads whatever you type in your description, tags and titles.

It wasn’t until about a month after the initial release, when my social media views were drying up, that I added the tags and extra info in the title mentioning the words “Pharrell” and “Happy”, after which my video was quickly hit with three different copyright claims. I’ve successfully disputed two out of three by claiming fair use, but unfortunately EMI Music Publishing/CASH/UMPG Publishing has chosen to reject my dispute and reinstate their claim. The page in which this all happens looks like this:

I have now re-appealed my dispute, which is identical to the first one and which I adapted from the video linked earlier. It reads like this:

This recording is a parody and form of commentary and critique and uses no source material from the original recording. As an a cappella recreation the audio is transformative and by sending this very notice becomes relevant to the critique of the piece and its legal status inherent in my parody. The video and audio were created entirely by myself on my own property. My parody is considered Fair Use by both YouTube and Federal Copyright Law. A Fair Use parody does not legally require the copyright holder’s permission, and I have every legal right to upload original content in the form of a parody. For further proof and information on Fair Use, please refer to: http://www.copyright.gov/legislation/dmca.pdf and additionally http://www.youtube.com/watch?v=S521VcjhvMA&t=14m16s Thank you for taking the time to verify the clips to see that my usage does not violate copyright.

Five reasons this sucks hard

  1. I will probably have to wait another 30 days (or close to it) to find out what their response is.
  2. Their response is literally at their whim. They just click whatever they want and aren’t accountable for their choice at all, even if it’s illegal.
  3. If they reject my dispute, even if their rejection is illegal, I get a strike on my YouTube account. Three strikes and they delete your account.
  4. They can get my video taken down at any time, even if that request is illegal.
  5. We are all arguing about, literally, nickels and dimes here. I make almost no money off of ad revenue because I’m just not popular enough. These arguments are entirely for the principle of the thing. I just hate to watch bullies win.

Parody is going to reach 1,000 views faster than any other video I’ve made. That isn’t much to them, but it’s huge for me. Just let me put up my video and not have to deal with these jerks who are making the most ironic copyright claim in history. Thanks for reading, and I hope you learned something. Or, if not, at least enjoy the song :)