Do Musicians Hear Speech Differently?

The Case of Speech Prosody

We often think of music as separate from language. After all, while many people can speak and understand speech fluently, some of us struggle to hear whether a singer is off key. Yet research suggests that music and language may have more in common than initially meets the ear.

Prosody is a good example. It's the name linguists give to changes in pitch and rhythm in speech. For example, consider the two sentences in Figure 1 below:

Figure 1. Why prosody matters!

The two phrases are nearly identical, except for the comma. In normal conversation, we can’t see the comma. Therefore, people increase their pitch and elongate the syllable just before the comma to signal where it belongs. So if you’re Grandma in this scenario, whether you run for your life depends on how you interpret prosody.

But it turns out that we all use these cues a bit differently: some people listen only for pitch, while others care more about duration. Sometimes, which cue you rely on can mean you hear entirely different things! But why do people differ so much? In a recent study, Symons and Tierney (2024) suspected that music might be the key. Specifically, they predicted that musicians, who have learnt to pay close attention to pitch in music, might pay more attention to pitch when interpreting speech.

About the Study
To test this, the researchers used voice morphing software. This software takes two different phrases and makes new versions where pitch suggests one interpretation (e.g., “eat, Grandma”) and duration suggests another (e.g., “eat Grandma”). The researchers looked at two types of prosody: where the comma is in the sentence and which word is emphasised. In English, duration is more helpful when figuring out where the comma is, while pitch is more helpful when working out which word is emphasised.

This study was conducted online, with people participating from anywhere in the world using their own equipment. Online testing is increasingly common in psychology because it allows researchers to recruit more diverse samples of people. This means their findings might be more representative of the whole population. It is also a quick and cost-effective way of recruiting people who can be hard to reach, such as trained musicians. Ultimately, 34 musicians and 43 non-musicians took part in the study.

What They Found
It turns out that the researchers were only partially correct. Musicians were not always biased towards using pitch. When pitch was useful for distinguishing which word was emphasised, musicians used pitch more. But when duration was more useful, like when distinguishing the position of the comma, musicians used duration more. This suggests that musicians were better at figuring out the best cue to focus on.

But What’s the Catch?
Although these results suggest that learning music can affect how we interpret speech, there are some important caveats to consider.

While online testing enables researchers to access a larger and more diverse participant sample (Gosling et al., 2004), it could also mean that some people completed the study in distracting environments. Subsequent research by the same researchers found that distraction can affect cue use: to cope with distraction, listeners shift from relying on a single cue (e.g., pitch) to using multiple cues (Symons et al., 2023). If the musicians happened to be in quieter environments, or to own headphones with better noise cancellation, apparent differences between musicians and non-musicians might reflect listening conditions rather than musical training. Future studies could either replicate these results in the lab, where the auditory environment is consistent across participants, or ask online participants to complete a follow-up questionnaire about their environment (Bianco & Chait, 2023).

The researchers also did not place many limits on who could take part, as long as they were native English speakers. However, prosody differs across varieties of English (British versus American, for example). These differences could create variation that has nothing to do with musical experience. Online platforms offer geographic filters that could be used to ensure all participants speak the same English variety. However, this could reduce the study’s generalisability. Instead, one option would be to look at how well these findings replicate across different varieties of English, or even different languages!

Take Home Message
While this work provides important evidence that experience with music can affect the way we understand speech, we need more research to fully understand how and why music shapes our hearing.


References

Bianco, R., & Chait, M. (2023). No link between speech-in-noise perception and auditory sensory memory – Evidence from a large cohort of older and younger listeners. Trends in Hearing, 27, 23312165231190688. https://doi.org/10.1177/23312165231190688

Gosling, S. D., Vazire, S., Srivastava, S., & John, O. P. (2004). Should we trust web-based studies? A comparative analysis of six preconceptions about internet questionnaires. The American Psychologist, 59(2), 93–104. https://doi.org/10.1037/0003-066X.59.2.93

Symons, A. E., Holt, L. L., & Tierney, A. T. (2023). Informational masking influences segmental and suprasegmental speech categorization. Psychonomic Bulletin & Review, 31(2), 686–696. https://doi.org/10.3758/s13423-023-02364-5

Symons, A. E., & Tierney, A. T. (2024). Musical experience is linked to enhanced dimension-selective attention to pitch and increased primary weighting during suprasegmental categorization. Journal of Experimental Psychology: Learning, Memory, and Cognition, 50(2), 189. https://doi.org/10.1037/xlm0001217