An ongoing debate among scientists over why chimpanzees and other nonhuman primates cannot speak or sing like humans has focused mainly on evolutionary changes in human brain development. Attention has now expanded to anatomical changes in the voice box that may have played a role in our capacity to produce complex sounds.
A team of researchers from Japan and Europe has now revealed that evolution of the human larynx contributed to the stable voices we use to communicate. Unexpectedly, these changes do not include the addition of structures but rather the loss of specific membranes on the vocal folds, or cords, in the larynx.
"Paradoxically, the increased complexity of human communication involved a simplification of our vocal anatomy," says lead author Takeshi Nishimura of KyotoU's Center for the Evolutionary Origins of Human Behavior, or EHUB.
Most primates have thin, ribbon-like vocal membranes rising out of their vocal folds. The loss of these membranes, which are still seen in chimpanzees and other apes, seems to have provided the stable voice quality and controllable pitch that we humans use when singing or speaking.
Nishimura adds, "Studies by the late Dr Sugio Hayama, on which our work was largely based, showed that evolutionary modifications in the larynx were necessary for the evolution of spoken language. We took his work to the next level, demonstrating that the simpler the vocal fold morphology, the easier it is to control its vibrations."
Senior author Tecumseh Fitch of the University of Vienna explains that the thin vocal membranes found in the larynges of the team's large sample of monkeys and apes are specific to nonhuman primates. Based on computer modeling showing how vocal membranes allow nonhuman primates to create their characteristic vocalizations, the team posits that the melodious quality of the human voice directly results from losing these membranes during evolution.
"Inside the larynx of vocalizing chimpanzees and monkeys, we see active vibrations of their vocal membranes causing loud and unstable scream-like calls," Fitch says.
According to Isao Tokuda of Ritsumeikan University, whose study of nonlinear dynamics in animal vocalizations led to his investigation of voice production in chimpanzees, the presence of additional vibrating tissues on the vocal folds may increase the vibrational degrees of freedom, causing frequent vocal instability.
"By avoiding this instability, humans possibly achieved stable source sounds, accelerating the evolution of human language."
Evolutionary biologist Jake Dunn at Anglia Ruskin University notes, "Using the comparative method to reconstruct our evolutionary past has shown that, if humans alone lack the vocal membranes that virtually all nonhuman primates have had as a trait, we may have lost it in our recent evolution despite sharing a common ancestor."
Austrian voice scientist and former KyotoU scholar Christian T Herbst sees the apparent tradeoff between the reduced voice-box complexity and our increased ability to create and transmit enriched verbal information as a "movement of the ability to produce complex vocal information from the throat to the brain."
Ole Næsbye Larsen at the University of Southern Denmark notes that "a comparison of extant species is often used to infer the evolution of traits, such as animal behavior, that do not leave a fossil record. Our past video recordings of how the squirrel monkey voice box works during vocalization now seem to support a hypothesis on the evolution of the human ability to speak."
Nishimura concludes, "Other changes, including those in our brains, were also needed to gain language, of course, but this anatomical simplification probably accelerated the accuracy with which we sing and speak."
Takeshi Nishimura, Isao T. Tokuda, Shigehiro Miyachi, Jacob C. Dunn, Christian T. Herbst, Kazuyoshi Ishimura, Akihisa Kaneko, Yuki Kinoshita, Hiroki Koda, Jaap P. P. Saers, Hirohiko Imai, Tetsuya Matsuda, Ole Næsbye Larsen, Uwe Jürgens, Hideki Hirabayashi, Shozo Kojima, W. Tecumseh Fitch (2022). Evolutionary loss of complexity in human vocal anatomy as an adaptation for speech. Science, 377(6607), 760-763.