In this episode of Human Internet Theory, I explore the science behind why a human voice connects with listeners in a way an AI voice cannot. Recent brain scan studies show that our brains work harder and remember more when listening to a human speak vs an AI voice. I break down what this means for you as a small business owner, marketer, or digital educator and how you can use this knowledge to make your instructional content more effective.

I go over the core components of vocal expression, including speech rate, cadence, prosody, and intonation. Understanding these elements is the first step to controlling them and creating a delivery that feels authentic and engaging. I share some history from composer R. Murray Schafer, whose work on “soundscapes” changed how I perceive the voice as a musical instrument.

You’ll get a checklist you can use to analyze your own recordings and identify areas for improvement. I also walk you through three specific exercises for pace control, rhythm, and pitch variation. These drills are designed to help you add more variety to your delivery, hold your audience’s attention, and build a stronger connection with every episode you produce.
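
If you want to automate part of that checklist, the words-per-minute comparison and the filler-word count can be sketched in a few lines of Python. This is a minimal sketch, assuming you have a plain-text transcript of a section and know roughly how many seconds it ran. The function names and the filler-word list are illustrative assumptions, not something taken from the episode or the booklet.

```python
import re

# Assumed filler list for illustration; swap in the fillers you tend to use.
FILLERS = ("um", "uh", "like", "you know")


def count_words(text: str) -> int:
    """Count word-like tokens (letters, digits, apostrophes)."""
    return len(re.findall(r"[A-Za-z0-9']+", text))


def words_per_minute(text: str, seconds: float) -> float:
    """Words per minute for a section that lasted `seconds` seconds."""
    return count_words(text) * 60 / seconds


def estimated_minutes(script: str, wpm: float) -> float:
    """Rough read time for a script at a given speaking rate."""
    return count_words(script) / wpm


def count_fillers(text: str, fillers=FILLERS) -> dict:
    """Count whole-word (or whole-phrase) occurrences of each filler."""
    lowered = text.lower()
    return {
        filler: len(re.findall(rf"\b{re.escape(filler)}\b", lowered))
        for filler in fillers
    }
```

Running `words_per_minute` on a story section and a tutorial section of the same episode gives you two numbers to compare. The filler count will also pick up legitimate uses of a word such as "like", so treat it as a prompt to listen again rather than a score.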

Resources Mentioned:

Previous Episode on Words Per Minute: On YouTube
Frontiers in Psychology Study (Human vs. AI Voice): Study
Book: “When Words Sing” by R. Murray Schafer: On Open Library
Free PDF Download: “Vocal Expression Booklet” at https://humaninternettheory.com

==========================

About and Support

==========================

Written, edited, and hosted by Jen deHaan.

Find this show on YouTube at https://youtube.com/@humaninternettheory

Subscribe to this show's newsletter for additional resources and a free 3-page workbook when you join https://humaninternettheory.com

Produced by Jen deHaan of StereoForest https://stereoforest.com

Contact Jen at https://jendehaan.com

==========================

Connect on Socials

==========================

Support

Your support will help this show continue. Funds will go towards hosting and music licensing for this show and others on StereoForest. This show is produced by an independent HUMAN artist directly affected by the state of the industry. StereoForest does not have any funding or additional support.

If you find value in our shows, please consider supporting them with a one-time donation at https://stereoforest.com/tip

We love our podcast host Captivate.fm! Contact me anytime to ask me anything. You can support my shows by signing up with Captivate here: https://www.captivate.fm/signup?ref=yzjiytz

==========================

About Jen

Jen's professional background is in web software technology (audio/video/web and graphics), working for many years in Silicon Valley. She has worked in instructional design, writing, marketing, and education in the creative space. She was also a quality engineer for a while.

Jen became involved in performing, acting, and improv in 2015. She taught dance fitness classes (despite beginning with two left feet), performed in community theatre, and taught and coached improv comedy and acting at several theatres. Jen was also the Online School Director and Director of Marketing at WGIS.

Jen's website: https://jendehaan.com

This podcast is a StereoForest production. Made and produced in British Columbia, Canada.

Transcript

WEBVTT

:: 00:00

[MUSIC PLAYING]

:: 00:03

So recent brain scan studies have

:: 00:08

shown that your brain works harder and remembers more

:: 00:12

when listening to a human voice versus an AI one.

:: 00:17

We're going to break down why that is

:: 00:19

and how you can use that science in your own episodes

:: 00:22

and instructional content.

:: 00:24

Now, in the last episode, we talked

:: 00:26

about the speed that you speak at and how

:: 00:29

to figure out your own words per minute or WPM.

:: 00:34

That information will be useful for some of the exercises

:: 00:37

I describe in this episode.

:: 00:38

So if you missed it, you might want to check it out

:: 00:41

and then come back to this episode.

:: 00:43

I'll put the link in the show notes.

:: 00:46

By consciously and skillfully using the elements of your speech,

:: 00:51

things like your rate and cadence and pitch and volume,

:: 00:54

you can cut through the digital noise

:: 00:57

and then form a bond of trust that algorithms cannot replicate.

:: 01:03

It's your most effective instrument

:: 01:05

for reclaiming a fundamental part of our humanity online.

:: 01:10

We now have scientific proof that our brains respond differently,

:: 01:14

like more intensely, to a human voice

:: 01:17

than an artificial one, at least for now.

:: 01:20

In this episode, we're going to talk about vocal expression

:: 01:24

and the musicality of the voice.

:: 01:26

And how it affects those listening to it.

:: 01:30

We will deconstruct the technical components of speech.

:: 01:34

And I'll offer a really useful toolkit,

:: 01:37

I hope, of exercises that you can use to practice.

:: 01:41

And I'll also help you set up a way to analyze your own work.

:: 01:47

Now, before we go too far into this topic,

:: 01:50

the words you say, like the information itself,

:: 01:54

those things are very important.

:: 01:57

For example, the research on improving learning effectiveness,

:: 02:01

the research says things like reducing complex and long words,

:: 02:06

avoiding really long sentences.

:: 02:09

So the content of your message is the foundation

:: 02:12

for all of this in general.

:: 02:14

But in this episode, we're specifically focusing

:: 02:17

on how you say those words.

:: 02:19

So the tone, the pitch, the speed, the rhythm, the pacing.

:: 02:21

So we're going to get to those practical exercises,

:: 02:24

but first we're deconstructing the different parts

:: 02:27

of vocal musicality.

:: 02:28

Understanding these components is the first step

:: 02:31

towards controlling them.

:: 02:33

So let's break down the components

:: 02:34

that make your voice expressive instead of robotic and monotonous.

:: 02:39

So the first one of those things is speech rate.

:: 02:42

So this refers to the speed at which you speak.

:: 02:45

And we discussed in the last episode one way to measure that,

:: 02:49

and that's in words per minute, or WPM.

:: 02:52

For those of you who script your work, WPM

:: 02:55

is also a really useful way to measure your pace,

:: 02:58

because you can work out your transcripts,

:: 03:01

how long it's going to take you to give a performance,

:: 03:04

for example.

:: 03:05

Next we have cadence.

:: 03:07

So speech rate is the speed, and cadence is the rhythm.

:: 03:11

Cadence is like that rhythmic flow and pacing of your words,

:: 03:16

and this can create a sense of melody as well.

:: 03:20

Each language has its own natural rhythmic patterns.

:: 03:24

And mastering this, getting cadence right down,

:: 03:27

is essential for delivering performances

:: 03:30

that feel really authentic and human and engaging.

:: 03:34

Next there's prosody.

:: 03:36

This is a really broad term for the rhythmic

:: 03:39

and tonal features of speech that go way beyond individual

:: 03:44

sounds like the vowels and consonants of a word.

:: 03:47

So prosody includes the intonation, stress, rhythm,

:: 03:52

loudness, duration, the intensity, and the pitch.

:: 03:57

And prosody is what gives your words

:: 04:00

that sort of nuanced meaning that we kind of just

:: 04:04

understand without it being spelled out,

:: 04:06

like your emotional state or your attitude, what

:: 04:10

you're feeling towards the subject.

:: 04:12

Prosody is how you communicate things like sarcasm and irony.

:: 04:17

And it's how you distinguish a question

:: 04:20

from a statement of fact.

:: 04:23

Finally, we have intonation.

:: 04:25

So that's technically part of prosody.

:: 04:27

You might have heard me say that.

:: 04:29

But intonation is so important that it's kind of worth calling out

:: 04:33

on its own, because it's that melody of a sentence

:: 04:37

in particular.

:: 04:38

So the rising intonation at the end of a sentence--

:: 04:42

in Canadian English, this is typically a yes or no

:: 04:46

question that I'd be asking.

:: 04:48

While a falling intonation means I'm

:: 04:51

making a statement of fact.

:: 04:52

This can be different region to region

:: 04:55

and even within the same language.

:: 04:57

So you also use intonation to convey emotions and attitude

:: 05:03

and to draw attention to important words.

:: 05:07

These things all go together to create the music of your voice.

:: 05:13

So putting it all together, speech rate, overall tempo,

:: 05:17

how fast or slow the piece is played.

:: 05:19

Intonation is that melody, like the sequences of high and low

:: 05:24

notes.

:: 05:25

Volume and stress are the dynamics of your words.

:: 05:30

These are like the variations in loudness that create emphasis.

:: 05:34

And cadence is the resulting rhythm of your words.

:: 05:39

The characteristic pattern that emerges

:: 05:41

from the combination of all these things together.

:: 05:46

So the musicality of your voice is so important when

:: 05:49

you're educating and presenting or you're acting and doing

:: 05:52

a character, because they're all different character

:: 05:56

to character.

:: 05:57

So I first started thinking about this a lot

:: 06:01

after I read a book by R. Murray Schafer,

:: 06:03

and that book was called "When Words Sing."

:: 06:07

Schafer was a composer who saw all sounds,

:: 06:11

including the voice musically in a music context.

:: 06:15

He taught that we should listen to the pitch and the texture

:: 06:19

and the rhythm of words.

:: 06:21

We should feel them and express them.

:: 06:24

He wanted to listen to words like music.

:: 06:28

So here's a quote from a paper by Schafer.

:: 06:32

"What I mean by acoustic design is

:: 06:36

to regard the soundscape of the world

:: 06:38

as a huge musical composition unfolding around us ceaselessly.

:: 06:44

We are simultaneously its audience, its performers,

:: 06:48

and its composers."

:: 06:50

I found that just gorgeous.

:: 06:52

And I read his book a few decades ago

:: 06:55

when I was at university, and it just completely changed

:: 06:58

my mind on listening to the voice.

:: 07:01

I started listening to the voice as a musical instrument

:: 07:06

just while it talked.

:: 07:08

An example I remember from the book

:: 07:09

was listening to the sounds within a coffee shop,

:: 07:13

which he called a "soundscape."

:: 07:14

So he saw those as full, orchestral-like pieces of work.

:: 07:19

He suggested that you could listen to them creatively

:: 07:22

and artistically like a symphony, all the voices,

:: 07:26

and the other sounds in that coffee shop.

:: 07:30

So the book and his recordings,

:: 07:32

he had recordings of this stuff,

:: 07:33

really changed the way that I listened.

:: 07:37

And it changed the way that I spoke too from that point on.

:: 07:41

Now Schafer's emphasis on this inherent rhythm of words

:: 07:47

connects directly to cadence and prosody.

:: 07:50

He talked about how vocal variety is what we use

:: 07:55

to sustain an audience's attention.

:: 07:58

So monotony means boredom,

:: 08:00

and that feeling translates directly to your listener,

:: 08:05

whoever's listening to what you say.

:: 08:07

And as a side note, after reading this book,

:: 08:10

I had to give a presentation on my report

:: 08:13

when I was at university.

:: 08:15

And I was so excited because this had just like

:: 08:18

blew my mind thinking about these topics,

:: 08:21

but I have never seen such a bored audience

:: 08:25

in my entire life when I gave that report.

:: 08:28

Even though I felt super excited,

:: 08:31

obviously it didn't translate to my voice,

:: 08:34

my delivery, I think went completely against everything

:: 08:38

that I learned in the book.

:: 08:39

Here I am gushing about this book decades later still,

:: 08:44

and hopefully you're a little bit less bored

:: 08:47

than my audience back then.

:: 08:49

Hopefully that's true,

:: 08:50

but you can let me know in the comments

:: 08:52

or send me a message if you're listening to the podcast.

:: 08:56

You can say, no, I'm still incredibly bored

:: 08:58

by that whole Schafer thing.

:: 08:59

This script, the content of your message

:: 09:02

is just one layer of what you're putting out there.

:: 09:06

So how you say things can completely change the meaning

:: 09:11

of the words that you're actually saying.

:: 09:14

And this can change the impact that you have

:: 09:17

on the audience, your students,

:: 09:18

whoever it is listening to you.

:: 09:21

And this isn't just a feeling or a philosophical idea

:: 09:25

that I have, it's actually something

:: 09:27

that we can now measure.

:: 09:29

So a recent study that was published

:: 09:32

in the journal Frontiers in Psychology

:: 09:34

looked directly at this issue.

:: 09:37

Researchers used an EEG to monitor people's brain activity

:: 09:41

while they listened to both human

:: 09:44

and AI-generated newscasts in China.

:: 09:47

As a result, they found out that the human voice

:: 09:50

triggered significantly greater brain activity

:: 09:54

than the AI voice.

:: 09:56

Specifically, it targets the parts of the brain

:: 09:59

that are associated with processing,

:: 10:02

comprehending, auditory information, and working memory.

:: 10:06

Those were all a lot more active

:: 10:08

when they were listening to a human.

:: 10:11

This data suggests that the subtle,

:: 10:14

natural human variations in our voices

:: 10:18

that musicality or prosody would be involved

:: 10:22

are more than just the noise of the words coming out.

:: 10:26

That it's like not a benign thing.

:: 10:28

What you add as a human being are signals

:: 10:31

that can help your listeners,

:: 10:34

their brains engage and understand

:: 10:37

and remember the things that they are hearing from you.

:: 10:41

So the psychology of you being heard is real.

:: 10:45

It's a biological response that can be measured

:: 10:49

by an EEG at least.

:: 10:51

So I'll put a link to that study in the description

:: 10:53

if you want to check it out.

:: 10:55

So the human brain is tuned to decode this rich stream

:: 11:00

of social and emotional information

:: 11:04

that's baked right into the human voice.

:: 11:08

Listeners, your audience subconsciously use

:: 11:11

these vocal cues that you put out

:: 11:14

to construct a mental model of the speaker, you,

:: 11:19

and they make judgments about identity and personality

:: 11:23

and confidence and intent.

:: 11:26

I'll talk more about how your volume and tonality

:: 11:30

add elements of persuasion and motivation

:: 11:33

in the next episode.

:: 11:35

I want to spend the rest of our time today

:: 11:38

on how you can analyze your own recordings

:: 11:41

and then get into some practical exercises.

:: 11:45

So in the last episode,

:: 11:46

we talked about how variability is important.

:: 11:50

You need a range of speeds and delivery styles

:: 11:53

to just suit your content and your platform.

:: 11:56

Variability is really important.

:: 11:58

This makes your delivery feel natural

:: 12:01

because it's what we just do day to day as humans

:: 12:05

when we talk to other humans.

:: 12:06

It also naturally keeps human attention.

:: 12:10

So here's a checklist that you can use for your analysis.

:: 12:14

Compare your words per minute between your videos.

:: 12:17

Look at the WPM between different episodes

:: 12:19

and especially between different sections of the same episode.

:: 12:24

Like when you tell a story versus when you give a tutorial.

:: 12:27

Next, you want to listen for vocal variety.

:: 12:30

So is your delivery monotonous?

:: 12:33

Does your pitch go up and down?

:: 12:37

Does it stay all on one note?

:: 12:39

Is your volume dynamic?

:: 12:42

Does it change or is it flat?

:: 12:45

Next, you want to consider filler words.

:: 12:47

So you might want to count them up if you want

:: 12:50

or just listen for particular repeated filler words

:: 12:54

that you tend to use, like "um" or "like" or "you know," that's mine.

:: 12:59

So some of these are fine to leave in

:: 13:02

for a conversational feel.

:: 13:03

Like we have a reason as humans to use filler words.

:: 13:06

Like I'm not finished speaking yet.

:: 13:09

So you kind of put that filler word in

:: 13:10

because it signals to another human

:: 13:13

that you've got something more to say.

:: 13:15

But in a video, we don't need to do that

:: 13:17

'cause I have the control of the room here.

:: 13:19

I'm editing this thing.

:: 13:20

So too many of them might be distracting

:: 13:23

because they're not needed.

:: 13:24

The last one here is check your articulation and clarity.

:: 13:29

So is every word crisp and clear

:: 13:31

and easy to understand

:: 13:33

or are you mumbling or running words together?

:: 13:36

Poor articulation can make you sound unprepared

:: 13:39

and it's also harder to edit.

:: 13:42

Now that you've listened to your recordings

:: 13:45

and you have kind of an idea about what you want to work on,

:: 13:48

here are three exercises that can get you started

:: 13:51

on that work.

:: 13:52

And I'll also have some bonus exercises

:: 13:55

available on my website.

:: 13:56

So go to humaninternettheory.com

:: 13:59

and you can grab the vocal expression booklet

:: 14:02

on the front of the website there for free.

:: 14:05

So exercise number one is about pace and rate control.

:: 14:09

Bring up a paragraph to read.

:: 14:12

It might be something that you've written

:: 14:13

or content similar to what you produce.

:: 14:17

And you're going to read this paragraph

:: 14:19

aloud three times in a row.

:: 14:21

And I have a link to a bunch of public domain resources

:: 14:24

and scripts in that PDF if you want to use that.

:: 14:28

At first, read it as quickly as you can

:: 14:30

while staying perfectly clear.

:: 14:32

So focus on that clarity.

:: 14:35

And next, do it the opposite.

:: 14:37

Read it very slowly and deliberately,

:: 14:41

exaggerating and enunciating every single word.

:: 14:46

And for the last read,

:: 14:48

just find a middle ground between those two.

:: 14:50

Vary the speed as you see fit, as is appropriate,

:: 14:54

reading some parts faster and other parts slower.

:: 14:58

All right, so the next is about cadence and rhythm.

:: 15:01

So one of my favorite tools is the pause.

:: 15:05

I love using it in character work and acting especially.

:: 15:09

So for this exercise, take a script

:: 15:11

that's a little bit longer than a paragraph,

:: 15:14

maybe a few paragraphs,

:: 15:15

and practice inserting pauses for emphasis.

:: 15:20

So add a one or two second pause right before

:: 15:24

or right after keywords and phrases.

:: 15:27

As you read, scan for what's important

:: 15:30

and needs that extra emphasis.

:: 15:33

And exercise three is all about pitch and tone.

:: 15:38

So for this exercise, you want to choose a simple phrase

:: 15:42

and say it in as many different ways as you possibly can.

:: 15:46

So you'll be thinking about your pitch,

:: 15:48

your volume, your intonation,

:: 15:51

and all the different ways to enunciate the phrase.

:: 15:54

Think about different emotions for this one.

:: 15:57

And the free PDF download will have a long list of prompts

:: 16:01

that you can use for this.

:: 16:02

So you might have a simple, non-emotional phrase

:: 16:06

like damp basement or two bright lights or long tabby cat.

:: 16:11

You're gonna say one of those phrases in different ways

:: 16:16

to convey different meanings and emotions,

:: 16:19

realizing that there's a lot of ways

:: 16:22

to deliver the same words.

:: 16:25

Again, you can use these exercises

:: 16:27

and more from the Vocal Expression booklet,

:: 16:30

which you can grab at humaninternettheory.com.

:: 16:34

Now that you have a framework

:: 16:36

for understanding the musicality of your voice,

:: 16:39

a method for analyzing your own delivery,

:: 16:42

and three exercises to start practicing.

:: 16:45

Now your voice, think of it as an instrument.

:: 16:49

And like any instrument, it requires practice,

:: 16:53

those reps.

:: 16:54

Next Thursday, we'll get into the psychology of being heard

:: 16:59

and a delivery style,

:: 17:00

the conversational, presentational model.

:: 17:04

So I'll be back next week for another episode.

:: 17:06

Bye for now.

:: 17:07

(upbeat music)

:: 17:10

Episodes are written, directed, edited,

:: 17:12

and produced by Jen of StereoForest.com

:: 17:16

Find out more about this podcast

:: 17:18

and join our free newsletter for additional resources

:: 17:21

at humaninternettheory.com.

:: 17:24

Find additional videos at the YouTube channel

:: 17:26

called Human Internet Theory.

:: 17:28

Links are also in the show notes.

:: 17:31

(upbeat music)

:: 17:33

(gentle music)

Vocal Delivery for Creators: Practical Drills to Improve Your Sound
