Senior Project Part 5: Presentation of Conclusions and Capital 2026 Research Symposium Submission

Feb 26

To conclude my senior project and audio perception experiment, I gathered and compiled all of my data into a single digital spreadsheet. Along with listing my own observations of the subjects, this compiling took between 25 and 35 hours to complete, filling almost one hundred rows and seven columns with paragraphs of data and thoughts on perception from subject to subject. I established a ranking of each film clip from most immersive to least, and after analyzing each subject’s immersion rankings, I was finally able to come to my first concrete conclusions.
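For anyone curious how per-subject immersion rankings can be combined into one overall ordering, here is a minimal sketch in Python. It assumes each subject's ranking is stored as a list ordered from most to least immersive and averages each clip's position; this is only one possible approach, not necessarily the method used in the spreadsheet. The function name `overall_ranking`, the subject labels, and the clip names are all hypothetical placeholders and are not the study's real data.

```python
from statistics import mean


def overall_ranking(rankings):
    """Combine per-subject rankings into one ordering by mean rank.

    rankings: dict mapping a subject label to a list of clip names,
    ordered from most immersive (first) to least immersive (last).
    Returns a list of (clip, mean_rank) pairs, lowest mean rank first.
    """
    positions = {}
    for order in rankings.values():
        for place, clip in enumerate(order, start=1):
            positions.setdefault(clip, []).append(place)
    mean_rank = {clip: mean(places) for clip, places in positions.items()}
    # Sort by mean rank, breaking ties alphabetically by clip name.
    return sorted(mean_rank.items(), key=lambda item: (item[1], item[0]))


# Placeholder input only: hypothetical subjects and clips, NOT the study's data.
placeholder_rankings = {
    "subject_a": ["Clip A", "Clip B", "Clip C"],
    "subject_b": ["Clip B", "Clip A", "Clip C"],
    "subject_c": ["Clip A", "Clip C", "Clip B"],
}

result = overall_ranking(placeholder_rankings)
for clip, rank in result:
    print(f"{clip}: mean rank {rank:.2f}")
```

A lower mean rank means a clip was placed nearer the top more consistently. Averaging positions treats every subject equally, so a single unusual ranking shifts the result less than it would under a simple count of first-place votes.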

Here are a few pictures of Subject #1’s answers to the audio and visual clips, as well as my own observations and rankings of immersion based on the subject’s perceptions. The rest of the results on this spreadsheet can be found linked here: https://docs.google.com/spreadsheets/d/1kfDCZSOdQXjMmycGeiqDzPm25-6wDGngGOkt8t_Xqwk/edit?usp=sharing

After analyzing each subject’s sets of answers, I was able to draw some strong conclusions for how each film clip was perceived, sonically and visually. I was able to take all of the subjects’ immersion rankings and establish reasoning for certain films to be more surreal than others, and the observations and conclusions for each film clip are included below:

2001 -

Almost every subject noticed some kind of perspective angle change, and for different subjects, this sfx had different effects; some noted the different spatial sound qualities of each new perspective (#3, #5, #6, #10) while others were confused by it, some even believing certain perspective changes to signify the end of the clip (#1 and #4)

Within the dialogue, most subjects were able to feel tension; the general consensus was that this tension was caused by a contrasting cold betrayal from one voice (HAL) and a desperation in the other voice (Dave); this emotional contrast was noted in some way by every subject, with each describing how unnatural one side felt and how desperate the other felt; some subjects even had physical reactions as effects from the dialogue, with subject #6 getting an increase in their heart rate and subject #9 getting clammy hands and bodily tension

The background elements of the clip provided some subjects with more immersion and others with disengagement; specifically, subjects #1, , #5, and #8 found the background robotic sfx to be more grounding for the scene, increasing their immersion in the other sonic elements; subjects #2 and #4 found these sfx to be slightly disengaging, acting as more white noise to the subjects than a part of the scene’s environment

For most subjects, the audio was what carried the majority of the message and tone of the scene; on the other hand, most subjects noted that the visuals portrayed a lot more of the setting than they could perceive from the audio alone; additionally, subjects noted that the introduction of visuals brought their stimulation levels down slightly, in most cases, from the additional story context added to the clip (visuals = more context = less stimulation)

A Quiet Place Day One -

Since this was the clip with the most intense and frightening sound effects, I honestly expected more negative, disengaged reactions from my subjects; I was expecting more than only subject #1 to remove their headphones because of the intensity, yet not many others did, and a few subjects even seemed completely unfazed by the extreme sonic nature of the clip (#3 and #9 found nothing about the audio particularly unappealing or off-putting); instead, many of the subjects were not only able to handle the audio, but also had a very strong perception of how the sfx were making them feel

The driving emotion of the clip was undoubtedly fear; every single subject said this, with some adding variations to it (fear and panic, fear and confusion, fear and chaos, etc.), but the main theme of extreme fear is undeniably one of the few commonalities between all subjects; this fear arose from two main sources: the panic from the screaming within the clip and the confusion of not knowing what is happening (#2, #7, #8, and #10 mostly focused on the panicked fear from the screaming sfx while #3, #4, #5, #6, and #9 mostly focused on the chaotic nature of the scene’s sonic events)

Particularly notable is the change in auditory perception the clip produced in the subjects as the fear and chaos subsided, replaced by a more sombre, grief-filled score and more saddened emotions; subjects #2, #7, and #9 especially had an almost complete change in their emotional state from the beginning of the clip to the ending because of the intense difference between the high-intensity fear at the beginning and the low-intensity dismalness at the end

With visuals, a lot of subjects experienced a decrease in their stimulation levels. While not every subject was able to articulate why, I was able to come to the conclusion that their emotions were far more relaxed and less intense with visuals because there was another major sense being used to get context for the audio; in other words, the visuals provided an explanation, whereas the audio alone painted an uncertain picture in the minds of subjects; this furthers one of my main conclusions that visuals introduce the context of a scene, thus making it more understandable, and thus less stimulating; this film makes me believe that the visuals of films, especially high-intensity ones, are used as a balancing depressant/stimulant for the audio, and vice versa

Baby Driver -

For this movie, a lot of subjects had a focus on the difference between the diegetic and non-diegetic audio; the subjects all noted the music that was playing throughout the whole clip and could hear the different sfx of a city perspective laid over top of it, which allowed a lot of subjects to understand the general setting and layout of the scene; all subjects enjoyed this combination of sound effects and music besides subjects #9 and #10, who found it to be chaotic and overstimulating; for the most part, this mixture of diegetic and non-diegetic sound made for a more immersive experience, allowing the subjects to get a solid understanding of the scene without necessarily needing much of the visuals

In terms of perception, save for subjects #9 and #10, all subjects had significantly accurate mental pictures of the scene; all of them noted the setting to take place in a city based on the movement of cars, sirens, and people talking; all of them noted the climactic cafe scene in which the MC goes into a shop and gets coffee, and about half the subjects understood a lot of important sonic cues in this scene that the other half of subjects either did not perceive or didn’t find notable enough to mention; half of the subjects were able to hear decreases and increases in the music (which is really hard to perceive without the visual cue in the scene) and most subjects were able to hear how sfx within this universe would match in beat and tempo with the music playing

The entire air of the audio gave all subjects (save for #9 and #10) a general sense of confidence and peppy upbeatness; all the subjects could feel the ‘good vibes’ of the song and, with the sounds of the city, they could understand the setting was moving and parts of the audio were intended to sync with the music, which only served to increase the feeling of confidence in the subjects; I concluded that this increase in emotions was from an increase in immersion; with more city sfx being blended into the music of the scene, subjects were able to feel a deeper sense of immersion not just in the events of the scene but in the music as a part of the scene

I’d have to say that this is one of the most immersive films shown to subjects across the board, if not the most immersive; about half the subjects noted it to be the most immersive film out of the seven based on audio alone, and the other half ranked it in at least the top 3 in terms of sonic immersion

Dune 2 -

This was the first clip where most, if not all, of the subjects felt that the visuals were necessary for them to perceive the scene. Subjects #4 and #9 were barely able to make any part of the scene out, noting that it sounded like randomized noise to them with only a few semi-intelligible parts; the rest of the subjects were able to more accurately perceive these parts, like big explosions or the battle sounds towards the end of the clip, but barely any subject had a solid sense of what was happening and what was causing the noises/sound effects; subjects #10 and #2 were the only ones able to hear more effects in the audio, noting they could envision a similar setting (sandy, windy) in their mental image of the scene

Additionally, this was the first clip where intense music, sfx, and dialogue are all used in tandem; the scene has a lot going on in it, particularly in the middle portion (climax), and all of these sounds were unable to be grounded without visuals; subjects noted that, once they were able to understand and see where each sfx and voice came from and could feel the emotions of the music intertwined in the visuals, they were able to understand so much more and perceive so much more detail in the sound effects; a good example of this is in subject #4, who said their audio-only focus was mainly on the loud leaf noises and rattlesnake sfx, which don’t exist in the scene; the visual cues for these sfx were what made their perception and understanding possible, and so the visuals did most of the work of both telling the story and depicting its motion/change

This is also the first clip where subjects started to take note of sfx qualities, particularly regarding genre; for example, subjects #3 and #7 noted a medieval nature to the sounds of battle, and subject #3 even noted that these sfx contrast with the extra-terrestrial, sci-fi sfx throughout the entire clip; I suspect that this conflict in how subjects registered, recognized, and categorized the sound effects compared to one another led to a decrease in perceptive abilities; it also clearly led to an increase in stimulation, as this is one of the clips that subjects were most overstimulated by when given the audio alone; if anything, this is one of the most disengaging films with audio alone, but one of the most engaging ones with visuals (since they carry the weight of the scene)

Dunkirk -

For this film, subjects’ perception split in one of two ways; they were either focused mainly on the perception of the major sfx, like the plane noises, gunshots, bombs going off, etc. or they were focused on underlying sfx like the heartbeat throughout the entire clip or the minimal background dialogue/environment noises; those focused on the larger sfx found themselves much more overstimulated, leading me to believe that sfx that are too notable or too prominent can be disengaging rather than immersive to an audience; those who focused on the smaller sfx proved a building conclusion to me that small, detailed, and intentional sfx that have a deeper meaning can provide an audience with more immersion within the gravity of the scene and film (like how subjects #5 and #6 were able to connect the heartbeat to the tension of the main character and even relate through minor physical reactions)

For many subjects, the main factor perceived from audio was similar to A Quiet Place (AQP), being fear, but in many cases it was a much more alert and apprehensive fear than a chaotic, panicked one; I suspect this is because subjects’ perception of the scene was easier to explain and much more accurate (due to the wartime sounds that are easily identifiable) versus the unnatural, inhuman, and frantic tone of the sfx within AQP; for the most part, a lot of subjects were able to clearly envision a scene that accurately matched what was happening in the visuals, with the subjects who had greater perception (#5, #6, #7, and #9) having a deeper understanding of the connection between the layers of sfx

The visuals provided different missing pieces for different subjects; subject #1 had a hard time understanding the scene and how it was playing out, so the visuals provided them with almost all the information about the setting, the time that the scene was set in, and the emotions of the characters present; however, subjects #7 and #9 found that the visuals only served to help them perceive more sonic qualities of the sfx; I feel as though the visuals only provided each subject with what was proportionate to what they had already perceived; if the subject had a good perceptive understanding of the scene already, then the visuals didn’t add much themselves but added to the tension and weight of the audio; if the subject did not have a good perceptive understanding of the scene, then the visuals provided them with that information about the scene

Godzilla Minus One -

For each subject besides #8 and #9, the explosion was at the forefront of the subject’s perceptions; each subject noticed some different effect on their perception that influenced the emotions they were experiencing, some feeling more anxious and some feeling more stimulated. Some subjects, like #1 and #2, were able to pick apart differences in the high and low frequencies, noting a vibrant contrast between the low frequency of the explosion and the high frequency of the destruction/glass shattering after-effects. The subjects who were able to perceive these greater details in the sound design of the clip, as a whole, were far more likely to be fully immersed in the scene, with subject #1 even imagining they were inside the scene while listening to it.

The visuals from this scene provided subjects with nearly all of the necessary context for them to understand the major cause of the sfx within the scene; many subjects noted that the visuals were the main source of information, telling them about the setting, the characters, the cause of certain sonic events, and the aftermath of everything. The audio added a lot of scale to the explosion for subjects’ perception, as well as a ‘cool’ factor for subjects like #1 and #5. For other subjects, like #6, the visuals also provided a localization, explaining the setting on a grand scale (able to understand the setting is in a major city in Japan just from recognizing the Godzilla visual cues).

Since this was the shortest clip, there wasn’t much that I could interpret from the subjects’ perceptions that they hadn’t already stated themselves. For many of the subjects, not only was this the shortest clip but it was also one of the simplest, with all subjects apart from #8 and #9 understanding the general context of the clip from the audio alone.

No Country For Old Men -

With very few sonic elements present besides the conversation of the two characters, I was surprised by how far the subjects’ perceptions spread beyond the conversation; several of them, including subjects #1 and #5, were able to note environmental effects in the background, like the wind noise. Others, like #2 and #10, perceived eating and speaking noises (lips smacking, chewing, etc.) as much clearer and more emphasized than the clip depicts. While this wasn’t expected, it makes sense that less sonic stimulation via amount of noise leads subjects to turn their focus towards other sonic elements.

Despite Baby Driver being the overall most immersive film clip, I would have to say that the most immersive sound effect in particular was the coin flipping sfx from this scene. This was the most perceived sound effect across the board; every single subject said something about it and noted it as a pivotal part in the clip, which is exactly what it’s intended to be. Several subjects identified it as being the catalyst for the tension of the conversation; in subject #7’s own words, “[the coin flipping] went deeper than audio…it created extremely high stakes where none had existed before…” which added to subject #7’s and several other subjects’ discomfort.

The visuals served to add a decent amount of new perception and scene context, but it was different from subject to subject. Some found that the visuals increased their unease and tension/anxiety because they allowed them to see the tension vs. just hear it (subjects #2, #3, #5, #8). Other subjects, like #1, #4, and #7, felt a small decrease in stimulation with the visuals added in, largely because the emotions were heightened with the lack of visual context. It was slightly disengaging for these subjects after adding visuals in (almost made it boring in a way). For this group of subjects, it seems that the audio alone carried the necessary information and emotional context to make the film understood and almost surreal in a way; subject #4 even noted that they felt like they were standing inside the conversation.

With all of this perceptive information and with my conclusions for each film, I hope to spend more time analyzing the data to get observations and conclusions that I can solidly prove across the spectrum of all subject perceptions. While I have a few observations and conclusions that I believe I can draw from the data sets, I want to ensure there are no parts of the data that contradict these conclusions. If there are, I’ll need to take those into account when presenting my final conclusions at the Capital 2026 Research Symposium, which I will be applying for this coming week. Not only will this be a motivator for further research, but applying to present my advanced findings at the symposium will allow me more time to truly pick apart all the notable pieces of perception within the subjects’ responses.

Joe Bull
