
Introduction

Our group of four began the assignment with the goal of creating a unique and educational experience using multiple types of hardware and software. The result was a system combining a p5 sketch with Arduino and Microsoft Kinect input to help students build musical skills by matching melodies.

With these general attributes in mind, the group solidified a loose plan for a game that could use some kind of hardware input and visual representation to teach students to associate audible music with visual cues. To collect inspiration for the gameplay and visualisation and form a basis for further discussion, we created a Miro mood board and a collection of sketches. With our direction confirmed, we then moved on to the planning and design phases.

  • GitHub

01/04/2023 - 20/04/2023

System Design
- Literature review -

Because our focus was education, we began the process with a literature review of relevant research that could inform our design. Since our experience was planned to be educational and to involve some kind of musical input and output, we investigated the domains of human-computer interaction, computer science, and psychology for sources that could illuminate the benefits or risks of this type of application in education.

Firstly, we investigated the potential benefits of gamifying education. Playing with buttons and digital bars emulating musical keyboards and musical bars helps children hold in mind multiple representations of the same musical concepts, a skill required for the development of symbolic thinking [1]. As Lee and Hammer note (cited in [2]), the nature of gamification allows for “resilience in the face of failure, by reframing failure as a necessary part of learning … Students, in turn, can learn to see failure as an opportunity, instead of becoming helpless, fearful or overwhelmed”. In the context of our experience, learning music can involve many mistakes – through practice, a student can hone their skills, but for some students that failure might feel overwhelming. In this way, gamification could help keep a student motivated in their studies.

As for non-standard hardware input being used in music creation and education, academic studies of HCI applications in music learning have found benefits here as well. Bevilacqua et al. found, when testing a wireless sensor interface for “conducting” music in a music theory course, that “students were highly motivated by the experiments… [and] immediately pointed out its creative potential” [3]. The study also found that the interface improved student understanding of concepts such as musical phrasing.

Educators especially seem to note the potential benefits of this kind of method for music education. In Sökezoğlu Atılgan and Gürman’s study [4], music educators interviewed about an Arduino-based touch music box, which could be customised for various musical outputs, noted the potential for the hardware to let students create their own learning style, potentially in a game-like form, and also to “[see] and [apply] the abstract concepts of music”.

Research in psychology and neuroscience has also shown that similar neural mechanisms and cognitive processes are involved in motor memory and music memory. For example, both involve activity in the cortex and basal ganglia, especially in the motor and auditory areas. In addition, both require a sense of timing and rhythm perception, as precise timing and coordination are necessary for playing musical instruments or performing physical movements [5, 6, 7, 8]. Therefore, in the early stages of education, engaging multiple sensory memories can help students better understand and master interlinked concepts.

System Design
- User story and requirements -

After affirming that gamification and intuitive motion inputs can improve the learning process for students, we began planning how to execute our idea for the educational music game. We first referred to the UX design process outlined by Hartson and Pyla in The UX Book [9]. As our general focus had been decided, we were at the stage of the UX design lifecycle where we needed to understand our users’ work and needs.

Firstly, we each considered who we imagined would want to use, or could benefit from, this type of experience, and came together to create a user story describing the minimum viable product (the smallest unit of desired functionality) for a representative user. We then extrapolated user requirements from this description according to Hartson and Pyla’s UX design guidelines; sketches of our initial ideas for these interactions are included in Appendix B. Learning to play an instrument and read music at a young age is a great opportunity, but it can also be dull, or even difficult and confusing. The leap from the audible sound of music to its abstract representation on the page can be a large one for anyone, especially children. In a classroom or personal setting, the experience our group brainstormed could ideally be used by young people to play and create music freely while seeing how that music is visualised, either as an arc following the melody or as notes on sheet music, helping to bridge that gap.

 

With this user in mind, the team defined the following user requirements that could be further developed into features for the final experience.


Playing music

The user should be able to use some kind of input apparatus to “play” music. The user input should come with a corresponding audio output.


Visualising music

The music played by the user should be displayed visually in some format.


Playing back music

The music played by the user should be able to be played back and listened to again, ideally with the corresponding visual representation.


Gamification

Given that the target audience is young people or children who are learning to play an instrument and/or read music, an element of gamification could enhance the educational process by making it enjoyable.


Competition, or Co-operative Play

To reinforce learning outcomes, collaborative or competitive elements involving another user could be added.

To meet these requirements, the group considered each of the hardware and software choices made available for the practical. For the visual element of the experience, we considered using LED lights to represent the sound waveforms of the music, but since we had not yet decided on a final visual direction and wanted something that could flexibly accommodate any idea, the team chose p5 for the software portion of the experience, showing visuals on a computer screen.

In the same vein, the group also considered which hardware options they should use for user input and output. Our theoretical user was a young person who is beginning their music studies, so an assumption could be made that they do not know how to play an instrument yet. In this sense, the group thought that an input method that was abstract rather than a reproduction of a specific instrument would be better for young users, as well as possibly being more accessible for users of differing abilities. This led the group to the Microsoft Kinect, which allows user input with larger-scale bodily movement. They also opted to request an Arduino Uno as another input method, potentially with keys. With the use of these two hardware inputs, the collaborative/competitive element considered earlier in the planning process could potentially be implemented in a dynamic and interesting way.

Feature Design and Implementation
- Overall Architecture -
Figure 1: Overall system architecture.

With the hardware and software choices solidified, the group began to design the fine details of the experience: how it would work, what it would look like, and how the experience would flow. We aimed to develop a game system for music education that supported early note recognition. To simulate the input of real piano keys, we decided to build a simple electronic piano using Arduino, allowing for the input of eight basic notes.

In addition, we decided to use the Kinect as a form of motion control, allowing notes to be played based on the height of a user's right hand and gestures. As a central component, we decided upon a p5.js-based sketch [10] wherein the two inputs would be queried as necessary – so that visual feedback could be provided and differences in inputs from the two sources could be calculated.

With all of this, we now had a good idea of the basic architecture for our system (shown in Figure 1) – enabling us to work on and design the individual sub-components. During this process, we considered several standard interaction design guidelines popularized by Don Norman [11] and Ben Shneiderman's golden rules for interface design [12] to make informed design decisions that would facilitate an engaging user experience.

Feature Design and Implementation
- Software: p5.js Sketch -

As the core visual interface of the overall system, a lot of consideration was put into the design of the p5 web interface and the underlying code structure to facilitate a meaningful experience for users.

At a high level, the flow of the game was controlled by a single variable, gamePhase, that was checked in each draw() iteration to determine what type of screen to draw onto the canvas, with the variable being updated during key events and transitions within the game itself. This allowed for enhanced modularity in the codebase, enabling individual methods (e.g. drawMenu(), drawOptions()) to be written for each phase screen whilst ensuring that the phases were isolated from one another, in terms of what was drawn, through a single switch case. Alongside this, as raw HTML elements were also being created in conjunction with the p5 canvas, handling the visibility of these elements was of the utmost importance. Primarily, this consisted of segmenting each screen’s UI into individual divs and setting the display attribute as necessary to ensure that only one UI would be visible at a time.
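As a minimal sketch of this pattern (the phase names, div ids, and helper methods here are illustrative rather than our exact identifiers):

```javascript
let gamePhase = 'menu'; // updated during key events/transitions

function setup() {
  createCanvas(800, 600);
}

function draw() {
  background(250);
  // A single switch keeps each phase's drawing logic isolated
  switch (gamePhase) {
    case 'menu':    drawMenu();    break;
    case 'options': drawOptions(); break;
    case 'game':    drawGame();    break;
  }
}

// Show only the div of HTML UI elements belonging to the new phase
function setPhase(phase) {
  gamePhase = phase;
  for (const div of document.querySelectorAll('.screen-ui')) {
    div.style.display = (div.id === phase + '-ui') ? 'block' : 'none';
  }
}

function drawMenu()    { /* menu visuals */ }
function drawOptions() { /* options visuals */ }
function drawGame()    { /* stave, notes, beat line */ }
```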

From an HCI perspective, this helped ensure that the screen was never cluttered with off-screen elements and would only ever contain relevant buttons. Of course, when discussing UI, another important factor to consider is the overall visual design of the interface components. Given the educational nature of the tool, we chose to follow Norman’s design principle of visibility with a streamlined and minimalist design, and menu buttons within the UI were given a consistent, childlike aesthetic with bright, highly visible colouring and rounded corners. Text within the buttons was also made short, simple, and directed so as not to overload a potential user. Additionally, keeping the UI aesthetically pleasing would contribute to the aesthetic-usability effect [13] and potentially further the engagement of the game.

Another critical component of the p5 implementation was the input handling. Given the lack of duplicate input sensors, initial sketches utilised the mouse as the primary navigation device. As the system grew, the other input sensors were built in to work as alternatives to the mouse via the dispatching of mouse events. As such, users were able to use a multitude of input methods in an analogous fashion, increasing user control and freedom. In a similar vein, an options menu was implemented within the game to further the idea of user freedom through the customisation of constrained options such as the tempo, or whether helper text should be displayed. Navigation in the sketch was also handled in a unique fashion, done solely by the Kinect or mouse. As it has been noted that the Kinect is less reliable in movement than a mouse [14], it was decided that users could either hover over a button for a set amount of time to select it or click it by closing their fist. With this, more experienced users could use the shortcut of clicking, whilst novice users could intuitively learn how to navigate the screens through the informative feedback provided by an outer arc filling up around the cursor when hovering over a button.
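A dwell-based selection of this kind can be sketched in p5 roughly as follows; the 1.5-second dwell time is an assumed value, and the sketch assumes the canvas sits at the page origin so that p5 and DOM coordinates align:

```javascript
const DWELL_MS = 1500; // assumed dwell time before a hover triggers a press
let hoverStart = null; // when the cursor entered the current button

// Call once per draw() for each on-screen button (a p5.Element)
function drawDwellArc(btn) {
  if (isOverButton(btn)) {
    if (hoverStart === null) hoverStart = millis();
    const progress = min((millis() - hoverStart) / DWELL_MS, 1);
    // Informative feedback: an outer arc fills up around the cursor
    noFill();
    stroke(50, 150, 255);
    strokeWeight(4);
    arc(mouseX, mouseY, 40, 40, -HALF_PI, -HALF_PI + progress * TWO_PI);
    if (progress >= 1) {
      btn.elt.dispatchEvent(new MouseEvent('click', { bubbles: true }));
      hoverStart = null;
    }
  } else {
    hoverStart = null;
  }
}

function isOverButton(btn) {
  const r = btn.elt.getBoundingClientRect();
  return mouseX >= r.left && mouseX <= r.right &&
         mouseY >= r.top  && mouseY <= r.bottom;
}
```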

Prior to any gameplay, users would be provided with a tutorial screen detailing how the flow of the game worked – as the closest form of documentation that could be provided. This tutorial could be optionally turned off in the options, so that expert users would not need to see the same screen repeatedly.

Within the core gameplay screen, users were presented with a 5-line stave and a treble clef, familiar symbols that would utilise a user’s recognition over recall. Over time, a line would move horizontally across the stave for 8 beats, and users could play notes to add to the stave at the line’s current position, an intuitive mapping of how sheet music is played (left to right). Furthermore, if the Kinect was being used as input, the potential note that would be played if the user closed their fist was also displayed, informing the user of what would happen.
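The mappings involved can be illustrated roughly as below (the stave coordinates and note range are illustrative constants, not our exact layout values):

```javascript
const BEATS = 8;                                          // beats per recording pass
const NOTES = ['C4','D4','E4','F4','G4','A4','B4','C5'];  // illustrative range
const STAVE_LEFT = 100, STAVE_RIGHT = 700;                // stave x extent
const STAVE_TOP = 150, STAVE_BOTTOM = 300;                // stave y extent

// The moving line progresses left-to-right, mirroring how sheet music is read
function beatToX(beat) {
  return map(beat, 0, BEATS, STAVE_LEFT, STAVE_RIGHT);
}

// Higher notes sit higher on the stave
function noteToY(noteIndex) {
  return map(noteIndex, 0, NOTES.length - 1, STAVE_BOTTOM, STAVE_TOP);
}

// For the Kinect, the potential note follows the height of the right hand
function handToNoteIndex(handY) {
  const i = map(handY, height, 0, 0, NOTES.length - 1);
  return round(constrain(i, 0, NOTES.length - 1));
}
```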

Besides showing the potential note for the Kinect, notes being played were drawn in real time on the canvas as input from either source was recognised (provided it was that source’s turn), providing informative feedback to users. This was complemented by the p5.sound library [15] playing the note in sync with the user’s input. Notes were played through an envelope with defined attack and release speeds, reducing the robotic sound of a basic sine oscillator and facilitating a more welcoming experience.
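A minimal sketch of this audio setup with p5.sound might look as follows (the envelope timings shown are illustrative rather than our tuned values):

```javascript
let osc, env;

function setup() {
  createCanvas(400, 200);
  osc = new p5.Oscillator('sine');
  env = new p5.Envelope();
  // Gentle attack and release soften the raw sine tone
  env.setADSR(0.05, 0.1, 0.5, 0.4); // attack, decay, sustain ratio, release (s)
  env.setRange(0.8, 0);             // attack level, release level
  osc.amp(0);                       // silent until the envelope fires
  osc.start();
}

function playNote(midiNote) {
  osc.freq(midiToFreq(midiNote)); // e.g. 60 corresponds to C4
  env.play(osc);                  // trigger the full attack/release cycle
}
```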

Once a melody had been recorded, users would be directed to a so-called ‘retry’ screen. From here, they could re-record their melody, a feature permitting the easy reversal of actions to undo user mistakes. Alternatively, the user could play the recorded melody to be provided with further informative feedback about the current state of the game. Finally, the last option in the menu allowed a user to use the melody recorded as the one the second player would need to match.

As a small aside, during the phases when a melody would be recorded (either to match the previous melody or create a new one), a short ready countdown was implemented. This was primarily done to reduce stress on the user and provide them with time to adjust to the new game state before having to play notes, substantially reducing their cognitive load. Next, after both users had finished their recording phases, an accuracy score would be calculated and drawn to the screen. This signalled the end of a game, and as such included text to indicate closure (“Thanks for playing!”) and a button to return to the main menu. Finally, it should be noted that several constraints were imposed in the game, such as limiting the length of a melody to two bars and only allowing the notes C4-E5. This was a conscious decision to convert the hardware’s constraints, in both size and power, into advantages in software, as per Norman’s guidelines.
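Our exact scoring formula is not reproduced here, but a simple per-beat comparison along the following lines captures the idea (the one-note-or-null-per-beat representation is an assumption for illustration):

```javascript
// Hypothetical scoring: the percentage of beats on which player 2's note
// matches player 1's recorded note (null meaning no note on that beat)
function accuracyScore(melody1, melody2) {
  let matches = 0;
  for (let beat = 0; beat < melody1.length; beat++) {
    if (melody1[beat] === melody2[beat]) matches++;
  }
  return round((matches / melody1.length) * 100);
}

// e.g. accuracyScore(['C4', null, 'E4', 'G4'], ['C4', null, 'E4', 'A4']) -> 75
```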

Feature Design and Implementation
- Hardware: Arduino -

Given the potential of the Arduino as a stand-in musical instrument, we decided to construct it as a button-based piano analogue. Initially, eight buttons were connected in a series circuit to the A0 pin. Through this, the Arduino could determine which button was pressed by reading the voltage value, and the press duration from changes in that voltage, providing both the note and its duration as the input required by the p5 sketch.

To connect the Arduino to p5, the p5.serialport library [16] and the p5.serialcontrol app [17] were used in parallel. As a p5 sketch running in the browser cannot access serial ports directly, p5.serialcontrol enables this connection through a WebSocket, as shown in Figure 2. With this, the Arduino and the p5 sketch could communicate effectively, allowing note presses to be recorded and displayed in the sketch window.
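On the p5 side, this connection boils down to a few p5.serialport calls; the "note,duration" line format below is an assumed protocol for illustration:

```javascript
let serial;

function setup() {
  createCanvas(400, 200);
  serial = new p5.SerialPort();          // talks to p5.serialcontrol's WebSocket
  serial.open('/dev/tty.usbmodem14101'); // port name varies per machine
  serial.on('data', gotSerialData);
}

// Assumed wire format: one "note,duration" line per button press
function gotSerialData() {
  const line = serial.readLine().trim();
  if (!line) return;
  const [note, duration] = line.split(',').map(Number);
  if (!isNaN(note)) recordNote(note, duration);
}

function recordNote(note, duration) {
  /* store the note and draw it on the stave */
}
```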

Although the provided feedback of drawing a note in p5 on input is intuitive to a user viewing the screen, it may not be prompt, as it requires a shift of focus from the Arduino to the computer screen, and the timing of feedback is known to be highly important [18]. In addition, it has been demonstrated that immediate feedback is more effective for immediate error correction during tasks [19].

Thus, it was decided that LEDs could be added to the Arduino's breadboard as a form of more direct visual feedback, enabling faster verification of incorrect input. During development, we discussed several alternatives to this direct feedback. Before adopting the idea of LEDs, we considered making the buttons themselves move in response to gestural inputs, but the bouncing of the keyboard would not be as intuitive as LEDs due to the button size. According to Fitts' law [20], the intervals and sizes of changes in position affect people's perception and judgment of an object. The brightness changes of an LED, being smaller and more distinct, are more easily noticeable than subtle and delayed mechanical changes, and are also much simpler to implement. Glowing LEDs enhanced the visibility of feedback while indicating to users whether a signal had been registered, thus improving the system's controllability. LED colouring was also made consistent, making the system easier to understand and predict. With this, the LEDs allowed feedback to be timely, clear, noticeable, and understandable.

In addition to the Arduino providing input to p5, it was also used to handle output when a melody was being replayed after the first player had recorded it. This was primarily done to keep the visual feedback through LEDs consistent at key points of the game: as either the Kinect or the Arduino could be player 1, replaying the melody should provide feedback for both players. To implement this, p5 would send a note along with its duration to the Arduino, which would light up the corresponding LED for that length of time.
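The replay direction is then just a write the other way (again assuming the same illustrative message format):

```javascript
// During playback, send each note and its duration so the Arduino can
// light the corresponding LED for that length of time
function replayNoteOnArduino(note, durationMs) {
  serial.write(note + ',' + durationMs + '\n');
}
```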

Feature Design and Implementation
- Hardware: Kinect -

The other input device chosen for the system was the Microsoft Kinect, which would capture body motion. It was decided that this motion would map to x and y coordinates on the screen as a pointer analogue, as this natural movement mapping is known to facilitate more engaging user experiences [21].

In terms of the physical implementation, several iterations were developed prior to the final integration. Firstly, a rather trivial approach was attempted through the KinectV2MouseControl software [22], allowing the Kinect to act as a stand-in for a mouse pointer, which worked well with the p5 sketch that had been built to readily accept mouse input. However, it was quickly noticed that the software was not especially reliable at recognising gestures, and as it also communicated over serial, it required the Arduino and the Kinect to be connected to the core laptop, which cluttered up the space considerably.

As such, a webserver-based approach was taken, which would allow for refined control of the Kinect output as well as remove the restriction of the Kinect having to be in the same room or location as the p5 sketch and Arduino. This was facilitated by the Kinectron software [23], which automated the process of setting up and broadcasting the Kinect input to any connected clients. Through this, the p5 sketch could connect to the Kinect through a web socket and register a callback function (trackSkeleton()) that was invoked every time motion information was received.
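Setting this up with the Kinectron client library looks roughly like the following; the address is whatever the Kinectron server application displays, and updateCursor() is a stand-in for the handling code sketched below:

```javascript
let kinectron;

function setup() {
  createCanvas(800, 600);
  kinectron = new Kinectron('127.0.0.1'); // address shown by the Kinectron app
  kinectron.makeConnection();
  // trackSkeleton() is invoked for each frame of tracked-body data
  kinectron.startTrackedBodies(trackSkeleton);
}

function trackSkeleton(body) {
  const hand = body.joints[kinectron.HANDRIGHT];
  // Joint coordinates arrive normalised to 0-1, so scale to the canvas
  updateCursor(hand.depthX * width, hand.depthY * height, body.rightHandState);
}
```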

Using the provided information to control the mouse cursor was a trivial process of simply updating the mouseX and mouseY variables. It was noted during testing that the Kinect can be slightly finicky with the returned hand coordinates, so some simple smoothing calculations were applied in the callback function to reduce erratic movement of the cursor and support the user's internal locus of control.
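The smoothing itself can be as simple as interpolating part-way toward each new reading rather than jumping to it (the smoothing factor is an illustrative value):

```javascript
const SMOOTHING = 0.3; // lower = smoother but laggier cursor

function updateCursor(rawX, rawY, handState) {
  // Move the p5 cursor variables a fraction of the way to the new reading,
  // filtering out the Kinect's frame-to-frame jitter
  mouseX = lerp(mouseX, rawX, SMOOTHING);
  mouseY = lerp(mouseY, rawY, SMOOTHING);
  handleHandState(handState);
}
```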

Mouse events such as clicking and hovering were handled simply through checks on the rightHandState variable passed from the Kinectron web server. Through this, mouse events could be dispatched for clicking and releasing based on whether the right hand of the user was closed in a fist or open.

 

Additionally, for the hover navigation a variable was used to constantly store the element that the hand was currently over. If that ever changed, a mouseover event would be dispatched to the new element, and a mouseout to the previous element – enabling the implemented hover functionality to work seamlessly with the Kinect.
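Pulling the event dispatching together, a sketch along these lines covers both behaviours; the hand-state codes follow the Kinect SDK convention, and we again assume p5 and DOM coordinates align:

```javascript
const HAND_OPEN = 2, HAND_CLOSED = 3; // Kinect SDK hand-state codes
let fistClosed = false;
let hoveredElement = null;

function handleHandState(state) {
  // A closing fist presses the "mouse button"; opening it releases and clicks
  if (state === HAND_CLOSED && !fistClosed) {
    fistClosed = true;
    dispatchAtCursor('mousedown');
  } else if (state === HAND_OPEN && fistClosed) {
    fistClosed = false;
    dispatchAtCursor('mouseup');
    dispatchAtCursor('click');
  }

  // Track the element under the hand so hover feedback transfers correctly
  const el = document.elementFromPoint(mouseX, mouseY);
  if (el !== hoveredElement) {
    if (hoveredElement) {
      hoveredElement.dispatchEvent(new MouseEvent('mouseout', { bubbles: true }));
    }
    if (el) {
      el.dispatchEvent(new MouseEvent('mouseover', { bubbles: true }));
    }
    hoveredElement = el;
  }
}

function dispatchAtCursor(type) {
  const el = document.elementFromPoint(mouseX, mouseY);
  if (el) {
    el.dispatchEvent(new MouseEvent(type, {
      bubbles: true, clientX: mouseX, clientY: mouseY
    }));
  }
}
```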

Evaluation

Regarding the overall user experience, the game offers deep immersion, engaging multiple senses such as hearing, touch, and sight. Users coordinate their eyes and hands, boosting motivation and focus, resulting in a captivating experience. Despite involving multiple senses, the game remains user-friendly, requiring simple operations for successful gameplay.

 

In terms of design, all user requirements outlined in the stories were met. The game aims to teach children basic musical concepts and foster interest in music and instruments. This was achieved through vibrant webpages, keyboard-like buttons, bright LEDs, and motion capture devices.

 

However, the system has some room for improvement. One issue observed with Kinect navigation is occasional cursor erraticism. While smoothing functionality was implemented, it could be refined further, possibly with Kinect Studio's Visual Gesture Builder. Additionally, the need to manually adjust the Kinect's position for users of different heights could be addressed with a calibration phase accessible from the options menu.

 

On the hardware side, the Arduino effectively simulates a traditional piano but could better handle simultaneous button presses. Presently, lower voltages take precedence, potentially overriding buttons to the left. Assigning individual pins to the buttons on a larger board such as the Arduino Mega could resolve this.

 

Overall, the implemented system meets user story requirements, providing an engaging user experience.

Conclusion

Ultimately, although it is only a proof-of-concept at this stage, we believe we were able to produce a dynamic, interactive, and educational experience that could be used to enhance a student’s musical skills while also allowing them to enjoy playing it with another person.

Over the past few weeks, we have been able to utilise HCI methodologies to fully consider our potential use cases and design a system that would result in a meaningful experience. Our final software demonstrates a prototype that combines and uses two hardware input sources in an innovative fashion, with design decisions justified based on existing literature and systems.

While there are minor improvements that could be made, interaction with the system is smooth and intuitive, and manipulates content in a logical way. Through this, we believe we have constructed a system that can be used in the education domain as a tool to enhance students’ memory and musical ability, with its use of natural mappings justified through existing research in psychology. The competitive elements in the game could also increase motivation to play repeatedly, which could lead to better learning outcomes.

Future Work/Extensions

01

2-player Kinect mode

Currently the way we use the Kinect requires that only one person be in view of the camera or else it can get confused and the user input becomes erratic. If we were to add a full-fledged two player mode to the experience, we could plan for this sort of input and calibrate and program the Kinect to utilise its skeleton recognition functionality to capture the movements of two players. This could enable a myriad of new competitive or cooperative modes.

02

Variable sound output

The game is currently configured to play frequencies corresponding to musical tones through the browser on user input, and the resulting tones sound artificial. The output could be modified to use external sound files to play real instrument sounds, and the user could potentially select which instrument they would like to hear when playing the game.

03

Playing against pre-recorded music tracks

The game is currently based entirely around user input, but it could be modified to pull in and play pre-recorded external music files and the user could be prompted to match the music with their input. This could potentially enhance the single-player experience and increase long-term interest in using the game to practise.

04

Saving recorded melodies

After playing through the game cycle, the game could save the user’s input to an MP4 file so that it could be played back and analysed as necessary by players or teachers.

05

Variable melody length

The game currently only allows for 2 bars of music and the printed notes are simple dots. To allow for more advanced song composition, music practice and gameplay, we could allow the user to freely configure the length of their game cycle themselves, extending the recording to the desired length of time.

06

Long distance competitive play

Kinectron has the functionality to broadcast Kinect data through a public address, which could enable the Kinect to be in a completely different location from the p5 sketch and Arduino. Combined with inter-sketch communication in p5, this could allow two players in different countries to play with one another.

07

More notes/flats/sharps

Currently, we only allow notes in the C4-C5 scale plus a few additional higher notes, but the game could be modified to include any type of input, including a full range of scales and flat/sharp notes.

Demo Video
References

[1] DeLoache, J. S. (1987). Rapid Change in the Symbolic Functioning of Very Young Children. In Science (Vol. 238, Issue 4833, pp. 1556–1557). American Association for the Advancement of Science (AAAS). doi: https://doi.org/10.1126/science.2446392 

[2] Zeybek, N., & Saygı, E. (2023). Gamification in Education: Why, Where, When, and How?—A Systematic Review. Games and Culture, 0(0). doi: https://doi.org/10.1177/15554120231158625 

[3] Bevilacqua, F., Guédy, F., Schnell, N., Fléty, E., & Leroy, N. (2007). Wireless sensor interface and gesture-follower for music pedagogy. In Proceedings of the 7th international conference on New interfaces for musical expression - NIME ’07. the 7th international conference. ACM Press. doi: https://doi.org/10.1145/1279740.1279762 

[4] Sökezoğlu Atılgan, D., & Gürman, Ü. (2020). Material Design in Music Education Using Arduino Platform. In Journal of Qualitative Research in Education (Vol. 8, Issue 4, pp. 1–26). Ani Publishing and Consulting Company. doi: https://doi.org/10.14689/issn.2148-2624.8c.4s.14m 

[5] Starr, A., & Phillips, L. (1970). Verbal and motor memory in the amnestic syndrome. In Neuropsychologia (Vol. 8, Issue 1, pp. 75–88). doi: https://doi.org/10.1016/0028-3932(70)90027-8

[6] Imreh, G., & Chaffin, R. (1997). Understanding and developing musical memory: The views of a concert pianist and a cognitive psychologist. The American Music Teacher, 46(3), 20-24.

[7] Grahn, J. A., & Brett, M. (2007). Rhythm and Beat Perception in Motor Areas of the Brain. In Journal of Cognitive Neuroscience (Vol. 19, Issue 5, pp. 893–906). MIT Press - Journals. doi: https://doi.org/10.1162/jocn.2007.19.5.893 

[8] Janata, P. (2015). Neural basis of music perception. In The Human Auditory System - Fundamental Organization and Clinical Disorders (pp. 187–205).

[9] Hartson, H. R., & Pyla, P. S. (2019). The UX Book: Agile UX design for a quality user experience.

[10] McCarthy, L.L. and The Processing Foundation (2022). p5.js. url: https://p5js.org/ 

[11] Norman, D. A. (2013). The Design of Everyday Things. doi: https://doi.org/10.5555/2187809

[12] Shneiderman, B. (1987). Designing the User Interface: Strategies for Effective Human-Computer Interaction. Addison-Wesley.

[13] Kurosu, M., & Kashimura, K. (1995). Apparent usability vs. inherent usability. In Conference Companion on Human Factors in Computing Systems (CHI ’95). ACM Press. doi: https://doi.org/10.1145/223355.223680

[14] Pino, A., Tzemis, E., Ioannou, N., & Kouroupetroglou, G. (2013). Using Kinect for 2D and 3D pointing tasks: Performance evaluation. In M. Kurosu (Ed.), Human-Computer Interaction: Interaction Modalities and Techniques (HCI 2013), Lecture Notes in Computer Science, vol. 8007. Springer, Berlin, Heidelberg. doi: https://doi.org/10.1007/978-3-642-39330-3_38

[15] Sigal, J. (2021). p5.sound. url: https://github.com/processing/p5.js-sound 

[16] Montoya-Moraga, A., & Endoh, K. (2022). p5.serialport. url: https://github.com/p5-serial/p5.serialport

[17] Montoya-Moraga, A., & Van Every, S. (2021). p5.serialserver. url: https://github.com/p5-serial/p5.serialserver

[18] Hattie, J., & Timperley, H. (2007). The power of feedback. Review of educational research, 77(1), 81-112. doi: https://doi.org/10.3102/003465430298487 

[19] Clariana, R.B., Wagner, D. & Roher Murphy, L.C. Applying a connectionist description of feedback timing. ETR&D 48, 5–22 (2000). doi: https://doi.org/10.1007/BF02319855 

[20] Fitts, P. M. (1954). The information capacity of the human motor system in controlling the amplitude of movement. Journal of Experimental Psychology, 47(6), 381–391. American Psychological Association (APA). doi: https://doi.org/10.1037/h0055392

[21] McEwan, M. W., Blackler, A. L., Johnson, D. M., & Wyeth, P. A. (2014). Natural mapping and intuitive interaction in videogames. In Proceedings of the First ACM SIGCHI Annual Symposium on Computer-Human Interaction in Play (CHI PLAY ’14). ACM. doi: https://doi.org/10.1145/2658537.2658541

[22] Chen, J. (2018). KinectV2MouseControl. url: https://github.com/TangoChen/KinectV2MouseControl 

[23] Jamhoury, L. & Van Every, S. (2021). Kinectron. url: https://github.com/kinectron/kinectron
