Behind the Scenes: The Interactive Music Video Game built in HTML5
Music videos are evolving. What was once the domain of channels like MTV and VH1 is now being eclipsed by the sheer volume of music videos released and featured on the web. Unlike traditional media such as television, where one music video plays after another, music videos on the web compete with one another for attention. So how does an artist cut through the noise and get noticed, especially an up-and-coming star like Jasmine Villegas?
This is the challenge Jasmine and the team at Internet Explorer sought to solve. Her vision was to create an interactive experience that accomplished two things: 1) engage her existing fan base, and 2) help her reach new fans. To get this done right, Microsoft called in HTML5 experts at the creative agencies Digital Kitchen and Bradley and Montgomery to create justafriend.ie. The result is an experience that not only meets Jasmine’s objectives, but also showcases the power of HTML5 on the web.
Layered throughout the entire music video are scores of technical sparklers working to bring it to life. Whether you’re Jasmine Villegas’ BFF or just want to learn more about HTML5, here is more detail on how we brought Just A Friend to life online:
Setting-up the site with HTML5 standards
We started the project by making sure modern browsers would recognize and use HTML5 video instead of plug-ins. This was easy:
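A minimal sketch of that setup looks like the following. The file names and dimensions here are illustrative, not the project's actual assets; the point is that serving multiple `source` encodings lets each modern browser pick a format it can play natively, with a fallback for browsers without HTML5 video support.

```html
<!DOCTYPE html>
<video id="mainVideo" width="960" height="540" preload="auto">
  <!-- Browsers pick the first source they can decode -->
  <source src="video/justafriend.mp4" type="video/mp4" />
  <source src="video/justafriend.webm" type="video/webm" />
  <!-- Shown only when HTML5 video is unsupported -->
  <p>Your browser does not support HTML5 video.</p>
</video>
```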
Connect Facebook data to your web app
Facebook Connect is an integral component of JustaFriend.ie, as it helps draw users into the experience. When the video first starts, users are asked to connect to their Facebook account. This allows us to pull images from their Facebook account and project them directly into the video, and also lets us display their name contextually throughout the experience. As a result, Jasmine’s fans can feel part of the video, right alongside their favorite singer.
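As a sketch of what that integration involves: once the user grants access, their photos can be fetched from the public Facebook Graph API. The helper below only builds the request URL; the field list, limit, and token handling are illustrative assumptions, not the project's actual code. With the Facebook JS SDK loaded, the same request would typically go through `FB.api("/me/photos", callback)` after `FB.login()` succeeds.

```javascript
// Hypothetical helper: build a Graph API request for a user's photos.
// The endpoint shape follows the public Facebook Graph API; the field
// list and token are placeholders, not values from the project.
function buildPhotosRequest(userId, accessToken, limit) {
  return "https://graph.facebook.com/" + userId + "/photos" +
         "?fields=source,name" +
         "&limit=" + limit +
         "&access_token=" + encodeURIComponent(accessToken);
}
```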
Project images on an HTML5 canvas and video
Image projection in HTML5 is not a new concept. In fact, much of the early work on the project was based on what Steven Wittens provided at http://acko.net/files/projective-canvas/index.html. But canvas projection performed much better in IE9 than in other modern browsers in our tests, so the code had to be altered to work cross-platform.
A good example is the set of steps taken to get the scoreboard to display the user’s name pulled from Facebook Connect. To achieve this, the projection code was updated to work with a canvas buffer instead of an image. This allowed various elements to be composited to the buffer before it was projected. Starting with a blank PNG of the scoreboard, the names were drawn using the canvas drawing API. Another PNG was laid over the top to provide a slight glare. The whole piece was taken a step further by matching this projection to frames of video instead of just a motionless, flat surface.
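Underneath, canvas projection boils down to mapping each 2D point through a 3×3 homography (projective transform) matrix before drawing. The function below is a minimal sketch of that math only, not the project's projection code, which additionally composites the text and glare layers into a buffer canvas before projecting it frame by frame.

```javascript
// Map a 2D point through a 3x3 homography matrix H (row-major).
// This is the core math behind projecting a flat image onto a
// perspective surface in a canvas.
function applyHomography(H, x, y) {
  // Perspective divisor: points farther "into" the scene shrink.
  var w = H[2][0] * x + H[2][1] * y + H[2][2];
  return [
    (H[0][0] * x + H[0][1] * y + H[0][2]) / w,
    (H[1][0] * x + H[1][1] * y + H[1][2]) / w
  ];
}
```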
Layering of PNGs was also used in projecting the user’s Facebook photos onto Jasmine’s bedroom wall at the beginning of the video. PNGs provided shadow effects to make the photos blend seamlessly with the existing photos on the wall. This technique was again used on the photo on Jasmine’s desk at the end of the video. Here’s the final experience and the code that powered it.
Using this code source with the techniques described above will produce image projections that blend seamlessly into your project.
Source code for the Scoreboard scene is here: http://www.justafriend.ie/cdn/js/jv/scenes/scoreboardProjection.js
To learn more about developing with HTML5 canvas, try these links:
Pull video frames with pixels
HTML5 video is still an evolving technology. Although a lot of work has been done to ensure frame accuracy, it is still not perfect, so we had to hack around it with some help:
The HTML5 MediaElement API provides a getter for ‘currentTime’ that is supposedly accurate to a fraction of a second, but this does not always match the current frame being rendered to the screen. This is especially a problem when doing computationally expensive processing while the video is being rendered.
To get around this issue, the frame number was pulled out of the image using black and white pixels that had been burned into the video and placed just off screen. This is not the standard NTSC scan display you see in videos. Instead, it is an original bit of binary data created and burned into the video manually to keep everything correctly synced. While seemingly simplistic in nature, it is an incredibly useful technique used during the interactive moments in the video. Without it, the videos would not stay properly synced, and the overlays would not match their corresponding frames.
Use this code source to help keep video and interactive elements in sync. You can get the source for the "frame code reader" module here:
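The decoding step can be sketched as follows. This is an assumption-laden reconstruction of the idea, not the actual module: it takes an array of luminance samples (one per burned-in square, most significant bit first, as read back with `ctx.getImageData` after drawing the video frame to a canvas) and thresholds them into the bits of the frame number. The bit order and the threshold of 128 are illustrative choices.

```javascript
// Sketch of a "frame code reader": decode the frame number from the
// black/white squares burned into the video just off screen.
// `samples` holds one luminance value (0-255) per bit, MSB first.
function decodeFrameNumber(samples, threshold) {
  threshold = threshold || 128;
  var frame = 0;
  for (var i = 0; i < samples.length; i++) {
    // White square = 1 bit, black square = 0 bit.
    frame = (frame << 1) | (samples[i] > threshold ? 1 : 0);
  }
  return frame;
}
```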
Blowing up PNGs for the good of performance
Get the code sample here: http://haptic-data.com/toxiclibsjs/
In early testing, performance suffered when too many particles were rendered at once. The simplest solution was to create sprites with lots of transparency in Photoshop. Then, when the particle system is created, particles are generated only for those pixels that exceed a certain opacity threshold. Fading out and destroying the particles quickly improved performance further.
An early prototype can be seen here: http://www.justafriend.ie/cdn/dev/proto13b.html
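The opacity-threshold trick can be sketched like this. The function and names are illustrative, not the project's code: given the RGBA pixel data a canvas context returns for the sprite (via `ctx.getImageData(...).data`), it emits a particle position only for pixels whose alpha exceeds the threshold, so fully transparent areas cost nothing.

```javascript
// Sketch: spawn particles only from sufficiently opaque sprite pixels.
// `data` is flat RGBA pixel data; every 4th value is the alpha channel.
function particlesFromPixels(data, width, alphaThreshold) {
  var particles = [];
  for (var i = 0; i < data.length; i += 4) {
    if (data[i + 3] > alphaThreshold) {
      var pixel = i / 4;
      particles.push({ x: pixel % width, y: Math.floor(pixel / width) });
    }
  }
  return particles;
}
```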
Sequence and media loading
Throughout the experience, there are different outcomes based on user interaction. Because of these different outcomes, a lot of different videos had to be created. So, the media loader needed to work in a way that made the transitions look seamless.
Initially, we used Popcorn.js and the Popcorn.sequence module that had been used in the other IE9 project, Cut the Rope. However, we quickly discovered that this framework was overkill for this project and did not give the precision needed for sequencing. For starters, the sequence breaks when using non-integer in and out points, and the timed callbacks are not really in sync.
In the end, we created a system of scenes with frame-specific in and out points. On each frame, every scene that should be visible on that frame is rendered. Each scene can be thought of as an overlay, with the HTML5 video rendered in the background.
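The core of that scene system can be sketched in a few lines. The scene shape (`inFrame`/`outFrame`) is an assumption for illustration; on each tick, the current frame number selects which overlays get rendered on top of the video.

```javascript
// Sketch of the scene system: each scene is an overlay active for a
// frame range; on every frame, render whichever scenes are in range.
function visibleScenes(scenes, frame) {
  return scenes.filter(function (s) {
    return frame >= s.inFrame && frame <= s.outFrame;
  });
}
```

On each frame, the render loop would call `visibleScenes(allScenes, currentFrame)` and invoke each returned scene's draw routine over the background video.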
Each scene uses the custom media loader to preload the required assets at startup. There are various media loaders for HTML5, but none could be found that met the project’s specific needs. So, a custom media loader was created that works with RequireJS and supports images, video, and audio. The loader includes support for onComplete and onError callbacks as well as application-level status notifications. All requests are queued and limited to a certain number of simultaneous “threads” to avoid many of the most common HTTP pipelining issues in some browsers; fortunately, this is not an issue for modern browsers like Internet Explorer 9.
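The queueing half of such a loader can be sketched as follows. This is a minimal reconstruction of the idea, not the project's module (which also wires into RequireJS and fires onComplete/onError callbacks): each task is a function that receives a `done` callback, and at most `maxConcurrent` tasks run at once.

```javascript
// Sketch of a load queue that caps simultaneous requests.
function LoaderQueue(maxConcurrent) {
  this.maxConcurrent = maxConcurrent;
  this.active = 0;      // tasks currently in flight
  this.pending = [];    // tasks waiting for a free slot
}

LoaderQueue.prototype.add = function (task) {
  this.pending.push(task);
  this._next();
};

LoaderQueue.prototype._next = function () {
  var self = this;
  while (this.active < this.maxConcurrent && this.pending.length) {
    this.active++;
    var task = this.pending.shift();
    // Each task calls done() when its load finishes, freeing a slot.
    task(function () {
      self.active--;
      self._next();
    });
  }
};
```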
One issue was that HTML5 video clips would not fully preload, because modern browsers naturally load segments of video in a progressive manner. To ensure that videos are cached, they are loaded via Ajax-style XHR. The main video is the only one allowed to stream in the traditional sense; it is the one element that doesn’t change depending on the user’s interactions, and if it were preloaded, the initial load time at the start of the experience would be incredibly long. This combination of preloaded and progressively loading video elements eliminates traditional buffer times.
Adapt this code source to load your assets so that they transition seamlessly.
Source code for the media loader is here: http://www.justafriend.ie/cdn/js/jv/media.js
Dial Your Digits
One of the hidden elements within the experience is the use of the Tropo API, which lets you type in your phone number on a phone lying on the table at the end of the video. When you do, Jasmine calls the number you dialed and leaves one of six random voice messages. The API is easy to implement and adds a nice layer of surprise and delight for anyone going through the experience.
Wrapping it up
Thanks for reading this behind-the-scenes developer teardown of Jasmine Villegas’ Just A Friend.
To learn more about developing cross-browser code for modern browsers, start with MSDN: http://msdn.microsoft.com/ie
The code above uses the match method and a series of regular expressions to determine which event was raised. If the event name ends with a case-insensitive “down” or “start”, we begin our drag code. If it ends with a case-insensitive “move”, we perform the actual drag logic itself. And lastly, if it ends with a case-insensitive “up” or “end”, we end our dragging event. Note: other events may be caught here as well, such as onresizeend and keyup. Be sure to consider this in your project.
The above is an implementation of Ted Johnson’s solution in Handling Multi-touch and Mouse Input in All Browsers.
The drag logic itself initially relies upon the event.targetTouches TouchList. This member does not exist in Internet Explorer. The drag logic attempts to gather the pageX and pageY properties from the first item in the TouchList; in Internet Explorer, however, these values are found directly on the event object.
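A sketch of both pieces described above, in the spirit of Ted Johnson's cross-browser input pattern (function names here are illustrative): one helper classifies an event by its name suffix, and another reads coordinates from `targetTouches` where it exists, falling back to the event object itself for Internet Explorer.

```javascript
// Classify a raised event by its name suffix, case-insensitively.
function classifyInputEvent(type) {
  if (/(down|start)$/i.test(type)) return "start";
  if (/move$/i.test(type)) return "move";
  // Caution: unrelated events like onresizeend and keyup also match
  // these suffixes, so guard for that in a real handler.
  if (/(up|end)$/i.test(type)) return "end";
  return "other";
}

// Read drag coordinates: touch browsers expose them on the first
// TouchList item; Internet Explorer puts them on the event itself.
function getDragPoint(event) {
  var source = (event.targetTouches && event.targetTouches.length)
    ? event.targetTouches[0]
    : event;
  return { x: source.pageX, y: source.pageY };
}
```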