Wordament, part III
Running a business
By: Sidney Higa and Walter Poupore
Recap of Wordament parts I and II
In Wordament part I and part II, we introduced Jason Cahill and John Thornton and their smash game Wordament. We took a tour of the architecture of the game and then examined the backend processes for lessons that can be taken away. They started out building Wordament for fun. As its popularity grew, they responded by increasing its capacity and quality. In the process, they learned other valuable lessons.
Thousands of players are connected and playing at any given moment. How does the Wordament architecture handle that load?
The cache in the aggregator is essential for performance. It contains the game results of hundreds of files, as well as leaderboard data, and any other data.
Chances are good that a request for data can be filled from the cache, which is much quicker than querying permanent storage.
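The cache-first lookup described above can be sketched as follows. This is a minimal illustration, not the Wordament aggregator code: the class name, the key format, and the slow_storage_lookup function are all hypothetical stand-ins.

```python
from typing import Callable, Dict


class ReadThroughCache:
    """Serve reads from memory and fall back to slower storage on a miss."""

    def __init__(self, load_from_storage: Callable[[str], str]) -> None:
        self._load = load_from_storage  # hypothetical slow lookup
        self._items: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> str:
        # Check the in-memory cache first.
        if key in self._items:
            self.hits += 1
            return self._items[key]
        # On a miss, query storage once and remember the answer.
        self.misses += 1
        value = self._load(key)
        self._items[key] = value
        return value


def slow_storage_lookup(key: str) -> str:
    # Stand-in for a query against permanent storage (assumption).
    return f"results-for-{key}"


cache = ReadThroughCache(slow_storage_lookup)
first = cache.get("game-001")   # miss: goes to storage
second = cache.get("game-001")  # hit: served from memory
```

The second request never touches storage, which is the effect the aggregator cache is relied on to provide.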
Another strategy for quick data retrieval is to use the PartitionKey of Azure tables. Every table has a partition key that can be used to break down a huge dataset into much smaller chunks. For example, you could use the string “Azure” as a partition key. Any record in the table that begins with “Azure” would then immediately be placed in a block with other such records, so “Azure color” and “Azure ocean” would both end up in the same partition. Wordament applies this idea to names: for every name, it creates 16 partition keys. If a person’s name is “Brookshire,” then the first few partition keys look like the diagram in the PartitionKey graphic.
The unique ID consists of both the partition key and a row key. Under each partition are the rows that are identified with a RowKey. In Wordament's design, when only a few characters of a name have been entered, the results come from a partition that contains many rows, because many names match a small number of characters. As more characters are added to the name, the results come from different partitions that contain fewer rows, since fewer names start with the longer set of characters.
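One plausible reading of this scheme is that each partition key is a leading prefix of the name. The sketch below works under that assumption. The PartitionKey graphic is not reproduced here, so the exact key format is a guess, and both function names are hypothetical. In this sketch a name shorter than 16 characters simply yields fewer keys.

```python
from typing import List

MAX_PARTITION_KEYS = 16  # from the article: 16 partition keys per name


def prefix_partition_keys(name: str, limit: int = MAX_PARTITION_KEYS) -> List[str]:
    """Return the leading prefixes of a name, shortest first (assumed scheme)."""
    longest = min(len(name), limit)
    return [name[:n] for n in range(1, longest + 1)]


def partition_for_query(typed_so_far: str, limit: int = MAX_PARTITION_KEYS) -> str:
    """Pick the partition to search for the characters typed so far."""
    return typed_so_far[:limit]


keys = prefix_partition_keys("Brookshire")
short_query = partition_for_query("Br")       # broad partition, many rows
long_query = partition_for_query("Brooksh")   # narrow partition, fewer rows
```

Under this assumption, a name is stored once per prefix, and a search for the characters typed so far reads a single partition instead of scanning the whole table.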
Real-time testing with synthetic transactions
Testing is critical to any ongoing concern. Wordament relies on synthetic transactions: a simulated player downloads a game, uploads faked results after 2 minutes, and finally receives the new leaderboard results and a new game. Synthetic transactions are enabled by a command-line application that constantly "plays" Wordament games at the same time the rest of the Internet is playing, by calling the same APIs used by the Wordament client apps. There’s a synthetic transaction playing in each instance of a Game Room (GR). After a game is played, the results for all instances of the GR are compared.
If the results are not consistent across a GR, that is an indication that an instance of the GR has an issue. Any issues are raised to the Wordament team by an email (the machine running the synthetic transactions has an email account).
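The comparison step can be sketched as below. The article does not describe how the results are compared, so this grouping approach, the instance names, and the result strings are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def group_instances_by_result(results_by_instance: Dict[str, str]) -> Dict[str, List[str]]:
    """Group instance names by the result each one returned."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for instance, result in results_by_instance.items():
        groups[result].append(instance)
    return dict(groups)


def is_consistent(results_by_instance: Dict[str, str]) -> bool:
    """True when every instance returned the same result."""
    return len(set(results_by_instance.values())) <= 1


# Hypothetical results collected from three instances of one Game Room.
observed = {
    "GR_IN_0": "leaderboard-a",
    "GR_IN_1": "leaderboard-a",
    "GR_IN_2": "leaderboard-b",
}
consistent = is_consistent(observed)
groups = group_instances_by_result(observed)
```

When is_consistent returns False, the groups show which instances disagree, which is the kind of detail an alert email could carry.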
To allow the synthetic transaction application to access each GR instance, the team relies on Azure instance input endpoints, which are designated using the <InstanceInputEndpoint> element in the worker role’s XML configuration file. An input endpoint within an Azure role enables communications from outside of Azure. A regular input endpoint communicates with the load balancer, whereas an instance input endpoint communicates with a specific instance. As mentioned, the team wants to monitor each instance and then compare results, and instance input endpoints provide them with a way to do it. Additional information about input endpoints can be found at Enable Communication for Role Instances in Azure.
Wordament gamers don’t need to worry about these synthetic transactions skewing their Wordament rankings: the results do not get rolled into the cumulative game results.
Testing the limit with load testing
Another critical task is to simulate thousands of users on the system, also known as load testing. Visual Studio features test tools that allow you to run load tests using Azure cloud services. Each medium role in a cloud service can generate 500 transactions a second. Recall that the game is most stressed in the 17 seconds after a game ends. That means 17 × 500 = 8,500 transactions by one role. The team uses 10 roles, meaning 85,000 transactions during the critical time, which is 20 times their norm.
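The load-test arithmetic above can be checked with a few lines. The constants come from the text. The implied_norm value is only what "20 times their norm" works out to, not a figure the team stated.

```python
TRANSACTIONS_PER_SECOND_PER_ROLE = 500  # one medium role
PEAK_WINDOW_SECONDS = 17                # busiest period after a game ends
LOAD_TEST_ROLES = 10
MULTIPLE_OF_NORM = 20

per_role = TRANSACTIONS_PER_SECOND_PER_ROLE * PEAK_WINDOW_SECONDS
total = per_role * LOAD_TEST_ROLES

# Derived, not stated in the article: the normal load implied by "20 times".
implied_norm = total / MULTIPLE_OF_NORM
```

Here per_role is 8,500 and total is 85,000, matching the text, which implies a normal load of roughly 4,250 transactions in the same 17-second window.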
To run these special tests, the team mounts a duplicate of one of their Game Rooms, then runs the load test. Note that each GR consists of either two or three medium-size worker roles, depending on the number of concurrent players. A question that arises is this: why should they pay for so much extra capacity? The answer lies in their experience with all sizes of worker roles. For their needs, a small instance is just too small. So as long as Wordament doesn’t come near its limit, they will use the minimum size (medium). And since having so much extra capacity is also affordable, why not err on the side of caution?
For any project that involves many people, minimizing friction is a key to success. To that end, the Wordament team has standardized not only the language and coding environment but also the tooling: they use a commercial Visual Studio add-in named ReSharper, along with StyleCop.
ReSharper and StyleCop work hand-in-hand to ensure that all code looks the same and follows the team’s coding conventions, for example, by not allowing check-ins if brackets are formatted "wrongly."
The team also heavily relies on Skype, including Skype rooms, for communications. They’ve found this keeps everyone connected, and meetings are efficient. Team members pick up on what is happening, and what needs attention, regardless of where and when they’re working.
Automate, control, and improve continuously
Microsoft vice president Scott Guthrie (who runs the Azure team) has spoken of three habits that help to make an Azure app successful: Automate everything. Use a source control system. Continuously integrate and deploy. The Wordament team uses all three.
Wordament is built and released every day. All code is stored using Visual Studio Online. PowerShell runs the entire deployment process.
Another interesting topic is game generation. It is a complex topic, and we’ll only scratch the surface here. Each game is unique. And each game has to be "good," which means it contains an interesting range of words, offers good scoring possibilities, and is a challenge to the best players. As a measure of how complex this task is, for every 10,000 games produced algorithmically, only one is acceptable. So once a month the team spins up 40 Azure worker roles to generate games for eight hours. The outcome is another month’s worth of games. This is done for all the languages that Wordament supports.
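The generate-then-filter shape of this job can be sketched as follows. The article does not describe the real generator or the real quality criteria, so the 16-letter candidate, the vowel-count rule in looks_acceptable, and all function names are toy assumptions. The real acceptance rate of one in 10,000 reflects a far stricter test.

```python
import random
import string
from typing import Callable, Iterable, Iterator, List


def generate_candidates(rng: random.Random, count: int, size: int = 16) -> Iterator[str]:
    """Yield random letter strings as stand-ins for candidate games."""
    for _ in range(count):
        yield "".join(rng.choice(string.ascii_uppercase) for _ in range(size))


def looks_acceptable(candidate: str) -> bool:
    """Toy quality check (assumption): require a moderate number of vowels."""
    vowels = sum(1 for ch in candidate if ch in "AEIOU")
    return 5 <= vowels <= 7


def filter_games(candidates: Iterable[str], accept: Callable[[str], bool]) -> List[str]:
    """Keep only the candidates that pass the quality check."""
    return [c for c in candidates if accept(c)]


rng = random.Random(2024)  # seeded so repeated runs produce the same candidates
candidates = list(generate_candidates(rng, 1000))
accepted = filter_games(candidates, looks_acceptable)
```

Because each candidate is generated and judged independently, work like this splits naturally across many worker roles, which fits the monthly batch described above.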
String 'em up
Coding Alert! Each Game Room web role uses a single instance of the StringBuilder class. Allocating large objects on the heap is a performance hit. To mitigate this hit, the GR code makes this allocation only once per Azure role instance. That is, when the role instance is initially brought online, it creates a new StringBuilder object set to the size needed (for example, one megabyte).
This object is used to construct the CSV data that is then sent to the aggregator, and later written to blob storage. After being used to write to blob storage, the StringBuilder object is cleared of its contents and reused (instead of being deleted and then recreated), thereby avoiding the heap allocation perf hit. As the GR web role receives results, it uses the StringBuilder.AppendFormat method to add the CSV results for the WUID, gamer ID, score, number of words found, etc. To send the results to the aggregator, the code dumps the contents using the StringBuilder.ToString method. Because multiple threads are handling the incoming requests to each GR web role, access to the StringBuilder instance must be coordinated. The pattern is to lock the StringBuilder instance, use it, and then release it. More information about this pattern can be found at Thread Synchronization.
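The lock, use, and release pattern can be sketched as below. The Wordament code is C# and uses StringBuilder. This sketch uses Python's io.StringIO as a stand-in, which does not preallocate a fixed capacity, so it shows only the reuse and locking. The class and method names are hypothetical.

```python
import io
import threading


class ReusableCsvBuffer:
    """One long-lived text buffer shared by many request threads."""

    def __init__(self) -> None:
        self._buffer = io.StringIO()   # created once, then reused
        self._lock = threading.Lock()  # coordinates access across threads

    def append_result(self, wuid: str, gamer_id: str, score: int, words_found: int) -> None:
        line = f"{wuid},{gamer_id},{score},{words_found}\n"
        with self._lock:               # lock, use, release
            self._buffer.write(line)

    def drain(self) -> str:
        """Return the accumulated CSV text and clear the buffer for reuse."""
        with self._lock:
            contents = self._buffer.getvalue()
            self._buffer.seek(0)
            self._buffer.truncate(0)
            return contents


shared = ReusableCsvBuffer()


def handle_requests(worker: int) -> None:
    for i in range(50):
        shared.append_result(f"wuid-{worker}", f"gamer-{i}", i * 10, i)


threads = [threading.Thread(target=handle_requests, args=(n,)) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

csv_text = shared.drain()   # 200 complete lines, none interleaved
leftover = shared.drain()   # empty: the same buffer is ready for the next game
```

Holding the lock only for the write or the drain keeps the critical section short, so request threads spend little time waiting on each other.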
Modern devices are also simple to use, but challenging when it comes to providing highly engaging, seamless, and robust experiences. Wordament’s global simultaneous play, short-lived rounds, timely personal and leaderboard results, and frenemy capabilities make it simple for the user to enjoy. To handle the challenges of cross-platform support, network latency, connectivity quality, responsiveness, real-time global reach, and rapid data exchange, the Wordament team relies on products such as C#, Azure cloud services and storage, Xamarin, Visual Studio, and SQL Azure (to name just a few). These tools allow the Wordament team to remain agile; as they noted, "The things we use are available to customers."
Additionally, by offloading the IT support of their infrastructure to Azure, the Wordament team can focus on what’s important to them: providing the best possible experience for gamers. As Jason and John said, "Azure has been good to us. We’ve been able to grow."
Whether you’re building games or line of business apps, moving infrastructure to the cloud, or considering what it takes to make an outstanding modern app experience, we hope you enjoyed this insight into the Wordament architecture.
Building a winning game on Azure from absolute nothing was a learning experience for Jason and John.
When asked if there were any lasting lessons, they recommended:
- Creating a domain name.
- Trademarking your app’s name and web domain name.
- Checking for infringement issues with a U.S. trademark search.
- Incorporating, for example by forming a limited liability company (LLC).