The touch keyboard

[This article is for Windows 8.x and Windows Phone 8.x developers writing Windows Runtime apps. If you're developing for Windows 10, see the latest documentation.]

February 2, 2012

This paper provides information about the invocation and dismissal behaviors of the touch keyboard for Windows operating systems. It provides guidelines for developers to understand how the touch keyboard shows and hides itself.

This information applies to the following operating systems:

  • Windows 8
  • Windows Server 2012

Disclaimer: This document is provided "as-is". Information and views expressed in this document, including URL and other Internet website references, may change without notice. You bear the risk of using it.

This document does not provide you with any legal rights to any intellectual property in any Microsoft product. You may copy and use this document for your internal, reference purposes.

©2012 Microsoft. All rights reserved.

See this feature in action as part of our App features, start to finish series: User interaction: Touch input... and beyond.

Overview

The Windows 8 touch keyboard is a system component that enables users of touch devices to input text. The touch keyboard is present when users expect to input text, and it is out of the way when they don't. Our invocation and dismissal model is designed to make sure that this expectation applies consistently across the system, regardless of what app the user is using at any particular time. The touch keyboard reacts to UI accessibility properties as a richly defined means to determine whether a focused UI element is meant to receive text input. In addition to encouraging good accessibility practices, this approach allows the touch keyboard to define a specific set of rules for invocation and dismissal that are rooted in the UI elements that users will interact with.

The invocation and dismissal behaviors of the touch keyboard are driven in the immersive environment by three inputs:

  • Accessibility properties from UI Automation (UIA)
  • User tap
  • Focus changes

UI Automation is the mechanism through which developers communicate whether a particular UI element can receive text input. You must ensure that the appropriate accessibility properties are set in your apps so that the touch keyboard knows to appear when focus lands on a specific UI element. For Windows-provided controls, this happens automatically because the proper accessibility properties are set by default. For custom controls and experiences, you must do additional work to set the accessibility properties correctly, because the touch keyboard reacts to these properties.

The keyboard appears when the user taps the focused input field. If an app sets focus programmatically, this action does not invoke the touch keyboard, because the user should never be surprised by the keyboard coming up. The keyboard does, however, hide itself automatically in response to programmatic focus shifts: if a user has completed a text entry flow and the app has moved focus elsewhere, the keyboard disappears.

Invocation and dismissal logic

The touch keyboard makes use of UIA as a pre-existing and well-defined means to determine whether a focused UI element is intended for text entry and should cause the keyboard to appear if it was previously hidden. UIA is also the means to determine whether a focused UI element is not meant to be interacted with by the touch keyboard and thus to dismiss the keyboard if it was showing. UIA is exposed directly through .NET for XAML and can be accessed via Accessible Rich Internet Applications (ARIA) for WWAHost.exe. Common controls provided to developers by Windows have the appropriate UIA properties already set, but developers with custom controls need to set accessibility properties directly. The touch keyboard can be in one of two states at any particular time: Shown or Hidden. Also, on any given focus change, Windows can change the visibility state of the keyboard or leave it unchanged.

The touch keyboard determines whether a focused UI element is intended for text entry by checking to see whether it has a UIA TextPattern (or a TextChildPattern) control pattern and is editable. If a focused element fails to satisfy either of those conditions, the touch keyboard does not show itself and hides itself if it was showing.
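The check described above can be modeled in a few lines. The following Python sketch is illustrative only: FocusedElement, TEXT_PATTERNS, and is_text_entry_target are hypothetical names invented for this example, not part of UI Automation or any Windows API. It assumes the element's supported control patterns and its editable state have already been read from UIA.

```python
from dataclasses import dataclass

# Hypothetical names for this sketch; real apps expose these through UI Automation.
TEXT_PATTERNS = frozenset({"TextPattern", "TextChildPattern"})


@dataclass(frozen=True)
class FocusedElement:
    """Simplified stand-in for the element that currently has focus."""
    patterns: frozenset  # names of the UIA control patterns the element exposes
    editable: bool       # False for read-only text


def is_text_entry_target(element: FocusedElement) -> bool:
    """Return True only if the element has a text pattern and is editable."""
    has_text_pattern = not TEXT_PATTERNS.isdisjoint(element.patterns)
    return has_text_pattern and element.editable


# A custom edit control that exposes TextPattern and accepts input qualifies;
# a read-only label exposing the same pattern does not.
edit_box = FocusedElement(frozenset({"TextPattern"}), editable=True)
label = FocusedElement(frozenset({"TextPattern"}), editable=False)
```

Both conditions must hold, so a control that advertises a text pattern but is read-only is treated the same as a control with no text pattern at all.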

There is also a set of controls that can receive focus during a text entry flow but that aren't editable (with the exception of a combo box, which is a combination of an edit field and a list, with focus moving between the two). Rather than needlessly churn the UI and potentially disorient the user in the middle of a flow, the touch keyboard remains in view on these controls because the user is likely to go back and forth between them and text entry with the touch keyboard. Therefore, if the keyboard is already showing and focus lands on a control of one of the types in the following list, the keyboard will not hide itself. However, if the keyboard was not already showing, it will not show itself unless the tap lands on an editable region.

As an example, imagine a user is typing an email message in an app that uses an app bar to host its text-editing controls, such as bold or italic. The user wants to make the next word bold and so taps the "bold" button. Because menu bar is on the list of controls for which the touch keyboard persists, the keyboard doesn't hide itself even though the user tapped outside of an input field. This behavior enables the user to go right back to typing.

These are the controls in question:

  • Check box
  • Combo box
  • Radio button
  • Scroll bar
  • Tree
  • Tree item
  • Menu
  • Menu bar
  • Menu item
  • Toolbar
  • List
  • List item
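To make the rule for these controls concrete, here is a minimal Python sketch of the Shown/Hidden transition on a focus change. It is a simplified model, not the actual Windows implementation: KeyboardState, PERSIST_CONTROL_TYPES, and next_state are hypothetical names, and the control types are represented as plain lowercase strings taken from the list above.

```python
from enum import Enum


class KeyboardState(Enum):
    HIDDEN = "hidden"
    SHOWN = "shown"


# Control types on which an already-visible keyboard stays up (from the list above).
PERSIST_CONTROL_TYPES = frozenset({
    "check box", "combo box", "radio button", "scroll bar",
    "tree", "tree item", "menu", "menu bar", "menu item",
    "toolbar", "list", "list item",
})


def next_state(current: KeyboardState, control_type: str,
               tapped_editable_text: bool) -> KeyboardState:
    """Model the keyboard state after focus lands on a new element.

    tapped_editable_text is True when the user tapped an editable text region.
    """
    if tapped_editable_text:
        return KeyboardState.SHOWN   # user asked for text input
    if control_type in PERSIST_CONTROL_TYPES:
        return current               # leave visibility unchanged, never invoke
    return KeyboardState.HIDDEN      # any other non-text element dismisses
```

In the email example, next_state(KeyboardState.SHOWN, "menu bar", False) leaves the keyboard shown, while the same call with KeyboardState.HIDDEN keeps it hidden: the listed controls preserve the current state and never invoke the keyboard on their own.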

Setting the right accessibility properties

Reminder: This section is relevant only if you are using a custom control. The common controls provided by Windows will not need any additional work to function properly with the touch keyboard.

If you're an app developer who is using a custom control, ensure that the control has the proper accessibility information to get touch keyboard behavior. After you've done this, the keyboard will show and hide itself in line with the expected user model.

In HTML5 this is simply a matter of setting the right ARIA property on the control: role="textbox". Of course, the control must also be editable, so setting contentEditable="true" is advisable. As an example, the following creates a content-editable div that will invoke the touch keyboard when tapped by the user.

<div contentEditable="true" role="textbox"></div>

If you use C# or C++, use an AutomationPeer object, and specifically a TextAutomationPeer. A Windows 8 sample will demonstrate how to do this in C#.

Remember that the control must also be editable and able to receive text to get the keyboard to invoke, in addition to having the appropriate accessibility settings. Indicating that something can receive text when it cannot will mislead accessibility tools and the users who rely on them.

User-driven invocation

The invocation model of the touch keyboard is designed to put the user in control of the keyboard. Users indicate to the system that they want to input text by tapping on an input control instead of having an application make that decision on their behalf. This reduces to zero the scenarios where the keyboard is invoked unexpectedly, which can be a painful source of UI churn because the keyboard can consume up to 50% of the screen and mar the application's user experience. To enable user-driven invocation, we track the coordinates of the last touch event and compare them to the location of the bounding rectangle of the element that currently has focus. If the point is contained within the bounding rectangle, the touch keyboard is invoked.
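The hit test described above amounts to a point-in-rectangle check. The Python sketch below illustrates the idea under stated assumptions: Rect and touch_hits_focused_element are hypothetical names, coordinates are assumed to share one coordinate space, and treating the right and bottom edges as outside the rectangle is a choice made for this example rather than documented Windows behavior.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Rect:
    """Bounding rectangle of the focused element (hypothetical helper type)."""
    left: float
    top: float
    width: float
    height: float

    def contains(self, x: float, y: float) -> bool:
        # Assumption: left/top edges are inside, right/bottom edges are outside.
        return (self.left <= x < self.left + self.width
                and self.top <= y < self.top + self.height)


def touch_hits_focused_element(last_touch: Optional[Tuple[float, float]],
                               focused_bounds: Rect) -> bool:
    """Return True if the last touch point lies inside the focused element."""
    if last_touch is None:
        # No touch recorded, for example when focus was set programmatically.
        return False
    x, y = last_touch
    return focused_bounds.contains(x, y)


# Example: a search box occupying a 200 x 30 region.
search_box = Rect(left=100, top=50, width=200, height=30)
```

A tap at (150, 60) falls inside search_box, so the keyboard would be invoked if the element also accepts text. A tap elsewhere, or focus that arrives with no touch at all, does not pass the check.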

This means that applications cannot programmatically invoke the touch keyboard via manipulation of focus. Big culprits here in the past have been webpages: many of them set focus by default into an input field but have many other experiences available on their page for the user to enjoy. A good example of this is msn.com. The website has a lot of content for consumption, but happens to have a Bing search bar at the top of its page that takes focus by default. If the keyboard were automatically invoked, all of the articles located below that search bar would be occluded by default, thus ruining the website's experience on tablets. There are certain scenarios where it doesn't feel great to have to tap to get the keyboard, such as when a user has started a new email message or has opened the Search pane. However, we feel that requiring the user to tap the input field is an acceptable compromise.