VoIP apps for Windows Phone 8

[ This article is for Windows Phone 8 developers. If you’re developing for Windows 10, see the latest documentation. ]

With Windows Phone 8, you can create apps that implement voice over IP (VoIP) and let users make video or audio calls over the phone’s network connection. When the user installs your VoIP app, it shows up in the App list like any other app. However, when an incoming call arrives for a VoIP app, the built-in phone experience is shown, and the VoIP app appears integrated into the phone.

Note

To use VoIP, you must set the ID_CAP_VOIP capability in the app manifest. If you do not set this capability, your app might not work correctly. For more information, see App capabilities and hardware requirements for Windows Phone 8. As with all Windows Phone apps, be sure to review the applicable certification requirements before you start a new project. For requirements related to VoIP apps, see Additional requirements for specific app types for Windows Phone.
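As a minimal sketch of the capability declaration described in the note, the fragment below shows how ID_CAP_VOIP might appear in the Capabilities section of the app manifest (WMAppManifest.xml). It is a fragment only: the rest of the manifest and any other capabilities your app needs are omitted, and you should confirm the full list against the capabilities topic referenced above.

```xml
<!-- Fragment of WMAppManifest.xml; surrounding elements omitted. -->
<Capabilities>
  <!-- Required for VoIP apps, as stated in the note above. -->
  <Capability Name="ID_CAP_VOIP" />
</Capabilities>
```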

VoIP app architecture

A Windows Phone VoIP app consists of several components that run in one of two processes, and one component that runs outside of any process. The first process is the foreground process that displays the app UI. The second process is a background process that does most of the work of creating, connecting, and managing incoming and outgoing calls. The following diagram shows the components of a VoIP app.

The following list provides a brief description of each of the VoIP app components.

  • Foreground app

    The component that provides the UI for your app. It appears in the phone App list like any other Windows Phone app, and the user can pin it to the Start screen as a Tile. Like any other app, it runs in the foreground process. In addition to providing the UI, the foreground app sets up the push notification channel on which incoming calls arrive. It also launches the background process and uses the out-of-process server to pass commands to the components in the background, such as a request to end a call.

  • Out-of-process server

    The server that the foreground app and the background components use to communicate between processes.

  • Background agents

    The four background agents that a VoIP app uses. These agents are written in managed code and are launched to indicate that a new phase of a VoIP call has begun. In general, these agents contain very little code and simply pass the state of the call to the components described later in this list, which do most of the work.

    • VoipHttpIncomingCallTask. Launched when a new incoming call arrives on the push notification channel. It lets the Windows Phone Runtime assembly know that it should create a new call.

    • VoipForegroundLifetimeAgent. Launched by the foreground app and runs as long as the app is in the foreground. It bootstraps the background process and keeps its process alive so that outgoing calls can be created quickly.

    • VoipCallInProgressAgent. Launched when a call becomes active. It signals the app that it has been allocated more CPU cycles so it can encode and decode audio and video.

    • VoipKeepAliveTask. Runs periodically, every 6 hours by default, whether or not the foreground app is running. This gives the app an opportunity to ping the VoIP service to indicate that the app is still installed on the device. When the user uninstalls the app, this agent no longer runs, and the service knows not to send incoming calls on the app’s push notification channel.

  • Windows Phone Runtime assembly

    An assembly that does most of the work of connecting and managing VoIP calls using the VoipCallCoordinator and VoipPhoneCall objects. Because this is a Windows Phone Runtime assembly rather than a managed assembly, it can call the native APIs used for audio and video processing.

  • Native core assembly

    Many VoIP app developers support multiple platforms, and often have a core library written in C or C++. This library can be used in a Windows Phone VoIP app to support code reuse and speed app development.

  • VoIP cloud service

    The VoIP service that runs remotely.
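The division of labor in the list above, where thin agents only forward a call phase to a shared back end that does the real work, can be modeled in a few lines. The following Python sketch is a language-neutral illustration only: it does not use the Windows Phone APIs, and `CallBackend`, `on_agent_launched`, `launch_agent`, and the phase descriptions are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass
class CallBackend:
    """Hypothetical stand-in for the Windows Phone Runtime assembly."""
    log: list = field(default_factory=list)

    def on_agent_launched(self, agent: str, detail: str = "") -> None:
        # The real work would happen here; this sketch only records the phase.
        self.log.append((agent, detail))


# Each agent from the list above, reduced to the one thing it signals.
AGENT_PHASES = {
    "VoipHttpIncomingCallTask": "incoming call arrived on the push channel",
    "VoipForegroundLifetimeAgent": "foreground app is running",
    "VoipCallInProgressAgent": "call is active; extra CPU allocated",
    "VoipKeepAliveTask": "periodic keep-alive ping is due",
}


def launch_agent(backend: CallBackend, agent: str) -> None:
    """A thin agent: no logic of its own, it only forwards the phase."""
    if agent not in AGENT_PHASES:
        raise ValueError(f"unknown agent: {agent}")
    backend.on_agent_launched(agent, AGENT_PHASES[agent])


backend = CallBackend()
launch_agent(backend, "VoipForegroundLifetimeAgent")
launch_agent(backend, "VoipHttpIncomingCallTask")
launch_agent(backend, "VoipCallInProgressAgent")
for agent, phase in backend.log:
    print(f"{agent}: {phase}")
```

Keeping the agents this thin means the call logic lives in one place, which is the reason the list above assigns most of the work to the Windows Phone Runtime assembly.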

Walkthrough of an incoming VoIP call

  1. The first time the user launches your VoIP app, you create a push notification channel. For more information about setting up a push notification channel, see Push notifications for Windows Phone 8.

  2. To initiate a new incoming call, the VoIP cloud service sends a push notification to the URI for your app’s push notification channel.

  3. The operating system launches your app’s VoipHttpIncomingCallTask agent.

    While the incoming call agent is running:

    1. The incoming call agent loads the app’s Windows Phone Runtime assembly, if it’s not already loaded.

    2. The incoming call agent gets the push notification payload from its MessageBody property, and then passes it in to the Windows Phone Runtime assembly, calling a custom method named something like OnIncomingCallReceived.

    3. In OnIncomingCallReceived, your Windows Phone Runtime assembly calls VoipCallCoordinator.RequestNewIncomingCall, passing in information from the push notification payload such as the calling party’s name and the URI of a contact picture for the caller. This information allows the phone to display the built-in phone UI to the user.

    4. The VoipCallCoordinator.RequestNewIncomingCall method returns a VoipPhoneCall object that represents the requested call. Your Windows Phone Runtime assembly should store this object so that it can be used from other methods, and then register handlers for the call object’s AnswerRequested and RejectRequested events.

    5. In the AnswerRequested event handler, you can contact your cloud service to let it know that a new call has begun, for example, to indicate that billing for the call should begin. Then, you should call NotifyCallActive to let the system know that the call is now active.

    6. If the RejectRequested event handler is called, you know that the call has been rejected and you have 10 seconds to clean up any resources before the operating system will terminate your process, assuming one of the other VoIP background agents isn’t still running.

  4. After you have called NotifyCallActive, the operating system will launch your VoipCallInProgressAgent. When this agent runs, you know that your background process has been allocated more CPU for audio encoding and decoding. While the VoipCallInProgressAgent is running, do the following:

    1. Load your Windows Phone Runtime assembly, if it’s not already loaded.

    2. In the Windows Phone Runtime assembly, if you haven’t already done so, register handlers for the MuteRequested, UnmuteRequested, and AudioEndpointChanged events.

    3. Register event handlers for the EndRequested, HoldRequested, and ResumeRequested events of the VoipPhoneCall object you stored while the incoming call agent was running. If you want to, you can register for all of the VoipPhoneCall events when the object is first created.

    4. Hook up your incoming video stream to the MediaElement in your foreground app. For a detailed walkthrough of setting up your video stream, see How to implement video streaming for VoIP calls for Windows Phone 8.

    5. As your Windows Phone Runtime assembly receives events for changes in the call state, such as HoldRequested or ResumeRequested, take the appropriate action, such as suspending or resuming the streaming of video.

    6. When the user presses a button in your UI to end the call, your foreground app calls a method in your Windows Phone Runtime assembly to end the call. That method should call NotifyCallEnded on the stored call object and then release the object, for example by setting the reference to null. Do the same in your EndRequested handler, which is raised when the operating system ends the call, such as when the user answers a VoIP call from a different VoIP app, or answers a second cellular call while already on a cellular call and a VoIP call.
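The incoming-call steps above amount to a small state machine: the call starts out ringing, becomes active or rejected, may be held and resumed, and finally ends. The Python sketch below models only those transitions. It is not the Windows Phone API: `IncomingCallModel`, `CallState`, and the method names are hypothetical, loosely named after the events in the walkthrough, and the strings in `events` merely mark where calls such as NotifyCallActive or NotifyCallEnded would go.

```python
from enum import Enum


class CallState(Enum):
    RINGING = "ringing"
    ACTIVE = "active"
    HELD = "held"
    ENDED = "ended"
    REJECTED = "rejected"


class IncomingCallModel:
    """Hypothetical stand-in for the call object your assembly stores."""

    def __init__(self, caller_name: str):
        self.caller_name = caller_name
        self.state = CallState.RINGING  # state right after the call is requested
        self.events = []

    def _move(self, allowed_from, new_state, note):
        # Refuse transitions that the walkthrough does not describe.
        if self.state not in allowed_from:
            raise RuntimeError(f"cannot {note} while {self.state.value}")
        self.state = new_state
        self.events.append(note)

    def answer_requested(self):
        # Step 3.5: inform the service, then mark the call active.
        self._move({CallState.RINGING}, CallState.ACTIVE, "notify call active")

    def reject_requested(self):
        # Step 3.6: clean up promptly; the process may be terminated soon after.
        self._move({CallState.RINGING}, CallState.REJECTED, "clean up after reject")

    def hold_requested(self):
        self._move({CallState.ACTIVE}, CallState.HELD, "suspend media")

    def resume_requested(self):
        self._move({CallState.HELD}, CallState.ACTIVE, "resume media")

    def end(self):
        # Same path whether the user presses End or the OS requests the end.
        self._move({CallState.ACTIVE, CallState.HELD}, CallState.ENDED,
                   "notify call ended")


call = IncomingCallModel("Alice")
call.answer_requested()
call.hold_requested()
call.resume_requested()
call.end()
print(call.state.value, call.events)
```

Routing both the user-initiated and the system-initiated end through a single `end` method mirrors the advice in the last step: the cleanup is the same regardless of who ended the call.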

Walkthrough of an outgoing VoIP call

  1. When your foreground app is launched:

    1. Call Launch to launch your background process. This causes the operating system to spawn the background process for your app and call the OnLaunched method of the VoipForegroundLifetimeAgent agent. The foreground lifetime agent will be kept alive as long as your app runs in the foreground.

    2. From the VoipForegroundLifetimeAgent, load your Windows Phone Runtime assembly if it is not already loaded.

    3. Use custom events to let the foreground app know that the background process is ready, and that all assemblies have been loaded. Then, create a reference to your Windows Phone Runtime assembly in the foreground app.

  2. After the user selects a contact to call and then presses the Call button in your app:

    1. Call a custom method on your Windows Phone Runtime assembly, named something like MakeOutgoingCall.

    2. In MakeOutgoingCall, call the RequestNewOutgoingCall method, passing the name of the person being called so that it can be displayed in the minimized phone UI, and indicating whether the call will use only audio, or audio and video.

  3. The RequestNewOutgoingCall method returns a VoipPhoneCall object that represents the requested call. Your Windows Phone Runtime assembly should store this object so that it can be used from other methods, and then register handlers for the call object’s AnswerRequested and RejectRequested events.

  4. You should call NotifyCallActive to let the system know that the call is now active.

  5. The operating system will launch VoipCallInProgressAgent. When this agent runs, you know that your background process has been allocated more CPU for audio encoding and decoding. While the VoipCallInProgressAgent is running:

    1. Load your Windows Phone Runtime assembly, if it’s not already loaded.

    2. In the Windows Phone Runtime assembly, if you haven’t already done so, register handlers for the MuteRequested, UnmuteRequested, and AudioEndpointChanged events.

    3. Register event handlers for the EndRequested, HoldRequested, and ResumeRequested events of the VoipPhoneCall object that you stored when RequestNewOutgoingCall returned. If you want to, you can register for all of the VoipPhoneCall events when the object is first created.

    4. Hook up your incoming video stream to the MediaElement in your foreground app. For a detailed walkthrough of setting up your video stream, see How to implement video streaming for VoIP calls for Windows Phone 8.

    5. When the user presses a button in your UI to end the call, your foreground app calls a method in your Windows Phone Runtime assembly to end the call. That method should call NotifyCallEnded on the stored call object and then release the object, for example by setting the reference to null. Do the same in your EndRequested handler, which is raised when the operating system ends the call, such as when the user answers a VoIP call from a different VoIP app, or answers a second cellular call while already on a cellular call and a VoIP call.
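The outgoing-call steps above depend on an ordering constraint: the foreground app must not place a call until the background process has signaled that it is ready. The Python sketch below models that handshake and the store-then-release handling of the call object. It is an illustration under stated assumptions, not the Windows Phone API: `BackgroundProcessModel`, `on_ready`, `make_outgoing_call`, and `end_call` are hypothetical names, and the dictionary merely stands in for the stored call object.

```python
class BackgroundProcessModel:
    """Hypothetical model of the background process and its ready handshake."""

    def __init__(self):
        self.ready = False
        self.current_call = None
        self._ready_listeners = []

    def on_ready(self, listener):
        # The foreground app subscribes first (the "custom event" in step 1.3).
        self._ready_listeners.append(listener)

    def launch(self):
        # Models step 1: the lifetime agent starts and assemblies are loaded.
        self.ready = True
        for listener in self._ready_listeners:
            listener()

    def make_outgoing_call(self, callee_name, with_video=False):
        # Models steps 2 to 4: request the call, store it, mark it active.
        if not self.ready:
            raise RuntimeError("background process is not ready yet")
        if self.current_call is not None:
            raise RuntimeError("a call is already in progress")
        self.current_call = {"callee": callee_name, "video": with_video,
                             "active": True}
        return self.current_call

    def end_call(self):
        # Models step 5.5: report the call as ended, then drop the stored object.
        if self.current_call is None:
            return False
        self.current_call["active"] = False
        self.current_call = None
        return True


notifications = []
background = BackgroundProcessModel()
background.on_ready(lambda: notifications.append("ready"))
background.launch()
call = background.make_outgoing_call("Alice", with_video=True)
print(notifications, call["callee"], call["video"])
background.end_call()
```

Making `end_call` safe to invoke when no call is stored is a deliberate choice in this sketch, since the same cleanup may be reached both from the UI and from an EndRequested handler.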

VoIP app sample

To view a VoIP sample app that demonstrates the use of the VoIP APIs and allows you to simulate incoming and outgoing VoIP calls using loopback audio and video, see ChatterBox VoIP Sample App.