Capture web page snapshots with the OneNote API

Learn how to use the Microsoft OneNote API to snapshot entire web pages, or thumbnail large images and insert them into new pages in the user's OneNote notebooks.

Last modified: January 22, 2016

Applies to: OneNote service

In this article
Capture web page snapshots in an Android app
Capture web page snapshots in an iOS app
Capture web page snapshots in a Windows Phone app
Capture web page snapshots in a Windows Store app
Capture web page snapshots using REST

Note

See this content on our new documentation site for consumer and enterprise OneNote APIs.

When you need to capture a complex web page whose HTML uses features that OneNote has no counterpart for, such as CSS, or you need to capture how a web site looked on a particular day, you can use the OneNote API to capture that web page as a snapshot image. When you capture a web page or image this way, the API renders it in a web browser "in the cloud" and then takes a screenshot of the page. The resulting image is then inserted into the OneNote page. This lets your users capture how a web page looks at a particular moment, to archive it or save it for later review.

To insert an image from a web page, you include an <img…> tag in the captured HTML, but instead of using the src attribute, you use the data-render-src attribute, as shown in the following code.


<!DOCTYPE html>
<html>
  <head>
    <title>Today's cool Bing image</title>
  </head>
  <body>
    <!-- inserts a 200-pixel wide thumbnail of today's Bing homepage -->
    <p>Today's Bing image is amazing!</p>
    <img data-render-src="http://www.bing.com" width="200"/>
  </body>
</html>

The data-render-src attribute can take two forms:

  • <img data-render-src="http://www.example.com" ... /> where the attribute value is an internet URL. The URL can be either a web page or another image, but it does have to be publicly available without a password. This is great for archiving things you see on web sites that change frequently.

  • <img data-render-src="name:MultiPartBlockName" ... /> where the attribute value specifies the name of a data block in the POST request. The content type of that named block tells the API whether it's a block of HTML to render in a browser (Content-Type: text/html) or an image (Content-Type: image/jpeg or similar). This method of capturing HTML is useful when the web page is more complex than the OneNote page can faithfully render, or when the page requires login credentials. Note that the HTML in a non-Presentation block cannot itself use the data-render-src attribute.
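
The two forms differ only in the attribute value, so a small helper can produce either one. The following Java sketch is illustrative only: the class name RenderSrcTags, its method names, and the minimal attribute escaping are assumptions made for this example and are not part of the OneNote API or its samples.

```java
final class RenderSrcTags {
    /** Builds an <img> tag that asks the API to render a publicly reachable URL. */
    static String forUrl(String url, int width) {
        return "<img data-render-src=\"" + escapeAttr(url) + "\" width=\"" + width + "\"/>";
    }

    /** Builds an <img> tag that points at a named part of the same multi-part request. */
    static String forPart(String partName, int width) {
        return "<img data-render-src=\"name:" + escapeAttr(partName) + "\" width=\"" + width + "\"/>";
    }

    /** Minimal escaping so the value cannot break out of the quoted attribute. */
    static String escapeAttr(String value) {
        return value.replace("&", "&amp;").replace("\"", "&quot;").replace("<", "&lt;");
    }

    public static void main(String[] args) {
        System.out.println(forUrl("http://www.example.com", 200));
        System.out.println(forPart("MultiPartBlockName", 500));
    }
}
```

The returned string would be placed inside the body of the Presentation HTML.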

Tip

When you pass the HTML as a data block, be sure it has no active content that would require user credentials or a pre-loaded browser plug-in. The engine that the OneNote API uses to render the HTML page into an image has no ability to log in a user, and doesn't include plug-ins like Adobe Flash or Apple QuickTime. That also means that dynamically loaded content, such as data fetched by an AJAX script, won't appear if getting the data requires user login credentials or cookies.
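
As a rough illustration, an app could scan the HTML it is about to send for rendering and flag the most obvious problems before making the request. The Java sketch below is a heuristic only: the class name EmbeddedHtmlCheck and the specific substrings it looks for are assumptions for this example, and it cannot detect scripts that need credentials or cookies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class EmbeddedHtmlCheck {
    /** Returns human-readable warnings; an empty list means nothing obvious was found. */
    static List<String> warnings(String embeddedHtml) {
        String html = embeddedHtml.toLowerCase(Locale.ROOT);
        List<String> found = new ArrayList<>();
        // HTML in a non-Presentation block cannot itself use data-render-src.
        if (html.contains("data-render-src")) {
            found.add("data-render-src is not allowed inside a block that is sent for rendering");
        }
        // The rendering engine has no browser plug-ins such as Flash or QuickTime.
        if (html.contains("<object") || html.contains("<embed")) {
            found.add("plug-in content (<object> or <embed>) will not be rendered");
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(warnings("<html><body><embed src=\"movie.swf\"/></body></html>"));
    }
}
```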

When trying to decide between directly putting HTML onto the OneNote page, versus using the data-render-src rendering, consider the following:

  • Complex HTML is probably best sent to the rendering engine via data-render-src, rather than attempting to modify the HTML to fit into what the OneNote API can accept. This is also true when your HTML includes tags not yet supported (see https://msdn.microsoft.com/en-us/library/office/dn575442.aspx).

  • Accurate page rendering to preserve the layout and look of the page is probably best done with the rendering engine via data-render-src.

  • Directly-editable text is often best done by inserting the HTML directly onto the page. The rendered images are scanned by an optical character recognition (OCR) system, but the result is still an image rather than editable text on the page.

  • Snapshot-in-time for historical or archival purposes is usually best done with the data-render-src method.

  • Marking-up a web page design for revisions is one place the data-render-src truly shines. Using OneNote's inking capabilities, you can draw on the image to indicate changes or call out important areas. Having the web page as an image makes that a lot easier.

  • Very large images, or images in formats that OneNote doesn't directly accept can sometimes be thumbnailed and converted with the data-render-src method more easily than by doing it in your own code.

When it isn't obvious which approach fits, try it both ways as you develop your app and see which works best for your users.

Important

Before POST requests like the ones shown here can succeed, you need to Get a client ID for use with the OneNote API (or package ID for a Windows Store application), and your app has to Authenticate the user for the OneNote API. If you don't supply a valid OAuth token with your request, it will fail.
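
To make the token requirement concrete, the following Java sketch assembles the two headers a multi-part page-creation request carries and fails early when no token is available. The class name RequestHeadersSketch and the fail-fast check are assumptions for this example; how your app obtains the access token depends on the authentication library you use.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class RequestHeadersSketch {
    /** Returns the headers for a multi-part page-creation request; rejects a missing token. */
    static Map<String, String> build(String accessToken, String boundary) {
        if (accessToken == null || accessToken.trim().isEmpty()) {
            throw new IllegalArgumentException("No OAuth access token; the request would fail.");
        }
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Authorization", "Bearer " + accessToken);
        headers.put("Content-Type", "multipart/form-data; boundary=" + boundary);
        return headers;
    }

    public static void main(String[] args) {
        // "tokenString" stands in for a real access token.
        System.out.println(build("tokenString", "MyAppPartBoundary"));
    }
}
```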

Tip

As with most code samples in documentation, these code examples should not be considered production-ready code. Things like detailed user-input validation have been left out to make it easier to understand the code flow. Be sure to carefully review your code for potential code-quality and security issues before you publish your app.

The following sections contain code snippets showing how to use the data-render-src attribute. They use third-party libraries to build the multi-part requests. For more information about those libraries, see Get tools and libraries to use with the OneNote API.

Capture web page snapshots in an Android app

The following code builds a simple multi-part POST request with some HTML in a "Presentation" block. In that HTML is an <img> tag that shows how to specify an internet URL to the page rendering engine.

The createPageWithUrlScreenShot function is part of the SendPageCreateAsyncTask class in the Android sample on GitHub. For more information, see Get the OneNote API sample applications.


private ApiResponse createPageWithUrlScreenShot() {
  try {
    this.connectForMultipart(PAGES_ENDPOINT);
    String date = getDate();
    String simpleHtml = 
      "<html>" +
      "  <head>" +
      "    <title>A Page Created With a URL Snapshot (Android Sample)</title>" +
      "    <meta name=\"created\" content=\"" + date + "\" />" +
      "  </head>" +
      "  <body>" +
      "    <p>This is a page with an image of an HTML page " +
      "      rendered from a URL on it.</p>" +
      "    <img data-render-src=\"http://www.onenote.com\" />" +
      "  </body>" +
      "</html>";

    this.addFormPart("presentation", "application/xhtml+xml", simpleHtml);
    this.finishMultipart();
    ApiResponse response = this.getResponse();
    return response;
  } catch (Exception ex) {
    String errorMessage = ex.getMessage();
  }
  return null;
}

The next code sample creates a more complex multi-part POST request that includes HTML in both a "Presentation" block and a second block named "embedded1". The HTML in the embedded1 block is sent to the rendering engine, and the API inserts the resulting image into the new OneNote page. This code is modified from the SendPageCreateAsyncTask class in the Android sample on GitHub.


private ApiResponse createPageWithHTMLScreenshot() {
  try {
    this.connectForMultipart(PAGES_ENDPOINT);
    String embeddedPartName = "embedded1";
    String date = getDate();
    String pageHtml = 
      "<html>" +
      "  <head>" +
      "    <title>A Page Created With Snapshot of Webpage (Android Sample)</title>" +
      "    <meta name=\"created\" content=\"" + date + "\" />" +
      "  </head>" +
      "  <body>" +
      "    <h1>This is a page with an image of an HTML page.</h1>" +
      "    <img data-render-src=\"name:" + embeddedPartName + "\" />" +
      "  </body>" +
      "</html>";

    String embeddedWebPage =
      "<html>" +
      "  <head>" +
      "    <title>Embedded HTML</title>" +
      "  </head>" +
      "  <body>" +
      "    <h1>This is a snapshot of a web page</h1>" +
      "    <p>...containing a list:</p>" +
      "    <ul>" +
      "      <li>List Item One</li>" +
      "      <li>List Item Two</li>" +
      "    </ul>" +
      "  </body>" +
      "</html>";

    this.addFormPart("presentation", "application/xhtml+xml", pageHtml);
    this.addFormPart(embeddedPartName, "text/html", embeddedWebPage );
    this.finishMultipart();
    ApiResponse response = this.getResponse();
    return response;

  } catch (Exception ex) {
    String errorMessage = ex.getMessage();
  }
  return null;
}

Capture web page snapshots in an iOS app

The following code builds a simple multi-part POST request with some HTML in a "Presentation" block. In that HTML is an <img> tag that shows how to specify an internet URL to the page rendering engine.

The createPageWithUrl function is part of the ONSCPSCreateExamples class in the iOS sample on GitHub. For more information, see Get the OneNote API sample applications.


- (void)createPageWithUrl {
  NSString *date = [ONSCPSCreateExamples getDate];
  NSString *simpleHtml = [NSString stringWithFormat:
    @"<html>"
    "  <head>"
    "    <title>A page created with an image from a URL (iOS sample)</title>"
    "    <meta name=\"created\" content=\"%@\" />"
    "  </head>"
    "  <body>"
    "    <p>This is a page with an image of an HTML page rendered from a URL.</p>"
    "    <img data-render-src=\"http://www.onenote.com\"/>"
    "  </body>"
    "</html>", date];
    
    NSData *presentation = [simpleHtml dataUsingEncoding:NSUTF8StringEncoding];
    NSMutableURLRequest *request = 
      [[AFHTTPRequestSerializer serializer] 
      multipartFormRequestWithMethod:@"POST" 
      URLString:PagesEndPoint parameters:nil 
      constructingBodyWithBlock: ^(id <AFMultipartFormData>formData) {
        [formData appendPartWithHeaders:@{
            @"Content-Disposition" : @"form-data; name=\"Presentation\"",
            @"Content-Type" : @"text/html"}
          body:presentation];
        }];
    
    if (liveClient.session)
    {
      [request setValue:[
          @"Bearer " stringByAppendingString:liveClient.session.accessToken]     
        forHTTPHeaderField:@"Authorization"];
    }
    currentConnection = [[NSURLConnection alloc] 
      initWithRequest:request delegate:self startImmediately:YES];
}

The next code snippet creates a more complex multi-part POST request that includes HTML in both a "Presentation" block and a second block named "embedded1". The HTML in the embedded1 block is sent to the rendering engine, and the API inserts the resulting image into the new OneNote page. This code is modified from the ONSCPSCreateExamples class in the iOS sample on GitHub.


- (void)createPageWithEmbeddedWebPage {
  NSString *date = [ONSCPSCreateExamples getDate];
  NSString *simpleHtml = [NSString stringWithFormat:
    @"<html>"
    "  <head>"
    "    <title>A page created with an image of an HTML page (iOS sample)</title>"
    "    <meta name=\"created\" content=\"%@\" />"
    "  </head>"
    "  <body>"
    "    <h1>This is a page with an image of an HTML page.</h1>"
    "    <img data-render-src=\"name:embedded1\"/>"
    "  </body>"
    "</html>", date];

  NSString *embeddedWebPage =
    @"<html>"
    "  <head>"
    "    <title>Embedded HTML</title>"
    "  </head>"
    "  <body>"
    "    <h1>This is a snapshot of a web page</h1>"
    "    <p>...containing a list:</p>"
    "    <ul>"
    "      <li>List Item One</li>"
    "      <li>List Item Two</li>"
    "    </ul>"
    "  </body>"
    "</html>";
    
    NSData *presentation = [simpleHtml dataUsingEncoding:NSUTF8StringEncoding];
    NSData *embedded1 = [embeddedWebPage dataUsingEncoding:NSUTF8StringEncoding];
    
    NSMutableURLRequest *request = 
      [
        [AFHTTPRequestSerializer serializer] 
          multipartFormRequestWithMethod:@"POST"
          URLString:PagesEndPoint parameters:nil 
          constructingBodyWithBlock: 
          ^(id <AFMultipartFormData>formData) {
           [formData appendPartWithHeaders:@{
             @"Content-Disposition" : @"form-data; name=\"Presentation\"",
             @"Content-Type" : @"text/html"}
             body:presentation];
           [formData appendPartWithHeaders:@{
             @"Content-Disposition" : @"form-data; name=\"embedded1\"",
             @"Content-Type" : @"text/html"}
             body:embedded1];
          }
      ];
    if (liveClient.session)
    {
      [request setValue:[@"Bearer " 
        stringByAppendingString:liveClient.session.accessToken] 
        forHTTPHeaderField:@"Authorization"];
    }
    currentConnection = 
      [[NSURLConnection alloc] initWithRequest:request 
        delegate:self startImmediately:YES];
}

Capture web page snapshots in a Windows Phone app

The following code builds a simple (non-multi-part) POST request that sends the page HTML directly as the request body. In that HTML is an <img> tag that shows how to specify an internet URL to the page rendering engine.

This code is modified from the Windows Phone sample on GitHub. For more information, see Get the OneNote API sample applications.


private async void btn_CreateWithUrl_Click(object sender, RoutedEventArgs e)
{
  StartRequest();
  var client = new HttpClient();

  // Note: API only supports JSON return type.
  client.DefaultRequestHeaders.Accept.Add(
    new MediaTypeWithQualityHeaderValue("application/json"));

  // This allows you to see what happens when an unauthenticated call is made.
  if (m_AccessToken != null)
    {
      client.DefaultRequestHeaders.Authorization = 
        new AuthenticationHeaderValue("Bearer", m_AccessToken);
    }

  string date = GetDate();
  string simpleHtml = 
    "<html>" +
    "  <head>" +
    "    <title>A Page Created With a URL Snapshot (WinPhone Sample)</title>" +
    "    <meta name=\"created\" content=\"" + date + "\" />" +
    "  </head>" +
    "  <body>" +
    "    <p>This is a page with an image of an html page rendered from a URL.</p>" +
    "    <img data-render-src=\"http://www.onenote.com\" />" +
    "  </body>" +
    "</html>";

  var createMessage = new HttpRequestMessage(HttpMethod.Post, PAGESENDPOINT)
    {
      Content = new StringContent(simpleHtml, System.Text.Encoding.UTF8, "text/html")
    };

  HttpResponseMessage response = await client.SendAsync(createMessage);
  await EndRequest(response);
}

The next code sample creates a more complex multi-part POST request that includes HTML in both a "Presentation" block and a second block named "embedded1". The HTML in the embedded1 block is sent to the rendering engine, and the API inserts the resulting image into the new OneNote page. This code is modified from the Windows Phone sample on GitHub.


private async void btn_CreateWithHtml_Click(object sender, RoutedEventArgs e)
{
  StartRequest();
  var client = new HttpClient();
  // Note: API only supports JSON return type.
  client.DefaultRequestHeaders.Accept.Add(
    new MediaTypeWithQualityHeaderValue("application/json"));

  // This allows you to see what happens when an unauthenticated call is made.
  if (m_AccessToken != null)
    {
      client.DefaultRequestHeaders.Authorization = 
      new AuthenticationHeaderValue("Bearer", m_AccessToken);
    }
  // Declared first: C# does not allow a local to be used before its declaration.
  const string embeddedPartName = "embedded1";
  string date = GetDate();
  string simpleHtml =
    "<html>" +
    "  <head>" +
    "    <title>A Page Created With Snapshot of web page (WinPhone Sample)</title>" +
    "    <meta name=\"created\" content=\"" + date + "\" />" +
    "  </head>" +
    "  <body>" +
    "    <h1> This is a page with an image of an HTML page.</h1>" +
    "    <img data-render-src=\"name:" + embeddedPartName + "\" />" +
    "  </body>" +
    "</html>";

  const string embeddedWebPage =
    "<html>" +
    "  <head>" +
    "    <title>Embedded HTML</title>" +
    "  </head>" +
    "  <body>" +
    "    <h1>This is a snapshot of a web page</h1>" +
    "    <p>...containing a list: </p>" +
    "    <ul>" + 
    "      <li>List Item One</li>" +
    "      <li>List Item Two</li>" +
    "    </ul>" +
    "  </body>" +
    "</html>";
  var createMessage = new HttpRequestMessage(HttpMethod.Post, PAGESENDPOINT)
    {
      Content = new MultipartFormDataContent
        {
          {new StringContent(simpleHtml, 
            System.Text.Encoding.UTF8, "text/html"), "Presentation"},
          {new StringContent(embeddedWebPage, 
            System.Text.Encoding.UTF8, "text/html"), embeddedPartName}
        }
    };
  HttpResponseMessage response = 
    await client.SendAsync(createMessage);
  await EndRequest(response);
}

Capture web page snapshots in a Windows Store app


The following code builds a simple (non-multi-part) POST request that sends the page HTML directly as the request body. In that HTML is an <img> tag that shows how to specify an internet URL to the page rendering engine.

This snippet is modified from the Windows Store sample on GitHub. For more information, see Get the OneNote API sample applications.


async public Task<StandardResponse> CreatePageWithUrl(bool debug)
{
  var client = new HttpClient();
  // Note: API only supports JSON return type.
  client.DefaultRequestHeaders.Accept.Add(
    new MediaTypeWithQualityHeaderValue("application/json"));
  // This allows you to see what happens when an unauthenticated call is made.
  if (this.IsAuthenticated)
    {
      client.DefaultRequestHeaders.Authorization = 
        new AuthenticationHeaderValue("Bearer", 
        this._authClient.Session.AccessToken);
    }
  string date = GetDate();
  string simpleHtml = 
    "<html>" +
    "  <head>" +
    "    <title>A page created with an image from a URL on it</title>" +
    "    <meta name=\"created\" content=\"" + date + "\" />" +
    "  </head>" +
    "  <body>" +
    "    <p>This is a page with an image of an HTML page " +
    "       rendered from a URL.</p>" +
    "    <img data-render-src=\"http://www.onenote.com\"/>" +
    "  </body>" +
    "</html>";
  var createMessage = 
    new HttpRequestMessage(HttpMethod.Post, PagesEndPoint)
      {
        Content = new StringContent(simpleHtml, 
          System.Text.Encoding.UTF8, "text/html")
      };
  HttpResponseMessage response = await client.SendAsync(createMessage);
  return await TranslateResponse(response);
}

The next code sample creates a more complex multi-part POST request that includes HTML in both a "Presentation" block and a second block named "embedded1". The HTML in the embedded1 block is sent to the rendering engine, and the API inserts the resulting image into the new OneNote page. This code is modified from the Windows Store sample on GitHub.


async public Task<StandardResponse> CreatePageWithEmbeddedWebPage(bool debug)
{
  var client = new HttpClient();
  // Note: API only supports JSON return type.
  client.DefaultRequestHeaders.Accept.Add(
    new MediaTypeWithQualityHeaderValue("application/json"));
  // This allows you to see what happens when an unauthenticated call is made.
  if (this.IsAuthenticated)
    {
      client.DefaultRequestHeaders.Authorization = 
        new AuthenticationHeaderValue("Bearer", 
        this._authClient.Session.AccessToken);
    }
  // Declared first: C# does not allow a local to be used before its declaration.
  const string embeddedPartName = "embedded1";
  string date = GetDate();
  string simpleHtml =
    "<html>" +
    "  <head>" +
    "    <title>A page created with an image of an html page on it</title>" +
    "    <meta name=\"created\" content=\"" + date + "\" />" +
    "  </head>" +
    "  <body>" +
    "    <h1>This is a page with an image of an HTML page on it.</h1>" +
    "    <img data-render-src=\"name:" + embeddedPartName + "\"/>" +
    "  </body>" +
    "</html>";
  const string embeddedWebPage = 
    "<html>" +
    "  <head>" +
    "    <title>Embedded HTML</title>" +
    "  </head>" +
    "  <body>" +
    "    <h1>This is a snapshot of a web page</h1>" +
    "    <p>...containing a list: </p>" +
    "    <ul>" + 
    "      <li>List Item One</li>" +
    "      <li>List Item Two</li>" +
    "    </ul>" +
    "  </body>" +
    "</html>";

  var createMessage = new HttpRequestMessage(HttpMethod.Post, PagesEndPoint)
    {
      Content = new MultipartFormDataContent
        {
          {new StringContent(simpleHtml, System.Text.Encoding.UTF8, 
             "text/html"), "Presentation"},
          {new StringContent(embeddedWebPage, System.Text.Encoding.UTF8, 
             "text/html"), embeddedPartName}
        }
     };
  HttpResponseMessage response = await client.SendAsync(createMessage);
  return await TranslateResponse(response);
}

Capture web page snapshots using REST

The following code shows a REST example of how to include a thumbnail of a web page on a new OneNote page using the img tag with a data-render-src attribute.


Content-Type:multipart/form-data; boundary=MyAppPartBoundary
Authorization:bearer tokenString

--MyAppPartBoundary
Content-Disposition:form-data; name="Presentation"
Content-type:text/html

<!DOCTYPE html>
<html>
  <head>
    <title>Title of the captured OneNote page</title>
    <meta name="created" content="2013-06-11T12:45:00.000-08:00"/>
  </head>
  <body>
    <p>This Presentation block displays a thumbnail of the OneNote site.</p>
    <img data-render-src="http://www.onenote.com" width="200"/>
  </body>
</html>

--MyAppPartBoundary--

The following code shows how to include a block of HTML and have that block rendered as an image and inserted into the new OneNote page.


Content-Type:multipart/form-data; boundary=MyAppPartBoundary
Authorization:bearer tokenString

--MyAppPartBoundary
Content-Disposition:form-data; name="Presentation"
Content-type:text/html

<!DOCTYPE html>
<html>
  <head>
    <title>A simple page with an embedded image of a block of HTML</title>
  </head>
  <body>
    <p>This is a simple Presentation block.</p>
    <p>This next image specifies the image data is in this POST request.</p>
    <img data-render-src="name:MyAppHtmlId" alt="a cool image" width="500"/>
  </body>
</html>

--MyAppPartBoundary
Content-Disposition:form-data; name="MyAppHtmlId"
Content-type:text/html

<!DOCTYPE html>
<html>
  <head>
    <title>Embedded HTML</title>
  </head>
  <body>
    <h1>This is a snapshot of a web page</h1>
    <p>...containing a list: </p>
    <ul>
      <li>List Item One</li>
      <li>List Item Two</li>
    </ul>
  </body>
</html>

--MyAppPartBoundary--
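
For readers who want to see how a two-part body like the one above is put together, the following Java sketch builds it by hand with a StringBuilder. It is a minimal illustration rather than a replacement for a multi-part library: the class name MultipartSketch and its methods are assumptions for this example, and it only builds the body bytes without sending anything.

```java
import java.nio.charset.StandardCharsets;

final class MultipartSketch {
    static final String BOUNDARY = "MyAppPartBoundary";
    static final String CRLF = "\r\n";

    /** Appends one named part: boundary line, part headers, blank line, content. */
    static void addPart(StringBuilder body, String name, String contentType, String content) {
        body.append("--").append(BOUNDARY).append(CRLF)
            .append("Content-Disposition: form-data; name=\"").append(name).append("\"").append(CRLF)
            .append("Content-Type: ").append(contentType).append(CRLF)
            .append(CRLF)
            .append(content).append(CRLF);
    }

    /** Builds the whole body: the Presentation part, one HTML part to render, and the closing boundary. */
    static byte[] buildBody(String presentationHtml, String partName, String embeddedHtml) {
        StringBuilder body = new StringBuilder();
        addPart(body, "Presentation", "text/html", presentationHtml);
        addPart(body, partName, "text/html", embeddedHtml);
        body.append("--").append(BOUNDARY).append("--").append(CRLF);
        return body.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String page = "<html><body><img data-render-src=\"name:MyAppHtmlId\" width=\"500\"/></body></html>";
        String embedded = "<html><body><h1>This is a snapshot of a web page</h1></body></html>";
        System.out.print(new String(buildBody(page, "MyAppHtmlId", embedded), StandardCharsets.UTF_8));
    }
}
```

The part name passed to buildBody has to match the name used after "name:" in the data-render-src attribute of the Presentation HTML.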

