Step 2: Add the Code for the Federated Search HTML to RSS Converter [Search Server 2008]

The following code passes a user query to the Live Search Web site and then converts the resulting HTML into an RSS feed.

The complete code for this sample is available in HTML to RSS Federated Search Connector.

Create the RSS Feed

  1. In the Default.aspx file, change the Inherits attribute of the @ Page directive so that it references the class that you will create in the Default.aspx.cs (code-behind) file.

    Note: For the following code to work correctly, the page that loads and displays the feed must be named Default.aspx.

    <%@ Page Language="C#" AutoEventWireup="true"  CodeFile="Default.aspx.cs" Inherits="SearchHTMLToRSS" %>
  2. In the Default.aspx.cs file, add the following namespace directives.

    using System;
    using System.Web;
    using System.Web.UI;
    using System.Web.UI.HtmlControls;
    using System.Text;
    using System.IO;
    using System.Net;
    using HtmlAgilityPack;
  3. Modify the default class declaration so that it uses the class name that is used in this solution.

    public partial class SearchHTMLToRSS : System.Web.UI.Page
  4. Replace the default Page_Load method with the following Render override, which writes the RSS document in place of the page markup.

        protected override void Render(HtmlTextWriter writer)
        {
            //Retrieve query term from query string; construct search URL
            string queryTerm = Request.QueryString["q"];
            string searchURL = string.Format("{0}", queryTerm);
            Response.ContentType = "text/xml";
            //Write the RSS document to the HtmlTextWriter object
            writer.Write(GetResultsXML(searchURL, queryTerm));
        }
  5. Add the code for the GetResultsXML method, which queries the search site and creates an RSS document from the resulting HTML. The code finds the div element whose id is results, walks the list items inside it, and writes the title, link, and description of each top-level result as an RSS item.

        private string GetResultsXML(string searchURL, string queryTerm)
        {
            //Construct and execute the HTTP request
            HttpWebRequest request = (HttpWebRequest)HttpWebRequest.Create(searchURL);
            HttpWebResponse response = (HttpWebResponse)request.GetResponse();
            //Begin writing the RSS document
            StringBuilder resultsXML = new StringBuilder();
            resultsXML.Append("<?xml version=\"1.0\" encoding=\"utf-8\"?>");
            resultsXML.Append("<rss version=\"2.0\">");
            resultsXML.AppendFormat("<channel><title><![CDATA[HTML to RSS Conversion: {0}]]></title><link/><description/><ttl>60</ttl>", queryTerm);
            try
            {
                HtmlWeb hw = new HtmlWeb();
                HtmlDocument doc = hw.Load(searchURL);
                //Find the <div> tag that contains the results
                HtmlNodeCollection nodeCollection = doc.DocumentNode.SelectNodes("//div[@id='results']");
                foreach (HtmlNode htmlNode in nodeCollection)
                {
                    foreach (HtmlNode subNode in htmlNode.ChildNodes)
                    {
                        //Find the list that contains the result items
                        if (subNode.Name == "ul")
                        {
                            foreach (HtmlNode lineItemNode in subNode.ChildNodes)
                            {
                                //Exclude line items that are children of others, because we are interested in the main set of results
                                if (((lineItemNode.Attributes.Count > 0) && (lineItemNode.Attributes[0].Value != "child")) || (lineItemNode.Attributes.Count == 0))
                                {
                                    StringWriter descWriter = new StringWriter();
                                    StringWriter titleWriter = new StringWriter();
                                    StringWriter linkWriter = new StringWriter();
                                    //After retrieving the values sought from the markup, HTML-encode
                                    //the strings to avoid validation errors
                                    string description = lineItemNode.ChildNodes[1].InnerText;
                                    Server.HtmlEncode(description, descWriter);
                                    string encDescription = descWriter.ToString();
                                    string title = lineItemNode.FirstChild.FirstChild.InnerText;
                                    Server.HtmlEncode(title, titleWriter);
                                    string encTitle = titleWriter.ToString();
                                    string link = lineItemNode.FirstChild.FirstChild.Attributes[0].Value;
                                    Server.HtmlEncode(link, linkWriter);
                                    string encLink = linkWriter.ToString();
                                    if (lineItemNode.FirstChild.FirstChild.Attributes[0].Name == "href")
                                    {
                                        //Write each RSS item
                                        resultsXML.AppendFormat("<item><title>{0}</title><link><![CDATA[{1}]]></link><description>{2}</description></item>", encTitle, encLink, encDescription);
                                    }
                                }
                            }
                        }
                    }
                }
            }
            catch (Exception)
            {
                //If the results cannot be loaded or parsed (for example, SelectNodes returns null
                //because no results <div> is found), stop adding items and return what has been written
            }
            //Complete RSS document
            resultsXML.Append("</channel></rss>");
            return resultsXML.ToString();
        }
  6. Deploy this solution to your Web site. To deploy this solution to your Search Server site, save the contents of this solution in your _layouts directory. For more information, see How to: Create a Web Application in a SharePoint Web Site.

  7. Load the Default.aspx file in your Web browser or RSS reader to verify that it is creating an RSS feed. Add test query strings (?q=search terms) to the URL to verify that the feed is returning results.
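
As a complement to steps 5 and 7, the following is a minimal sketch, not part of the original sample, of another way to assemble the feed and to check it programmatically. It assumes that letting System.Xml.XmlWriter escape the text is an acceptable substitute for the Server.HtmlEncode and CDATA handling used in GetResultsXML. The RssSketch class, its BuildFeed and CountItems methods, and the string[] item layout (title, link, description) are hypothetical names introduced only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper class; not part of the original sample.
public static class RssSketch
{
    // Builds an RSS 2.0 document. Each item is assumed to be a
    // three-element array: { title, link, description }.
    public static string BuildFeed(string queryTerm, IEnumerable<string[]> items)
    {
        // Omit the XML declaration so that the string does not advertise
        // the UTF-16 encoding that a StringBuilder target would imply.
        XmlWriterSettings settings = new XmlWriterSettings { OmitXmlDeclaration = true };
        StringBuilder sb = new StringBuilder();
        using (XmlWriter writer = XmlWriter.Create(sb, settings))
        {
            writer.WriteStartElement("rss");
            writer.WriteAttributeString("version", "2.0");
            writer.WriteStartElement("channel");
            // XmlWriter escapes &, <, and > in text content automatically.
            writer.WriteElementString("title", "HTML to RSS Conversion: " + queryTerm);
            writer.WriteElementString("link", string.Empty);
            writer.WriteElementString("description", string.Empty);
            writer.WriteElementString("ttl", "60");
            foreach (string[] item in items)
            {
                writer.WriteStartElement("item");
                writer.WriteElementString("title", item[0]);
                writer.WriteElementString("link", item[1]);
                writer.WriteElementString("description", item[2]);
                writer.WriteEndElement(); // </item>
            }
            writer.WriteEndElement(); // </channel>
            writer.WriteEndElement(); // </rss>
        }
        return sb.ToString();
    }

    // Parses the feed and returns the number of <item> elements.
    // XDocument.Parse throws an XmlException if the XML is not well-formed.
    public static int CountItems(string feedXml)
    {
        return XDocument.Parse(feedXml).Descendants("item").Count();
    }
}
```

Passing the output of BuildFeed to CountItems gives a quick well-formedness check; the same CountItems call could be pointed at the text returned by Default.aspx for a test query.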
