Sunday, July 26, 2009

Reading the Content of a Web Page



This short post shows how to read the content of a web page in ASP.NET, given its web address (URL).

In a desktop application, this can be done simply with a web browser control, e.g. System.Windows.Controls.WebBrowser, but in ASP.NET we need to use the HttpWebRequest and HttpWebResponse classes, as follows:

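// Required at the top of the file:
using System.IO;
using System.Net;
using System.Text;

// Downloads the page at the given URL and returns its content,
// or null if the request fails (e.g. the page is not found).
public static StringBuilder GetDocument(string url)
{
    try
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);

        // The using blocks close the response, stream and reader
        // even if an exception is thrown while reading, so no
        // Close() call is ever made on a null reference.
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (Stream stream = response.GetResponseStream())
        using (StreamReader sr = new StreamReader(stream))
        {
            return new StringBuilder(sr.ReadToEnd());
        }
    }
    catch
    {
        return null; // not found, or any other failure
    }
}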