Include contents of an html page in an aspx page

If you are creating a new ASP.NET application, but have a huge collection of existing content in html files, one option is to move all the content into a database and generate pages dynamically. However, migration to a database can be a time-consuming task depending on the volume of content. So wouldn't it be easier to somehow import the relevant parts of the existing html pages into your aspx page?

The System.IO and System.Text.RegularExpressions namespaces make this an easy task. Place an <asp:Literal> control (ID="htmlbody") on your page, and then use the following code to strip out everything up to and including the opening <body> tag (regardless of whether the tag contains additional attributes), and everything from the closing </body> tag onwards:

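 // Requires: using System.IO; and using System.Text.RegularExpressions;
 string html;
 // The using block closes the reader even if ReadToEnd throws.
 using (StreamReader sr = File.OpenText("<path_to_file.htm>"))
 {
     html = sr.ReadToEnd();
 }

 // Remove everything up to and including the opening <body> tag.
 Regex start = new Regex(@"[\s\S]*<body[^<]*>", RegexOptions.IgnoreCase);
 html = start.Replace(html, "");
 // Remove the closing </body> tag and everything after it.
 Regex end = new Regex(@"</body[\s\S]*", RegexOptions.IgnoreCase);
 html = end.Replace(html, "");
 htmlbody.Text = html;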
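
If you need this on more than one page, the same idea can be wrapped in a small helper. The following is only a sketch under stated assumptions, not the code used in this article: the class name HtmlBodyExtractor and the method ExtractBody are hypothetical, and it swaps the two Replace calls for a single pattern that captures the text between the tags. It also assumes that no attribute value inside the <body> tag contains a ">" character, and it returns the input unchanged when no body element is found.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HtmlBodyExtractor
{
    // Hypothetical helper, not part of the original article.
    // <body\b[^>]*>   the opening tag, with or without attributes
    // [\s\S]*?        the body content, as little as possible (lazy)
    // </body\s*>      the closing tag
    private static readonly Regex BodyContent = new Regex(
        @"<body\b[^>]*>(?<content>[\s\S]*?)</body\s*>",
        RegexOptions.IgnoreCase);

    // Returns the markup between <body ...> and </body>,
    // or the input unchanged if no body element is found.
    public static string ExtractBody(string html)
    {
        if (html == null) throw new ArgumentNullException("html");
        Match m = BodyContent.Match(html);
        return m.Success ? m.Groups["content"].Value : html;
    }
}
```

In a code-behind file this might be called as htmlbody.Text = HtmlBodyExtractor.ExtractBody(File.ReadAllText(path)); where path is whatever physical path you resolve for your own html file.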

Date Posted: Saturday, May 5, 2007 8:44 PM
Last Updated: Saturday, May 16, 2009 5:36 PM
Posted by: Mikesdotnetting

3 Comments

Wednesday, November 18, 2009 7:31 AM - pranav

What does the code below mean?

"[\s\S]*<body[^<]*>"

Wednesday, November 18, 2009 7:56 PM - Mike

@pranav

It's part of a regular expression pattern. It attempts to locate the body tag in the html, and allows for cases where there might be inline styling or javascript onload function calls. Or indeed anything else.
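
To make that explanation concrete, here is a rough breakdown of the same pattern, wrapped in a small method so it can be tried on sample strings. The class and method names (BodyPatternDemo, StripThroughBodyTag) are made up for illustration; only the pattern itself comes from the article.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BodyPatternDemo
{
    // The pattern from the article, piece by piece:
    //   [\s\S]*   any characters, including newlines (everything before the tag)
    //   <body     the literal start of the tag (case-insensitive via the option)
    //   [^<]*     any characters except '<' (room for attributes such as onload)
    //   >         the closing angle bracket of the tag
    public static string StripThroughBodyTag(string html)
    {
        Regex start = new Regex(@"[\s\S]*<body[^<]*>", RegexOptions.IgnoreCase);
        return start.Replace(html, "");
    }
}
```

One thing to be aware of: because [^<]* also matches ">", the match runs up to the last ">" that appears before the next "<". Any text that follows the body tag and contains a literal ">" before the first child tag would be removed along with the tag.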

Tuesday, March 23, 2010 10:37 PM - Jason

I have an aspx page inside a master page. I want to add JavaScript, but I know you can only do this in an html page. I just want to know: is there a way to add JavaScript to an aspx page?