Converting URLs Into Links With Regex

Following on from the recent spate of extension methods I've posted, here's another that I use to convert URLs and email addresses into links within HTML. You may want to prevent users from submitting HTML tags via forms in your application, which means that any URLs and email addresses they submit appear as plain text unless they are processed in some way.

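using System.Text.RegularExpressions;

//Extension methods must be declared in a static class; the class name here is only illustrative
public static class StringExtensions
{
  //The patterns are static so that each compiled Regex is built once and reused across calls
  //Finds URLs with no protocol
  private static readonly Regex urlregex = new Regex(@"\b\({0,1}(?<url>(www|ftp)\.[^ ,""\s<)]*)\b",
    RegexOptions.IgnoreCase | RegexOptions.Compiled);
  //Finds URLs with a protocol; the [^>] rejects a URL that directly follows a closing angle bracket
  private static readonly Regex httpurlregex = new Regex(@"\b\({0,1}(?<url>[^>](http://www\.|http://|https://|ftp://)[^,""\s<)]*)\b",
    RegexOptions.IgnoreCase | RegexOptions.Compiled);
  //Finds email addresses
  private static readonly Regex emailregex = new Regex(@"\b(?<mail>[a-zA-Z_0-9.-]+\@[a-zA-Z_0-9.-]+\.\w+)\b",
    RegexOptions.IgnoreCase | RegexOptions.Compiled);

  /// <summary>
  /// Finds web and email addresses in a string and surrounds them with the appropriate HTML anchor tags
  /// </summary>
  /// <param name="s">The text to search for web and email addresses</param>
  /// <returns>The text with each address wrapped in an anchor tag</returns>
  public static string WithActiveLinks(this string s)
  {
    s = urlregex.Replace(s, " <a href=\"http://${url}\" target=\"_blank\">${url}</a>");
    s = httpurlregex.Replace(s, " <a href=\"${url}\" target=\"_blank\">${url}</a>");
    s = emailregex.Replace(s, "<a href=\"mailto:${mail}\">${mail}</a>");
    return s;
  }
}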
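
As a quick illustration of how the method might be called, here is a minimal console sketch. The class names StringExtensions and Program and the sample text are assumptions made for the example; the method body repeats the three patterns from the listing so that the snippet compiles on its own.

```csharp
using System;
using System.Text.RegularExpressions;

// Illustrative container class (assumed name); extension methods need a static class.
public static class StringExtensions
{
    public static string WithActiveLinks(this string s)
    {
        // Same three patterns as in the article's listing
        var urlregex = new Regex(@"\b\({0,1}(?<url>(www|ftp)\.[^ ,""\s<)]*)\b",
            RegexOptions.IgnoreCase);
        var httpurlregex = new Regex(@"\b\({0,1}(?<url>[^>](http://www\.|http://|https://|ftp://)[^,""\s<)]*)\b",
            RegexOptions.IgnoreCase);
        var emailregex = new Regex(@"\b(?<mail>[a-zA-Z_0-9.-]+\@[a-zA-Z_0-9.-]+\.\w+)\b",
            RegexOptions.IgnoreCase);

        s = urlregex.Replace(s, " <a href=\"http://${url}\" target=\"_blank\">${url}</a>");
        s = httpurlregex.Replace(s, " <a href=\"${url}\" target=\"_blank\">${url}</a>");
        s = emailregex.Replace(s, "<a href=\"mailto:${mail}\">${mail}</a>");
        return s;
    }
}

public static class Program
{
    public static void Main()
    {
        // Hypothetical user-submitted text containing a bare URL and an email address
        var comment = "Visit www.example.com or mail joe@example.com";

        // Prints the text with both addresses wrapped in anchor tags
        Console.WriteLine(comment.WithActiveLinks());
    }
}
```

Because both URL replacement strings begin with a space, the output contains an extra space in front of each URL anchor. That is harmless in rendered HTML, where runs of whitespace collapse.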

This will convert most URLs, but not all. Parsing URLs is not the easiest thing to do, so you need to make a judgement on what type of URLs your users/visitors are most likely to provide and alter the regex patterns accordingly. One thing to point out is that the second pattern (the one that matches URLs with a protocol - http, https etc) also checks that the URL isn't already part of a hyperlink. This matters because, by the time the second Replace() operation takes place, URLs without protocols will already have been given one and surrounded with HTML.
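
The email pattern has no equivalent guard, so an address that already sits inside an anchor gets wrapped a second time. One possible way round this, which is not part of the original method, is to put "an existing anchor" first in an alternation and use a MatchEvaluator to leave those matches untouched. The sketch below assumes that approach; the names EmailLinker and WithEmailLinks are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class; not part of the original method.
public static class EmailLinker
{
    // First alternative consumes a complete existing anchor element.
    // Second alternative captures a bare email address in the "mail" group.
    private static readonly Regex AnchorOrEmail = new Regex(
        @"<a\b[^>]*>.*?</a>|\b(?<mail>[a-zA-Z_0-9.-]+@[a-zA-Z_0-9.-]+\.\w+)\b",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    public static string WithEmailLinks(this string s)
    {
        return AnchorOrEmail.Replace(s, m =>
            m.Groups["mail"].Success
                // A bare address was captured: wrap it in a mailto link
                ? "<a href=\"mailto:" + m.Groups["mail"].Value + "\">" + m.Groups["mail"].Value + "</a>"
                // An existing anchor was matched: return it unchanged
                : m.Value);
    }
}

public static class Program
{
    public static void Main()
    {
        var text = "<a href=\"mailto:joe@example.com\">joe@example.com</a> and bob@example.org";

        // Only bob@example.org is wrapped; the existing anchor is left alone
        Console.WriteLine(text.WithEmailLinks());
    }
}
```

Since anchors are swallowed whole, running the method twice over the same text gives the same result as running it once. The trade-off is that an address appearing anywhere inside an existing anchor, including one produced by the URL replacements, is never linked.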

 

Date Posted: Saturday, May 22, 2010 10:56 PM
Posted by: Mikesdotnetting

1 Comment

Saturday, December 4, 2010 8:02 PM - Tvrtko

Hello Mike,
this works great. However, the part that handles email links will duplicate the anchor tag. For http URLs you correctly exclude existing HTML anchors from the match, but email addresses get matched even if they already have an HTML anchor around them.