Dividers to the right, please.

RegExp hell... (Coding Challenge)

Ok, the URL matching and replacing regexps here don't work as well as they should.  I tried fixing them last night but nothing I did seems to work right.  The raw code is:

$value = preg_replace('/(http:\/\/[A-Za-z0-9_-]+(?:[.][A-Za-z0-9_-]+)+\S*)/xims', '<a href="$1">$1</a>', $value);
$value = preg_replace('/(?<!http:\/\/)(www[.][A-Za-z0-9_-]+(?:[.][A-Za-z0-9_-]+)+\S*)/xims', '<a href="http://$1">$1</a>', $value);
$value = preg_replace('/([A-Za-z0-9_.-]+@[A-Za-z0-9_-]+(?:[.][A-Za-z0-9_-]+)+)/xims', '<a href="mailto:$1">$1</a>', $value);

However, we still have the problem of:

aaahttp://www.example.com

Anytime I try and fix that, the regexp stops working entirely on what I want it to do.

The raw Perl-compatible regexps:

/(http:\/\/[A-Za-z0-9_-]+(?:[.][A-Za-z0-9_-]+)+\S*)/
/(?<!http:\/\/)(www[.][A-Za-z0-9_-]+(?:[.][A-Za-z0-9_-]+)+\S*)/
/([A-Za-z0-9_.-]+@[A-Za-z0-9_-]+(?:[.][A-Za-z0-9_-]+)+)/

In addition to just working like JOS, I'd like the URL matching to work a little better: I don't want them to match to a trailing period, comma, or trailing brackets or braces -- I didn't have much luck with that, either.

+100 bonus points for the best link-matching regexp.
Almost H. Anonymous
January 30th, 2006 11:06am
The above example you gave seems fine.
muppet
January 30th, 2006 11:08am
It's not supposed to match.
Almost H. Anonymous
January 30th, 2006 11:08am
why not?

Anyway that shit's too convoluted for me to play with today.  I'm burned out on regexps  :)
muppet
January 30th, 2006 11:09am
No space before the http.
Almost H. Anonymous
January 30th, 2006 11:10am
None of your regexps is LOOKING for a space before the http.

You need some \b's in there.
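
A minimal sketch of the \b idea, ported to Python's re module for easy testing (the thread's code is PHP; the names URL and linkify are made up for illustration). \b only matches where a word character meets a non-word character or the edge of the string, so there is no boundary between the "a" and the "h" in aaahttp:// and the pattern should not fire there:

```python
import re

# First pattern from the original post with a leading \b added.
# Assumption: only the 'i' flag matters here, so the x/m/s flags are dropped.
URL = re.compile(r'\b(http://[A-Za-z0-9_-]+(?:[.][A-Za-z0-9_-]+)+\S*)',
                 re.IGNORECASE)

def linkify(text):
    """Wrap each http:// URL that starts at a word boundary in an <a> tag."""
    return URL.sub(r'<a href="\1">\1</a>', text)

print(linkify("aaahttp://www.example.com"))       # left untouched
print(linkify("see http://www.example.com now"))  # URL gets wrapped
```

The same \b should carry over unchanged to the preg_replace calls, since PCRE treats it the same way.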
muppet
January 30th, 2006 11:11am
Ummm.. tried that, from the OP:

"Anytime I try and fix that, the regexp stops working entirely on what I want it to do."
Almost H. Anonymous
January 30th, 2006 11:13am
That makes no sense.
muppet
January 30th, 2006 11:13am
No kidding!!  Very frustrating.
Almost H. Anonymous
January 30th, 2006 11:14am
...or I simply was too tired last night -- I think I was using the wrong backslashed character (maybe \w).  Ok, one problem solved.  Now how would I stop the last period, comma, bracket, or brace from matching?
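
One regex-only way to approach the trailing-punctuation question, sketched in Python rather than PHP so it can be run as-is: put a negative lookbehind after the greedy \S* so the match has to back off until it no longer ends in one of the unwanted characters. The character list .,)]} and the names URL_NO_TAIL and linkify_no_tail are assumptions for illustration, not anything from the thread:

```python
import re

# \S* is greedy, so it first swallows everything up to the next whitespace.
# The lookbehind (?<![.,)\]}]) then forces it to give characters back until
# the match no longer ends in a period, comma, or closing bracket/brace.
URL_NO_TAIL = re.compile(
    r'\b(http://[A-Za-z0-9_-]+(?:[.][A-Za-z0-9_-]+)+\S*(?<![.,)\]}]))',
    re.IGNORECASE)

def linkify_no_tail(text):
    """Link http:// URLs, leaving trailing punctuation outside the <a> tag."""
    return URL_NO_TAIL.sub(r'<a href="\1">\1</a>', text)

print(linkify_no_tail("Go to http://www.example.com/page)."))
```

The trade-off is that a URL which legitimately ends in one of those characters (a closing parenthesis, say) gets cut short.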
Almost H. Anonymous
January 30th, 2006 11:20am
Why not let the regex match the URL along with any trailing comma, bracket, or brace, and then check whether the last character(s) of the match are on your blacklist and remove them?
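
A rough sketch of that match-then-trim idea, in Python for testability (in PHP the equivalent would presumably go through preg_replace_callback). The blacklist contents and the names TRAILING, _link, and linkify_trimmed are hypothetical:

```python
import re

# Greedy pattern: deliberately allowed to swallow trailing punctuation.
URL_GREEDY = re.compile(r'\bhttp://[A-Za-z0-9_-]+(?:[.][A-Za-z0-9_-]+)+\S*',
                        re.IGNORECASE)

# Hypothetical blacklist of characters that should not end a link.
TRAILING = '.,)]}'

def _link(match):
    raw = match.group(0)
    url = raw.rstrip(TRAILING)   # drop blacklisted characters from the end
    tail = raw[len(url):]        # whatever was dropped goes after the link
    return '<a href="{0}">{0}</a>{1}'.format(url, tail)

def linkify_trimmed(text):
    """Match URLs greedily, then trim blacklisted trailing characters."""
    return URL_GREEDY.sub(_link, text)

print(linkify_trimmed("(see http://www.example.com/a,b)."))
```

Doing the trimming in code keeps the pattern itself simple, and the blacklist can be changed without touching the regex.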
JD
January 30th, 2006 11:23am
I know jack about regular expressions, but some searching around turned up this:

http://gnosis.cx/publish/programming/regular_expressions.html

The very end of the page may be useful to those that understand what he's explaining.
Jacob
January 30th, 2006 11:26am
AHA,

Could you please respond to my email with the request for SVN access?

Thanks,
JD
January 30th, 2006 11:31am
Have you compared yours with those on offer at www.regexlib.com to see what is different?
Not Waving But Drowning
January 30th, 2006 11:57am
I don't know whether I'm mad or not, but I think I have a wonderful idea: check for a 2xx return status for the URL. That will avoid all that reg what.
Vineet Reynolds
January 30th, 2006 12:30pm
In order to test the return from the URL you have to FIND the URL.

Moron.
muppet
January 30th, 2006 12:31pm
And why do you need to find the URL ?
Vineet Reynolds
January 30th, 2006 12:38pm
To test it.
Moron
Masiosare
January 30th, 2006 12:39pm
How are you going to programmatically test the URL if you don't know what the URL is?  Explain it to us, magic man.
muppet
January 30th, 2006 12:40pm
http://www.truerwords.net/articles/ut/urlactivation.html
CynicallyYoursBeyotch
January 30th, 2006 12:45pm
Testing a sliding window of text to see if it's maybe a URL by checking the error return from HTTP is akin to expecting Hamlet from a barrel load of monkeys without giving them a typewriter ribbon.
Simon Lucy
January 30th, 2006 1:27pm
Psshhh... if it's reasonable to expect the literature it's reasonable to expect the engineering of an ink ribbon.

Silly man.

:)
I am Jack's expletive exception
January 30th, 2006 1:53pm

This topic is archived. No further replies will be accepted.
