Alright nerds, who can guess what this RegEx matches?

Pastel@sopuli.xyz · edit-2 3 months ago

xombie21@lemmy.dbzer0.com · 3 months ago

That’s John Gruber’s regex pattern for matching URLs (⌐■_■).

erayerdin@programming.dev · 2 months ago

truly a sunglasses moment indeed

marlowe221@lemmy.world · 3 months ago

This is an example of the old adage that “When you use a regex to solve a problem, you end up with two problems.”

neidu3@sh.itjust.works · edit-2 2 months ago

Looks like a URL matcher of some sort, not limited to HTTP. Kudos for handling parentheses as valid URL characters.

refalo@programming.dev · 3 months ago

URLs can have newlines too

hoshikarakitaridia@lemmy.world · edit-2 2 months ago

/unlearn

Venator@lemmy.nz · 2 months ago

It seems most browsers basically ignore them:

https://lemire.me/blog/2026/02/28/you-can-use-newline-characters-in-urls/

So probably not worth remembering anyway.
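For anyone curious what "ignore" means here: as far as I know, the WHATWG URL Standard that browsers follow drops ASCII tabs and newlines from the input before parsing it. A minimal Python sketch of just that step (the helper name `strip_tab_newline` and the example URL are made up for illustration, this is not a full URL parser):

```python
# Sketch of the "remove ASCII tab or newline" preprocessing step.
# `strip_tab_newline` is a hypothetical helper, not a library function.
_TAB_NEWLINE = {ord("\t"): None, ord("\n"): None, ord("\r"): None}


def strip_tab_newline(url: str) -> str:
    """Return `url` with every tab, line feed and carriage return removed."""
    return url.translate(_TAB_NEWLINE)


print(strip_tab_newline("https://exam\nple.com/pa\tth"))
```

Note that the regex in this thread excludes whitespace with `\s`, so it would cut such a URL off at the newline rather than glue it back together.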

Sphks@jlai.lu · 2 months ago

What. The. Fuck.

Quantenteilchen@discuss.tchncs.de · 2 months ago

Also no encoded basic auth or raw IP addresses (not that a useful website would likely use raw IPv4 or IPv6, since that causes huge CORS and sometimes even DNS issues…)

bleistift2@sopuli.xyz · edit-2 3 months ago

As visualized by Regex Vis [1]

As visualized by Regexper [2]

The regex fucks with the markdown, so I had to put them in code tags:

[1] https://regex-vis.com/?r=%5Cb%28%28%3F%3A%28%3F%3A%5Ba-z%5D%5B%5Cw-%5D%2B%3A%29%3F%28%3F%3A%2F%7B1%2C3%7D%7C%5Ba-z0-9%25%5D%29%7Cwww%5Cd%7B0%2C3%7D%5B.%5D%7C%5Ba-z0-9.%5C-%5D%2B%5B.%5D%5Ba-z%5D%7B2%2C4%7D%2F%29%28%3F%3A%5B%5E%5Cs%28%29%3C%3E%5D%2B%7C%5C%28%28%5B%5E%5Cs%28%29%3C%3E%5D%2B%7C%28%5C%28%5B%5E%5Cs%28%29%3C%3E%5D%2B%5C%29%29%29*%5C%29%29%2B%28%3F%3A%5C%28%28%5B%5E%5Cs%28%29%3C%3E%5D%2B%7C%28%5C%28%5B%5E%5Cs%28%29%3C%3E%5D%2B%5C%29%29%29*%5C%29%7C%5B%5E%5Cs%60%21%28%29%5C%5B%5C%5D%7B%7D%3B%3A%27%22.%2C%3C%3E%3F%C2%AB%C2%BB%E2%80%9C%E2%80%9D%E2%80%98%E2%80%99%5D%29%29

[2] https://regexper.com/#\b((?:(?:[a-z][\w-]%20:)?(?:\/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()%3C%3E]+|\(([^\s()%3C%3E]+|(\([^\s()%3C%3E]+\)))*\))+(?:\(([^\s()%3C%3E]+|(\([^\s()%3C%3E]+\)))*\)|[^\s%60!()\[\]{};:'%22.,%3C%3E?%C2%AB%C2%BB%E2%80%9C%E2%80%9D%E2%80%98%E2%80%99]))
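If you want to poke at the pattern yourself, it can be recovered from link [1] by percent-decoding the `r=` parameter. A rough Python sketch follows. Assumptions on my part: the encoded string is copied from link [1] and split into chunks only for readability, the `re.IGNORECASE` flag is my guess at the intended mode, and `find_candidates` is a made-up helper name.

```python
import re
from urllib.parse import unquote

# Value of the `r=` parameter from link [1], split into chunks for readability.
ENCODED = (
    "%5Cb%28%28%3F%3A%28%3F%3A%5Ba-z%5D%5B%5Cw-%5D%2B%3A%29%3F"
    "%28%3F%3A%2F%7B1%2C3%7D%7C%5Ba-z0-9%25%5D%29"
    "%7Cwww%5Cd%7B0%2C3%7D%5B.%5D"
    "%7C%5Ba-z0-9.%5C-%5D%2B%5B.%5D%5Ba-z%5D%7B2%2C4%7D%2F%29"
    "%28%3F%3A%5B%5E%5Cs%28%29%3C%3E%5D%2B"
    "%7C%5C%28%28%5B%5E%5Cs%28%29%3C%3E%5D%2B%7C%28%5C%28%5B%5E%5Cs%28%29%3C%3E%5D%2B%5C%29%29%29*%5C%29%29%2B"
    "%28%3F%3A%5C%28%28%5B%5E%5Cs%28%29%3C%3E%5D%2B%7C%28%5C%28%5B%5E%5Cs%28%29%3C%3E%5D%2B%5C%29%29%29*%5C%29"
    "%7C%5B%5E%5Cs%60%21%28%29%5C%5B%5C%5D%7B%7D%3B%3A%27%22.%2C%3C%3E%3F"
    "%C2%AB%C2%BB%E2%80%9C%E2%80%9D%E2%80%98%E2%80%99%5D%29%29"
)

# Percent-decode to get the regex text, then compile it.
# Case-insensitive matching is an assumption, not something the link states.
PATTERN = re.compile(unquote(ENCODED), re.IGNORECASE)


def find_candidates(text):
    """Return every substring of `text` that the decoded pattern matches."""
    return [m.group(0) for m in PATTERN.finditer(text)]


print(find_candidates("see http://example.com/foo_(bar)."))
```

On that sample sentence the URL comes back with its balanced parentheses and without the trailing period, but the plain word "see" is reported as a match too, because the scheme part is optional in this version of the pattern.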

Aatube@lemmy.dbzer0.com · 3 months ago

Check out Regulex! It doesn’t support mode modifiers and it does lack some features, but I really like how its graphs look.

ulterno@programming.dev · 2 months ago

Nice. Is there any terminal or native software that does something similar?
Other than just running the HTML+JS/TS project in a container, that is.

pewpew@feddit.it · 3 months ago

Onno (VK6FLAB)@lemmy.radio · 3 months ago

At first glance, an IP address or URL embedded in HTML. Whatever it is, it’s a doozy. I wonder what the performance of it is like.

towerful@programming.dev · 3 months ago

It works out as O(regex^n)

DreamButt@lemmy.world · 3 months ago

At least 2

bleistift2@sopuli.xyz · 3 months ago

Whatever this is supposed to match, I bet the bycatch is bigger than tuna fishing.

tamiya_tt02@lemmy.world · 2 months ago

Looks like the hacking mini game in Fallout 4.

DarkSirrush@piefed.ca · 3 months ago

URLs in an HTML document that aren’t namespaces or otherwise enclosed?

sudoMakeUser@sh.itjust.works · 3 months ago

Hold on, let me draw up the NFA

Olgratin_Magmatoe@slrpnk.net · 3 months ago

Kamikaze Rusher@lemmy.world · 3 months ago

Probably documents from HP’s atrocious support site

NotMyOldRedditName@lemmy.world · 3 months ago

Is it a rick roll?

irelephant [he/him]@lemmy.dbzer0.com · 14 days ago

www
urls