Hacker News

As someone who has written a lot of regexen, I’m surprised it ever worked.


I used to scrape craigslist for local offers. Many times I needed to be the first to call, so I wanted my script to get the numbers.

In about five minutes I came up with an algorithm that defeated most of the obfuscation tricks of the time:

Strip whitespace, convert "five" to 5, remove special characters, and look for ten consecutive digits in the body text. Maybe a couple of small tweaks after that like removing text between digits.
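For anyone curious, the steps above can be sketched in a few lines of Python. This is only a rough reconstruction under assumptions, not the original script: the function name `extract_phone`, the `WORD_DIGITS` table, and the choice to lowercase first are all made up for illustration, and the "small tweaks" are left out.

```python
import re

# Assumed mapping of spelled-out digits; the original script's table is unknown.
WORD_DIGITS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}
WORD_RE = re.compile("|".join(WORD_DIGITS))


def extract_phone(text):
    """Return the first run of ten consecutive digits after normalizing, or None."""
    s = text.lower()
    s = re.sub(r"\s+", "", s)                          # strip whitespace
    s = WORD_RE.sub(lambda m: WORD_DIGITS[m.group()], s)  # "five" -> "5"
    s = re.sub(r"[^a-z0-9]", "", s)                    # remove special characters
    m = re.search(r"\d{10}", s)                        # ten consecutive digits
    return m.group() if m else None


print(extract_phone("call 5 5 5 - one two three - 4 5 6 7"))
```

Note that the word replacement also fires inside ordinary words ("phone" becomes "ph1"), which is harmless here because a stray digit rarely completes a run of ten.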

Most people, when they invent their little unique scheme, invent one that is already defeated by the algorithm above. At least for phone numbers.

Anyway, all that to say you're right, it never worked.


Spammers aren't that smart; they will go wherever the bar is lowest, if not missing entirely. Just changing your SSH server to listen on a port other than 22 is going to stop most, if not all, random login attempts.


I wouldn’t characterize that example as “not smart,” to be honest. I’m not particularly smart, but if someone has gone to the trouble of moving their SSH port off 22, I would realize that finding the actual port takes more work and time, and it would be a safe assumption that, once found, the server has significantly more security hardening than the average one on port 22, because the admin clearly knows enough to change ports. So I might even characterize it as smart not to go after SSH ports other than 22.


The reason is different. Scanning 5 million IPs on port 22 is one thing, but scanning 5 million IPs times 100 or 1000 ports means you simply run out of resources, and since 99% of people use 22, why bother?


> As someone who has written a lot of regexen

There are at least \d of us!


More than that, there are \d+ of us! Or, since I’ve been writing Lua recently, %d+ of us.


Heh, I almost edited it to \d+, but I also considered [0-9]{2,}, and then I thought more on it and realized (?<!^)\d(?!$) is pretty open-ended.
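For readers who don't parse lookarounds on sight, here is a small Python sketch of what those three patterns accept. The helper name `interior_digits` and the sample strings are made up for illustration. The last pattern matches a single digit that is neither the first nor the last character of the string.

```python
import re

# \d+ : one or more digits
ONE_OR_MORE = re.compile(r"\d+")
# [0-9]{2,} : at least two ASCII digits
TWO_OR_MORE = re.compile(r"[0-9]{2,}")
# (?<!^)\d(?!$) : a digit not at the very start and not at the very end
INTERIOR = re.compile(r"(?<!^)\d(?!$)")


def interior_digits(s):
    """Return every digit in s that has at least one character on each side."""
    return INTERIOR.findall(s)


print(bool(ONE_OR_MORE.fullmatch("7")))   # a single digit is enough for \d+
print(bool(TWO_OR_MORE.fullmatch("7")))   # but not for [0-9]{2,}
print(interior_digits("12345"))           # the first and last digits are excluded
```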


You need to upgrade your knowledge. Use only [2-9]. That's it.


There are at least 1 of us!

Looks weird :)



