You're probably using an engine with extensions rather than a "strict" regular expression engine. Regular expressions in the formal sense describe regular languages, which are Type-3 (the most restricted level) in the Chomsky hierarchy https://en.wikipedia.org/wiki/Chomsky_hierarchy and they provably lack the expressive power to describe HTML, because arbitrarily deep tag nesting needs at least a context-free grammar. This has already been posted in this thread, but make sure to read Larry Wall's quote in the second answer: http://stackoverflow.com/questions/6751105/why-its-not-possi...
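To make the nesting problem concrete, here is a minimal sketch using Python's `re` module. The sample strings and variable names are made up for illustration. Neither the lazy nor the greedy pattern can pair an opening tag with its own closing tag once tags nest or repeat.

```python
import re

# Hypothetical sample markup, not taken from any real page.
nested = "<div>outer <div>inner</div> tail</div>"
siblings = "<div>a</div><div>b</div>"

# Lazy match: stops at the FIRST closing tag, so the outer div is cut short.
lazy = re.search(r"<div>(.*?)</div>", nested).group(1)

# Greedy match: runs to the LAST closing tag, so two siblings are merged.
greedy = re.search(r"<div>(.*)</div>", siblings).group(1)

print(repr(lazy))    # 'outer <div>inner'
print(repr(greedy))  # 'a</div><div>b'
```

Extensions such as recursive patterns can work around this in some engines, but at that point the engine is no longer matching a regular language.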
Possibly there is some confusion about the word "parsing". If it is just a question of scanning a webpage for some substring or pattern, sure, you can do that with a regex. But that is not what is usually considered parsing, which means recovering the structure of the document.
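A rough sketch of that distinction, assuming Python and only its standard library (`re` and `html.parser`). The sample markup and the `DepthTracker` class are hypothetical names made up for this example. The regex only scans for things that look like `href` values; the parser also knows where each link sits in the tag tree.

```python
import re
from html.parser import HTMLParser

# Hypothetical sample markup for illustration.
page = ('<ul><li><a href="/one">one</a></li>'
        '<li><span><a href="/two">two</a></span></li></ul>')

# Scanning: find substrings that look like href values. No structure involved.
hrefs = re.findall(r'href="([^"]*)"', page)


class DepthTracker(HTMLParser):
    """Records each link together with its nesting depth in the tag tree."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.link_depths = []

    def handle_starttag(self, tag, attrs):
        self.depth += 1
        if tag == "a":
            self.link_depths.append((dict(attrs).get("href"), self.depth))

    def handle_endtag(self, tag):
        self.depth -= 1


# Parsing: walk the tags and keep track of how deeply each one is nested.
tracker = DepthTracker()
tracker.feed(page)

print(hrefs)                # ['/one', '/two']
print(tracker.link_depths)  # [('/one', 3), ('/two', 4)]
```

Note that this simple depth counter assumes well-formed markup with matching close tags; it is meant to show the difference in kind, not to handle real-world pages.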
Right. And that confusion is probably fueled by the existence of lenient parsers that swallow invalid HTML, mainly used in scraping, like Beautiful Soup and Nokogiri.