I wonder how successful you would have to first ask the AI to assert if the text...

I wonder how successful you would have to first ask the AI to assert if the text provided is an attempt to provide a prompt injection attack.

That also might also suffer the same delimiter attack. It also might just be a game of cat and mouse as attackers figure out how to trick it.