
I've always seen "sanitization" as more of an output-encoding problem.

People love to think about sanitizing inputs, but how you do so depends not on the inputs themselves but on where they are used - more or less the output of your program.

Rather than trying to think of all the ways the inputs to your program could be abused to cause harm, I find it safer to start where the output occurs - database calls, system calls, etc. The most commonly used of these tend to have encoding capabilities to ensure that when you want to stick a string in a particular place it does exactly that, regardless of whether the string came from user input or elsewhere. For example, bind parameters for databases, or proper escaping functions for shell commands.
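A minimal sketch of bind parameters using Python's built-in sqlite3 module (the table and column names here are hypothetical). The driver sends the value separately from the SQL text, so a hostile string is stored as plain data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")

user_input = "Robert'); DROP TABLE users;--"

# The ? placeholder binds the value out-of-band; the string is never
# interpreted as SQL, no matter what characters it contains.
conn.execute("INSERT INTO users (name) VALUES (?)", (user_input,))

row = conn.execute("SELECT name FROM users").fetchone()
print(row[0])  # the hostile string comes back verbatim, as data
```

The same idea applies to shell commands (argument arrays instead of string concatenation) and any other sink with a structured API.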

If you think about it as sanitizing input, you tend to misplace your attention and consider only the entry points to your application. A single input is often used to do multiple things throughout a program, so you cannot properly handle sanitization at input time.

The real push should be for proper output encoding, not input sanitization.



The purpose of sanitizing input is not to prevent security vulnerabilities; it is to make sure the values taken by your program are valid. If you accept a number range and the user inputs a word, that's invalid input for your parameter and your program will crash. Input sanitizing validates that the input is correct for your use. It indirectly improves security, but it is not itself a practice for making an app more secure.


The term "sanitizing" isn't usually used for this; as others have commented, what you are describing is "validating" the user input. That should, of course, happen. Many validations will result in only accepting input that happens to be safe for many uses - e.g., if it's a valid number between 1 and 100 you could of course send it to an integer field in a database without any special encoding - but I wouldn't rely on my input validation in my model layer doing this.

Encoding a "safe" value doesn't make things any less safe. Failure to encode it, however, leaves potential holes in your application. Something may bypass input validation and be given to the database as an unsafe, unvalidated value. Usage of the value may change (new functionality using it differently, changed storage in database, etc) and in the new usage the value may not be safe.

Input validation is obviously something you want to do, but it should never be relied upon for protecting from injection attacks.
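To illustrate the point that encoding a "safe" value costs nothing while failing to encode an unsafe one leaves a hole, here is a sketch using Python's standard html module for an HTML output context:

```python
import html

# A value that passed validation: encoding it is effectively a no-op.
safe = "42"

# A value that slipped past validation (or whose usage changed later):
# encoding at the point of output still neutralizes it.
unsafe = "<script>alert(1)</script>"

print(html.escape(safe))    # unchanged: 42
print(html.escape(unsafe))  # &lt;script&gt;alert(1)&lt;/script&gt;
```

Encoding unconditionally at the output means safety no longer depends on every upstream code path having validated correctly.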


You actually said it yourself, which is funny: the right word for this is "validating".

Here's the chain:

1. Get raw input.

2. Validate it (number, not number, in range, not in range?)

3. Optionally format it to a canonical form (e.g. trim whitespace).

... later....

4. Encode it for where you want to use it (SQL, HTML etc.).

Sometimes steps 2 and 3 are done in the opposite order, or as an atomic single operation, but point is, we have perfectly reasonable words for all that: validating, formatting, encoding.
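The chain above can be sketched as a small Python function; the range check and the HTML output context are hypothetical examples, and canonicalization (trimming) is done before validation here, one of the two orderings mentioned:

```python
import html

def handle(raw: str) -> str:
    # 1. Raw input arrives as a string.
    # 3. Canonicalize first: trim surrounding whitespace.
    canonical = raw.strip()
    # 2. Validate: must be an integer in the range 1-100.
    if not canonical.isdigit() or not 1 <= int(canonical) <= 100:
        raise ValueError("expected a number between 1 and 100")
    # ... later ...
    # 4. Encode for the context where it will be used - here, HTML.
    return html.escape(canonical)

print(handle(" 42 "))  # 42
```

Each step has its own name - validating, formatting, encoding - and each lives at a different point in the program's flow.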


This is why I just call it encoding and decoding. Proper words, and assume context (encoding for what... decoding from what).



