Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I don't database, but I like to think I have some kind of intuition for storage space requirements, and this article was very confusing.

Ignoring the indexes and just focusing on the main table sizes reported, we have:

- String ("The frequent repetition of these names inflates the size of the table"): 392 MB

- Enum data type ("Internally, an enum type is stored as four-byte floating point number. So it saves space in the table [...]"): 338 MB

- Lookup table ("Also, since a smallint only occupies two bytes, the person_l table can potentially use less storage space than the other solutions"): 338 MB.

I just can't make sense of the numbers, especially given the authors comments that I've quoted.

Is this some kind of typo/editing fail?



I'm also wondering about that. But maybe this could be it?

> Surprisingly, the table is just as big as with the enum type above, even though an enum uses four bytes. The reason is that each table row is aligned at a memory address divisible by eight, so PostgreSQL will add six padding bytes after the smallint. If we had more columns and could arrange them carefully, we could see a difference.

This could be the explanation. If the row is padded to 8, bigint is 8, then smallint or enum also use 8. The entries in the string table will be 8 or 16 due to the string length. So one row in person_e and person_l is 16, one row in person_s could be about 20 on average, that is a bit closer to the reality than my intuition, although the storage savings are still less than what I would have expected.

edit:

I did also try out the test and dropped the primary key on the table to compare only enum and string size:

  SELECT PG_SIZE_PRETTY(PG_RELATION_SIZE('person_e')), PG_SIZE_PRETTY(PG_RELATION_SIZE('person_s'))

  277 MB,330 MB
Does not look like an amazing saving either.


> Enum type 4-byte floating point number

This is why the storage is weird. Why would you use a float for distinct number storage!




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: