
Any time CDs/CD-ROMs (as opposed to DVDs) are discussed, I get really nervous unless someone explicitly confirms that the tool being used handles XA/2352-bytes-per-sector data correctly. Supposedly cdrdao will.[1]

DVDs are for the most part pretty easy to duplicate identically. CDs have a lot of quirks, and it's very easy to end up with a useless "backup". Especially if the file format is ISO instead of bin/cue or similar. If one is trying to duplicate CDs/CD-ROMs, it's really vital to cross all your Ts and dot all your Is.

[1] https://consolecopyworld.com/psx/psx_copy_patch_linux.shtml



Is there a reason that the copying process needs to be aware of how bytes are organized? Why can’t you just read the bytes from disk A and write those exact bytes to disk B?


Because depending on what level the software is operating at, it may not even be aware of the correct data, and the file format used to store the image might not be capable of representing it accurately.

I'm simplifying here somewhat for space, but the CD-ROM specifications allow for at least two ways of storing the data on the disc.

A 100% vanilla data CD-ROM with no copy protection uses 2048 bytes per sector for the data that's visible to you as a user. The remaining 304 bytes of each 2352-byte sector hold a small sync/header area (16 bytes) plus error detection and correction data (288 bytes) that helps recover the user-level data if any of it is unreadable (kind of like a RAID5 setup).[1]

Mode 2/XA discs (or mode 2/XA sections on mixed-mode discs) can give up most of that error-correction data and use the space for user-level data instead, up to 2324 bytes per sector in XA Form 2. i.e. they are trading reliability for more space. PlayStation games are the most common example. If you've ever tried to copy XA audio or STR video files off of a PlayStation disc in Windows and wondered why you got an error, or why the files copied but were corrupted, that's why. Your PC was only copying 2048 out of every 2352 bytes in each sector.
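To make the sector arithmetic concrete, here is a rough Python sketch of the raw sector layouts as I understand them from the CD-ROM and CD-ROM XA specs. The field names and the helper functions (user_bytes, overhead_bytes) are my own labels for illustration, not anything taken from a real tool.

```python
# Field sizes in bytes for one raw 2352-byte CD-ROM sector, in on-disc order.
# Field names are informal labels chosen for this sketch.
RAW_SECTOR_SIZE = 2352

SECTOR_LAYOUTS = {
    "Mode 1": [
        ("sync", 12), ("header", 4), ("user data", 2048),
        ("EDC", 4), ("reserved", 8), ("ECC", 276),
    ],
    "Mode 2 XA Form 1": [
        ("sync", 12), ("header", 4), ("subheader", 8),
        ("user data", 2048), ("EDC", 4), ("ECC", 276),
    ],
    "Mode 2 XA Form 2": [
        ("sync", 12), ("header", 4), ("subheader", 8),
        ("user data", 2324), ("EDC", 4),
    ],
}


def user_bytes(layout_name):
    """Bytes per sector that hold user-level data in the given layout."""
    return dict(SECTOR_LAYOUTS[layout_name])["user data"]


def overhead_bytes(layout_name):
    """Bytes per sector that are NOT user-level data in the given layout."""
    return RAW_SECTOR_SIZE - user_bytes(layout_name)


for name, fields in SECTOR_LAYOUTS.items():
    # Every layout has to add up to exactly one raw sector.
    assert sum(size for _, size in fields) == RAW_SECTOR_SIZE
    print(f"{name}: {user_bytes(name)} user bytes, "
          f"{overhead_bytes(name)} other bytes")
```

The point of the table is that a tool which assumes "2048 bytes of payload per sector" is right for Mode 1 and XA Form 1, and silently wrong for XA Form 2.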

The ISO file format for discs can ONLY support 2048 bytes per sector. This is why groups like Redump use bin/cue for anything that comes off of a CD. If you convert a bin/cue to ISO, you are throwing away about 13% of the data in every sector (304 of 2352 bytes).
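Here is a minimal sketch of what a bin-to-ISO conversion conceptually does, assuming every sector is Mode 1 (user data starting 16 bytes into the raw sector). The function names raw_to_cooked and fraction_discarded are hypothetical, and real converters handle more cases than this.

```python
RAW_SECTOR_SIZE = 2352
COOKED_SECTOR_SIZE = 2048
# Assumption: Mode 1 sectors, where 12 sync bytes + 4 header bytes
# come before the user data.
MODE1_DATA_OFFSET = 16


def raw_to_cooked(raw_image: bytes, data_offset: int = MODE1_DATA_OFFSET) -> bytes:
    """Keep only 2048 bytes from each 2352-byte raw sector."""
    if len(raw_image) % RAW_SECTOR_SIZE != 0:
        raise ValueError("image length is not a multiple of 2352 bytes")
    out = bytearray()
    for start in range(0, len(raw_image), RAW_SECTOR_SIZE):
        begin = start + data_offset
        out += raw_image[begin:begin + COOKED_SECTOR_SIZE]
    return bytes(out)


def fraction_discarded() -> float:
    """Share of each raw sector that never makes it into the ISO."""
    return (RAW_SECTOR_SIZE - COOKED_SECTOR_SIZE) / RAW_SECTOR_SIZE


# Three synthetic all-zero sectors stand in for a real .bin file.
fake_bin = bytes(RAW_SECTOR_SIZE * 3)
fake_iso = raw_to_cooked(fake_bin)
print(len(fake_bin), len(fake_iso), f"{fraction_discarded():.1%}")
```

For a Mode 1 sector the discarded bytes are sync, header, and error-correction data that can be regenerated. For a sector that carries more than 2048 bytes of payload, the same slice cuts off real user data, and nothing downstream can put it back.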

If you want to learn more about this, read the specs for different types of CDs and CD-ROMs. It's a big mess, and I think everyone is happier that the industry standardized down to fewer options in the DVD era.

[1] This is in addition to the physical-level redundant data encoded in the pits on the disc itself, but AFAIK almost nothing can read discs at that level.


Which bytes? A raw 2352-byte-per-sector copy, especially one that also captures the P and Q subchannels (timing), can behave differently than a 2048-byte copy depending on the software.
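If you are not sure which kind of copy you have, a rough heuristic is to look for the 12-byte sync pattern that starts every raw data sector. The sketch below is only that, a heuristic: guess_sector_size is a hypothetical helper, the 2448-byte case assumes 96 bytes of subchannel data appended after each 2352-byte sector, and audio tracks (which have no sync pattern) will not be recognized.

```python
# 12-byte sync field that begins every raw CD-ROM data sector.
SYNC_PATTERN = b"\x00" + b"\xff" * 10 + b"\x00"


def guess_sector_size(image: bytes, sectors_to_check: int = 4):
    """Guess bytes per sector: 2352 (raw), 2448 (raw + subchannel),
    2048 (user data only), or None if nothing fits."""
    for size in (2352, 2448):
        if len(image) < size or len(image) % size != 0:
            continue
        count = min(sectors_to_check, len(image) // size)
        # A raw image should show the sync pattern at every sector boundary.
        if all(image[i * size:i * size + 12] == SYNC_PATTERN
               for i in range(count)):
            return size
    # No sync pattern found: fall back to a plain length check.
    if len(image) > 0 and len(image) % 2048 == 0:
        return 2048
    return None


# Synthetic images: sync pattern followed by zero padding.
raw = (SYNC_PATTERN + bytes(2340)) * 3
raw_with_sub = (SYNC_PATTERN + bytes(2340) + bytes(96)) * 3
cooked = bytes(2048 * 3)
print(guess_sector_size(raw), guess_sector_size(raw_with_sub),
      guess_sector_size(cooked))
```

A 2048 result only tells you the file length is consistent with a user-data-only image; it cannot tell you whether anything was lost getting there.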


Ok but why should we care about that in copying?

This is my mental model for what's going on:

There's a bunch of bytes stored on a CD-ROM in a defined order. Zeroes and ones. Copy them in order. You should now have a file on magnetic disk or flash that is precisely those bytes in precisely that order. Anything that can make sense of one should be able to make sense of the other.

What am I missing here?


There is more than one set of bytes.

You may not care about copying the inode structure when you are copying your files from point a to point b, but you should when you are cloning the disk.

Many times the software on the CD is looking at the physical layout of the disc, not just the logical data, to function correctly.


So filesystem metadata is being copied that references the physical CD, but it's no longer associated with that CD once copied to magnetic storage? Is that what's going on here?


No. Please see my other post in this thread. I'm struggling to think of a non-technical analogy, but basically what's going on here is that most data-copying tools, when used in the most common ways, are looking at CD-ROMs like they would look at the logical disk presented by a RAID5 array.

That works fine most of the time. But imagine if the computing industry had come up with a "RAID5 Mode 2" that was actually RAID0, where the parity disk was just used to store more user data instead of parity data, but most of the copying tools didn't know the difference between "RAID5" and "RAID5 mode 2", and so they just copied 2/3 of the data, on the assumption that the parity data would be recreated on the receiving end. 1/3 of the user data just went down the drain. That's basically what's happening when you try to store anything other than a 100% vanilla data CD-ROM as an ISO file instead of bin/cue.



