Possible to find what HD space gets overwritten last?

Coldblackice
Is it possible to put files into sectors of a HD that will be (over)written to very last?

For example, if I have an HD with a hundred 1MB slots, I'm wondering if it's possible to know beforehand which of those hundred slots will be the very last ones to be written to (when the HD is being filled up with data). Ideally, I'd like to store "overwriteable" files in these tail-end sectors that won't be overwritten until all other HD sectors are full/written.

So in terms of the example, if I knew beforehand that sectors #97, 98, and 99 would be the very last ones to get written to, I'd like to store a file in each of those, but while still having those sectors marked as "writeable" space, so the OS sees them as "free" space.

Is this possible? I've wondered if this might be somehow doable via partitioning, but I'm unsure whether it's the OS or the HD firmware that determines which available sectors get written to next.
 
I don't think it's possible, as the disk location for storing a file is not known by the controller before the data is written. Controllers can write data anywhere on the partition, so you'd have no idea whether the files are of appropriate size or when to write what data.
 
For a hard disk the OS determines where the files get placed in the filesystem.
 
Is it possible to put files into sectors of a HD that will be (over)written to very last? ...

What you are asking is not possible. The OS sends info to the controller, and neither of them can be manually controlled like that.
 
...Controllers can write data anywhere on the partition, so you'd have no idea whether the files are of appropriate size or when to write what data.

For a hard disk the OS determines where the files get placed in the filesystem.

I'm confused -- is it the OS that determines where on disk something will go, or the controller?

What you are asking is not possible. The OS sends info to the controller, and neither of them can be manually controlled like that.

What about defragging then? Isn't the OS rearranging data on the disk, moving it all toward one end?


And perhaps a better question -- when you're about to initiate a file copy to another drive, who determines the precise sectors where that file will begin to be laid down (OS or controller), and how are those sectors chosen (i.e. how/why did the OS/controller choose sector #2047 as where to start laying this new file down)?
 
What about defragging then? Isn't the OS rearranging data on the disk, moving it all toward one end?

Defraggers rearrange by logical sectors. Logical sectors on an HDD may or may not have anything to do with physical sectors, especially if some of the sectors have been reallocated. Also, those pretty pictures of a defragged disk's sectors with all the data on one end don't tell you about the multiple platters (up to 6 these days) in a modern HDD.

Allowing app writers and users access to raw physical attributes in computers always leads to very bad assumptions and behaviors. For example, the need for the abstraction between the OS and the controller over the physical location of data on a drive becomes most apparent with SSDs, which, due to wear-leveling algorithms, have data physically moving around the drive. So a lot of older HDD utilities (like DBAN) do not work properly on SSDs. I think SSDs might now be past the point where the wrong HDD utility could actually damage or brick them, but I'm not totally certain about that.
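
To make that concrete, here's a toy sketch (purely illustrative, not any real drive's firmware) of why a flash translation layer makes physical location meaningless to the OS: every overwrite of a logical block lands on a fresh physical page.

```python
# Toy flash-translation-layer (FTL) sketch. Real FTLs are far more complex;
# this only illustrates the idea that the logical->physical mapping moves.
class ToyFTL:
    def __init__(self, pages):
        self.mapping = {}               # logical block -> physical page
        self.free = list(range(pages))  # erased pages ready to program
        self.data = {}                  # physical page -> contents

    def write(self, lblock, value):
        page = self.free.pop(0)         # wear leveling: always take a fresh page
        old = self.mapping.get(lblock)
        if old is not None:
            self.free.append(old)       # stale copy becomes garbage to collect
        self.mapping[lblock] = page
        self.data[page] = value

    def read(self, lblock):
        return self.data[self.mapping[lblock]]

ftl = ToyFTL(pages=8)
ftl.write(0, "v1")
ftl.write(0, "v2")         # same logical block...
print(ftl.mapping[0])      # ...now lives on a different physical page: 1
```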
 
I'm confused -- is it the OS that determines where on disk something will go, or the controller?



What about defragging then? Isn't the OS rearranging data on the disk, moving it all toward one end?


And perhaps a better question -- when you're about to initiate a file copy to another drive, who determines the precise sectors where that file will begin to be laid down (OS or controller), and how are those sectors chosen (i.e. how/why did the OS/controller choose sector #2047 as where to start laying this new file down)?
The OS talks to the controller and says, 'I have a file this big, please save it to disk.' The controller on the drive says, 'Okay.' When the controller is done, it tells the OS which sectors and what bits belong to the file. It's like a valet parking service: you can drop off and pick up your car, but only the valet service can choose where it gets parked, and you have no say in where that is.

Defragging is only done by the OS insofar as the OS is told by the controller which sectors of the disk are 'empty,' which belong to files, and which chunks of the drive contain contiguous data. If you've defragged, you'll probably recall that the 'getting disk info' part of a defrag is a slow process, because the OS is querying the disk via the controller for that info on behalf of the defrag program.
 
The OS talks to the controller and says, 'I have a file this big, please save it to disk.' The controller on the drive says, 'Okay.' When the controller is done, it tells the OS which sectors and what bits belong to the file. It's like a valet parking service: you can drop off and pick up your car, but only the valet service can choose where it gets parked, and you have no say in where that is.

The controller (with the exception of a few SSDs that were rumoured to read NTFS so that they could garbage-collect without TRIM) has no concept of files, so I am confused about what you are getting at here. The filesystem tells the controller which logical blocks to write data to. The controller maps these logical blocks to physical sectors. In a hard drive this mapping is mostly sequential, with the exception of handling remapped sectors.
 
'dd' on a *nix system can be used to write arbitrary data to arbitrary LBAs on a block device. Since you would be bypassing the filesystem that owns those LBAs, they would still be considered free. Doing this can lead to filesystem corruption if you overwrite something important by accident.
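
For example (don't run this against a disk you care about), the equivalent of `dd if=marker.bin of=/dev/sdb bs=512 seek=1000000 conv=notrunc` as a minimal Python sketch. The device path and LBA are assumptions:

```python
import os

# Assumptions: /dev/sdb is a scratch disk, LBA 1,000,000 is not in use.
# This bypasses the filesystem entirely, so the filesystem still counts
# the sector as free -- and may overwrite it whenever it likes.
DEVICE = "/dev/sdb"
LBA = 1_000_000
SECTOR = 512

payload = b"MARKER" + bytes(SECTOR - 6)    # pad to exactly one sector

fd = os.open(DEVICE, os.O_WRONLY)          # needs root
try:
    os.lseek(fd, LBA * SECTOR, os.SEEK_SET)
    os.write(fd, payload)
finally:
    os.close(fd)
```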

Generally I've found that it is easy to figure out what LBAs a given file is stored on, but doing the reverse lookup is non-trivial, thus determining that an LBA is 'free' can be difficult.
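
For the forward lookup, tools like `filefrag -v` or `hdparm --fibmap` will print a file's extents; programmatically, the same data is exposed on Linux through the FIBMAP ioctl. A minimal sketch, assuming Linux, root privileges, and a filesystem that supports FIBMAP (the file path is just an example):

```python
import fcntl
import os
import struct

# ioctl numbers from <linux/fs.h>; Linux-specific, needs root.
FIBMAP = 1      # int: in = logical block of file, out = block on device
FIGETBSZ = 2    # int: out = filesystem block size in bytes

def file_blocks(path):
    """Yield (file block, device block) pairs for 'path'.
    A device block of 0 means a hole (nothing allocated)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        bsz = struct.unpack("i", fcntl.ioctl(fd, FIGETBSZ, struct.pack("i", 0)))[0]
        nblocks = (os.fstat(fd).st_size + bsz - 1) // bsz
        for i in range(nblocks):
            blk = struct.unpack("i", fcntl.ioctl(fd, FIBMAP, struct.pack("i", i)))[0]
            yield i, blk
    finally:
        os.close(fd)

# Example path is an assumption; use any file on an ext2/3/4 volume.
for i, blk in file_blocks("/var/log/syslog"):
    print(f"file block {i} -> device block {blk}")
```

The block numbers FIBMAP returns are relative to the start of the partition; to get an absolute disk LBA you'd still scale by blocksize/512 and add the partition's starting LBA.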

The OS talks to the controller and says, 'I have a file this big, please save it to disk.' ...

Defragging is only done by the OS insofar as the OS is told by the controller which sectors of the disk are 'empty,' ...

Block devices only understand blocks; they have no concept of files, or even of free and used space. Filesystems are an abstraction put on top of block devices, used to manage free and used space in the familiar form of files and directories.

Applications read and write files on the filesystem, which reads and writes LBAs on the block device.
 
Block devices only understand blocks; they have no concept of files, or even of free and used space. Filesystems are an abstraction put on top of block devices, used to manage free and used space in the familiar form of files and directories.

Applications read and write files on the filesystem, which reads and writes LBAs on the block device.

The controller (with the exception of a few SSDs that were rumoured to read NTFS so that they could garbage-collect without TRIM) has no concept of files, so I am confused about what you are getting at here. The filesystem tells the controller which logical blocks to write data to. The controller maps these logical blocks to physical sectors. In a hard drive this mapping is mostly sequential, with the exception of handling remapped sectors.

Yes, I understand this. I'm just trying to break it down into something simple to convey that you can't do what the OP is asking and why, outside of the technical jargon.
 
The OS talks to the controller and says, 'I have a file this big, please save it to disk.' The controller on the drive says, 'Okay.' When the controller is done, it tells the OS which sectors and what bits belong to the file.
But the controller follows some logic; it doesn't write data chaotically, right? And if that's true, can we assume it writes data to each free sector starting from the beginning of the HDD platters, so that if the HDD never fills up, it never reaches the other end of the platters? (I'm not speaking about reallocated bad sectors, and not so much about the literal physical platters as about the drive as it is virtually accessible to the OS.) I remember the old utility "Norton Speed Disk" from Windows 2000; you could configure which files (*.*) to put "at the end of the HDD" and which at the beginning, after system files, and so on. So if you do an HDD format with overwriting, those files will be the last to be overwritten.
 
OK, basically this is how it works.

The OS queries the MBR for what space is free on the logical drive; it does not know what type of physical hardware is involved. The Master Boot Record (assuming you are not using GRUB or some other system) says, 'I have these slots that are so big.' The OS figures the data is going to take up so many blocks, depending on the smallest allowed slot size; if it is one bit over, it will still require one more slot. The first section of the logic table that fits all of them is used. If the table is set up to just throw them anywhere, it will assign each set of data to each slot as available; in Windows 7 this does not usually happen unless the drive is too full. Defrag is about putting those slots next to each other, not necessarily putting everything at the front of the drive.

Once the OS has assigned the data to the logic table, it sends the data to the drive controller. The controller logic, or mini-OS for the drive, takes the data and places it based on info the OS does not have, like bad sectors and so forth. All of this takes place in the RAM on the drive, which is why, if you are swapping a lot of data around, you will see the 32 MB and 64 MB cache drives have faster performance that is not accounted for in the standard tests. Once the drive controller writes what are called pointers to where the data is on the drive back to the MBR (which on Windows 7 is in the first 100 MB of the drive), the drive controller tells the OS that the MBR has been updated. The OS continues to the next logical drive request.

There are ways to hide info from the OS by bypassing the MBR, but then you risk it getting overwritten: if it is not in the MBR, there are operations that will fire off without informing the user that something is being written to the drive. Only if a block is considered to have data is the user consulted. What you want to do requires rewriting the drive controller, and if you had trouble following this, you have a bit of schooling to do before you are ready to tackle that.
 
It should be noted that when you format a drive, you are doing a high-level format, which is performed logically rather than physically. I would think (but cannot confirm) that a high-level format such as the one issued by Windows would proceed sequentially through the logical sectors of a disk.

Low-level formats are ordinarily only done in the factory these days; I'm not aware of what tools will allow users to do them, and users really shouldn't be doing them anyway because they will almost certainly muck it up.

I did some looking around, and there are various tools that let you see where a file is located logically, but I'm not sure if there are any that allow a user to choose where to move a file logically. WinHex seems like a possibility, but even its feature list doesn't spell out whether you can manually choose the location of a file.

Why do you even want to do this? The odds are that you're wasting a large amount of time and effort trying to optimize something that results in no performance or functional gain in a modern OS.
 
OK, basically this is how it works.

The OS queries the MBR for what space is free on the logical drive; it does not know what type of physical hardware is involved. ...

The MBR only deals with bootloading and partition structure; it knows nothing about free/used sectors or storing data.

A file is stored in a number of clusters/extents determined by the size of the file. The clusters/extents map to LBAs on the underlying block device, which could be a hard drive partition, a USB mass storage device, or anything else that can be represented as a block device. How this mapping is done is up to the filesystem and does not necessarily have to be linear. The individual LBAs are mapped to physical device blocks (sectors on an HDD) by the appropriate controller. The LBA-to-physical-sector mapping is completely opaque to the OS in modern storage hardware. The mapping may not even be 1-to-1, as in the case of 512e drives with 4k physical sectors.
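
To make the layering concrete, here's a toy sketch with made-up numbers (4 KiB clusters, a partition starting at LBA 2048, a 512e drive); the real mappings live in the filesystem's allocation metadata and the drive's firmware, respectively:

```python
# Toy illustration of the two mapping layers; all numbers are assumptions.
CLUSTER_SIZE = 4096          # filesystem allocation unit (e.g. NTFS default)
LOGICAL_SECTOR = 512         # sector size the drive reports to the OS
PHYSICAL_SECTOR = 4096       # real media sector on a 512e drive
PARTITION_START_LBA = 2048   # typical 1 MiB-aligned first partition

def cluster_to_lba(cluster):
    """Filesystem layer: cluster number -> absolute logical LBA."""
    return PARTITION_START_LBA + cluster * (CLUSTER_SIZE // LOGICAL_SECTOR)

def lba_to_physical(lba):
    """Drive layer: logical LBA -> (physical sector, byte offset inside it).
    On real hardware this table lives in firmware and can remap bad sectors."""
    return divmod(lba * LOGICAL_SECTOR, PHYSICAL_SECTOR)

lba = cluster_to_lba(1000)
print(lba)                   # 10048
print(lba_to_physical(lba))  # (1256, 0): eight logical sectors per physical
```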
 
Ok, so you people are saying defraggers are basically useless in regards to moving stuff to the actual beginning of the drive?

In that manner of thinking, what would be the purpose of "short-stroking" a drive by making a smaller boot partition, in order to make it so the drive head doesn't have to move as far to access data when going from the beginning to the end of the partition?

Just because LBA is used, doesn't mean it is some random order. The addresses are sequential, which is completely logical if you think about it.

If you know the actual parameters of the drive, like how you used to have to know them back in the early IDE days to even set them up in the BIOS, it would be fairly trivial to translate exactly what sectors correspond to what LBA address.
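
That old BIOS-era translation was plain arithmetic; a sketch with made-up geometry values:

```python
# Classic CHS <-> LBA translation from the IDE/BIOS days.
# Geometry values are made up for illustration.
HEADS = 16                 # heads per cylinder
SECTORS_PER_TRACK = 63     # sectors per track (sectors are 1-based)

def chs_to_lba(c, h, s):
    return (c * HEADS + h) * SECTORS_PER_TRACK + (s - 1)

def lba_to_chs(lba):
    c, rem = divmod(lba, HEADS * SECTORS_PER_TRACK)
    h, s = divmod(rem, SECTORS_PER_TRACK)
    return c, h, s + 1

print(chs_to_lba(2, 3, 4))   # 2208
print(lba_to_chs(2208))      # (2, 3, 4)
```

Worth noting that drives have reported fake geometry ever since zoned recording arrived, so this math describes the interface, not the actual platter layout, which backs up the earlier point about logical vs. physical sectors.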

If it was just randomly assigned, then there would be no way to even read data from the drive. It would be a total cluster (pun intended).

As for SSDs, they are still faster at sequential reads than random reads. I would highly suspect that they also are formatted in sequential order.

RAM is also addressed in sequential order. You don't have a system anywhere that arbitrarily assigns random addresses to different blocks of RAM. It just doesn't work that way.
 
Ok, so you people are saying defraggers are basically useless in regards to moving stuff to the actual beginning of the drive? ...

I don't think anyone is saying that; I'm just trying to clear up misconceptions about storage abstractions.

Defraggers work on filesystems, not block devices; they will attempt to place files in sequential extents/clusters in a filesystem. Some may also attempt to put certain types of files in higher- or lower-addressed extents within the filesystem.

Short-stroking allows you to confine a filesystem to a low range of addresses (via partition management) with the idea that it will have better read/write performance.

Both of these operations make assumptions about the relation between the logical and physical layout of sectors. If the logical addressing of the drive mostly follows the physical layout, then they work. With HDDs, this is largely true, with the exception of remapped sectors.

None of this is really on topic with what the OP wanted. Writing an LBA on a block device directly (i.e. bypassing the filesystem) can certainly be done; 'dd' on a *nix system can do this easily, and when doing so the filesystem will still see that LBA as 'free'. There is no guarantee, in the general sense, that the filesystem managing that LBA will use it last. One could possibly make claims about a specific filesystem with enough knowledge of the implementation. Some filesystems will intentionally write to high LBA addresses to store filesystem metadata.
 
Before anything, to be clear: I am writing about a simple PC with Windows installed, an NTFS file system, a few HDDs, and no complicated hardware configurations involved, so please argue inside these specs or not at all... :)

Way too many technical terms and theory, but what about real life? For example, the HD Tune benchmark test: how can it start reading the disk from the outer side of the platters (where reading speed is faster) and then slowly move toward the inner side of the platters (reading speed gradually decreases), as seen in an HD Tune graph, if the OS has no control over head position? (I'm not talking about strictly hardware-controlled operations like reallocated/spare sector mapping and such, and not so much about the literal physical platters as about the drive as it is virtually accessible to the OS.) Answering the OP's question: the last-overwritten HD space should be at the end of the platters, where the OS writes files last. Want to argue? OK, let's do a real-life test first:

1. Write to the HDD until it is full and there is only space left for a test file. A useful thing would be to note the file's address range with HDD utilities. Then delete the test file and the other files. Write the HDD full again, but leave empty space a bit larger than the test file, to be sure. Open file recovery software and try to recover the file (enter the HDD address range you wrote down before to narrow the search). If the file can be recovered successfully, then there is no question: the last space overwritten is at the 'physical end' of the HDD (as reported to the OS or HDD utility).

2. Write the HDD up to, let's say, 50% full, then write the test file, then write the HDD up to 100% full. Note the file's address with HDD utilities. Again, delete the test file and all other files. Write the HDD a bit more than 50% full. Open file recovery software and try to recover the file. If it is not possible, then we know the exact order in which the OS writes files to the HDD. That simple. To rule out coincidence, repeat the test as many times as you want.

To answer the OP's question: a pretty safe way to be sure that a deleted file will be overwritten last is to write the HDD full and then put the file in question down as the very last file, or use some specific system utility to do that. Well, at least in a Windows OS.
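
If someone wants to try steps 1 or 2 without clicking around manually, here is a rough helper for the 'fill the drive' part, a sketch assuming a dedicated scratch volume mounted at F: (the path is an assumption; never point this at a volume with data you care about):

```python
import os

# Rough helper for the 'fill the drive' steps above. MOUNT is an
# assumption: point it at a dedicated scratch volume, never a real one.
MOUNT = "F:\\fill"
CHUNK = b"\xAA" * (1024 * 1024)   # 1 MB per file

os.makedirs(MOUNT, exist_ok=True)
n = 0
try:
    while True:
        with open(os.path.join(MOUNT, f"fill_{n:06d}.bin"), "wb") as f:
            f.write(CHUNK)
        n += 1
except OSError:                    # ENOSPC: the volume is full
    print(f"wrote {n} files before the volume filled up")
```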
 
If the filesystem is designed to overwrite the overwritable block holding the oldest deleted data first, then raivo's scheme would not help. By that I mean that the filesystem may be choosing the next deleted file to overwrite by its age since deletion, in the interest of giving the user a better chance to recover recently deleted data, and also to distribute wear and tear across all overwritable blocks instead of just a few.

I still think the OP is trying to optimize something that doesn't need to be optimized in a modern OS.
 
That's a valid argument. If the HDD space is already marked as previously deleted, then it can be true that the OS seeks out and writes over older deleted files first; but if the HDD space is not marked, then the OS will never touch the end of the HDD until it's full. So to avoid this, would it be enough to create unallocated space at the end of the HDD, just the right size for the particular file?
 
That's a valid argument. If the HDD space is already marked as previously deleted, then it can be true that the OS seeks out and writes over older deleted files first; but if the HDD space is not marked, then the OS will never touch the end of the HDD until it's full. So to avoid this, would it be enough to create unallocated space at the end of the HDD, just the right size for the particular file?


Only if you have specialized software made to specifically write/read that space.

If you just have unallocated space, Windows itself is not going to write to or read from it.

If you do partitions and format them, you are going to have to take into account the space taken up by the file system.

This is still not going to give you an easy way to write a file that the OS is not able to see. You will still need specialized software to do it.

It might be possible to manipulate the Windows defrag API to do something like this, but I am not really sure as I have never had a reason to mess with it.

Edit: The only possible use I see for this is to write malware that is virtually undetectable. What AV is going to scan "free space" on a drive? To top it off, the app that reads it wouldn't need to have any real malicious code in it; the only thing it would do is read the hidden data and then execute it. It would be very tricksy.

Edit 2: Here are a few links I found that talk about doing this sort of thing.

http://www.codeproject.com/Articles/28314/Reading-and-Writing-to-Raw-Disk-Sectors

http://superuser.com/questions/301968/how-wipe-one-sector-of-hard-drive-in-windows

http://stackoverflow.com/questions/2108313/how-to-access-specific-raw-data-on-disk-from-java
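
For the Windows side, here is a minimal sketch in the spirit of the CodeProject link above: reading one raw sector off a physical drive from Python. Run as Administrator; the drive number and LBA are assumptions. Writing works the same way but must also be sector-aligned, and Windows will block writes to areas owned by mounted volumes unless the volume is locked or dismounted.

```python
# Minimal raw-sector read, in the spirit of the CodeProject link above.
# Needs Administrator rights; drive number and LBA are assumptions.
SECTOR = 512
LBA = 2048

with open(r"\\.\PhysicalDrive1", "rb", buffering=0) as disk:
    disk.seek(LBA * SECTOR)    # raw device I/O must be sector-aligned
    data = disk.read(SECTOR)

print(data[:16].hex())
```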
 