Seeming System Partition Corruption.

veterator

[H]ard|Gawd
Joined
Oct 26, 2005
Messages
1,909
This problem is bizarre, so I am posting here trying to see if it's a sign of a possible SSD drive failure that I've never heard of before.

This is an existing build. It has run for about 3 years with one video card failure and no other major issues in that time, aside from a weird powering off on reboot that SEEMED to be related to a video hardware problem.

Last week, I turned it on and it said no boot device found. The only way it'll boot is off another device, like the Windows 7 CD or an Ubuntu USB drive. Startup Repair fails to find any issues. I tried it multiple times, and also tried rebuilding the BCD both manually and with the built-in tools. Still no boot device found.

However, all of the drive's files were still intact on the Windows partition, so I copied them off under Ubuntu and did a complete fresh Windows 7 install (removed all the partitions on the SSD and let the installer recreate them). Windows ran fine for about 5 days, then the boot device error popped up again. I tried a lot of the same solutions and nothing worked. I was able to recover the files on the Windows partition again, moved them off, and reinstalled Windows in the same manner.

Now I've updated the motherboard BIOS and some of the drivers, especially the SATA-related ones. But the timing of the corruption makes me think it's either a really bizarre SSD issue OR a Windows Update (possibly Windows 10) corrupting the boot device. I've been slowly adding updates to the newest W7 install to see if the issue repeats, but there are hundreds of them and Windows Update seems to fail to find them about 90% of the time. It'll just sit on "Checking for Updates..." forever, so installing the updates in batches like this is rather time consuming.

The SSD in question is a Corsair Neutron 256GB drive.

Motherboard is an Asus Crosshair V Formula-Z, can't remember the processor off the top of my head.

32GB of Corsair memory, which I'm going to run through memtest when I head to bed tonight just to be safe.

VisionTek 7970 video card (I personally recommend no one buys VisionTek cards if anyone reads this, I just don't like the way their RMA personnel have interacted with me in the past. They tried to deny me a warranty replacement twice, and the stated warranty on my video card changed multiple times in our conversations, from 1 year to 3 years, etc. Just not happy with the experience at all.)

Has a secondary WD Black 1TB HD.

Like I said, the system ran for 3 years with nothing like this happening before. I've never seen this happen on a computer where the drive wasn't failing in some other way, but the SSD seems to be fine to use once W7 is reinstalled. So perhaps something is damaged in the system partition area and the installer keeps recreating it there, but you'd think it would have issues right away and not days down the line after a reinstall. It's been a couple of weeks of computer problems in the family, and not the easy find-the-problem-and-replace-the-part kind either, so I hope there's someone out there who has some ideas on this.
 
What PSU?
Did you replace the SATA cable connecting the SSD to the motherboard?
 
It's a Corsair AX1200i power supply.

I have not replaced the cable, but can easily do that.

Other updates:

So far I have individually tested two of the four sticks with memtest86+ and both failed on one or two addresses. I am going to retest them to see if they fail on the same addresses, which may indicate a bad slot on the motherboard. The motherboard has a lot of recent reviews on Newegg complaining about dead boards, which makes me wonder if the board is having issues. That would suck, because it's a rather expensive motherboard and it's out of warranty by about 3 months.

Going to test the memory further in the board and in various slots to see if it fails in them all or not.
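
The stick-versus-slot reasoning here can be sketched in a few lines of Python. This is only an illustration: the function name `classify`, the stick labels, and the slot numbers are all made up, and the sample runs are placeholders rather than results from this machine. The assumption is that a stick failing in two or more different slots points at the stick, while a slot in which two or more otherwise-passing sticks fail points at the slot.

```python
from collections import defaultdict

def classify(runs):
    """runs: iterable of (stick, slot, failed) from manual memtest passes.

    Returns (suspect_sticks, suspect_slots) as sorted lists.
    Hypothetical helper for illustration only.
    """
    fail_slots = defaultdict(set)  # stick -> slots where it showed errors
    pass_slots = defaultdict(set)  # stick -> slots where it ran clean
    for stick, slot, failed in runs:
        (fail_slots if failed else pass_slots)[stick].add(slot)

    # A stick that errors in two or more different slots is suspect itself.
    suspect_sticks = sorted(
        stick for stick, slots in fail_slots.items() if len(slots) >= 2
    )

    # A slot is suspect when two or more sticks fail in it, counting only
    # sticks that ran clean in some other slot.
    failing_in_slot = defaultdict(set)
    for stick, slots in fail_slots.items():
        for slot in slots:
            if pass_slots[stick] - {slot}:
                failing_in_slot[slot].add(stick)
    suspect_slots = sorted(
        slot for slot, sticks in failing_in_slot.items() if len(sticks) >= 2
    )
    return suspect_sticks, suspect_slots

# Placeholder log: stick "A" errors in two slots, stick "B" runs clean in both.
sample = [("A", 1, True), ("A", 2, True), ("B", 1, False), ("B", 2, False)]
print(classify(sample))
```

Because memtest errors can be intermittent, a clean pass proves little on its own, so a log like this is only useful once each stick has had several passes per slot.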


I'll replace the SATA cable too next time I open the case to switch the memory around for further testing.
 
Replaced SATA cable. Tested the third stick of memory in bank 1 and it tested bad on one address multiple times in test 8. Going to test the fourth stick overnight. My only concern is that memtest86+ is reporting different timings from my RAM's rated ones: it shows 11-11-11-28, while the memory is rated at 10-11-10-30. That makes me wonder whether the board has the wrong settings or memtest just reports them wrong. Either way it's a bizarre memory failure too, since every stick fails on just one or two addresses and only in test 8, never in the other tests, and not ALWAYS in test 8 either. Sometimes they pass.
 
Fourth stick has tested good in slots 1 and 2 for 4-5 passes in memtest86+. So I think I've definitely got a bad set of memory on my hands with very infrequent/odd errors, which may be causing my problem. The only way I can be kinda sure at this point is to get the rest of the Windows updates (if it'll let me instead of spinning on the check endlessly) and see if the system drive corrupts itself again with only the one good stick of memory in the machine. Time consuming error at the very least. I'll just be glad if it's not the motherboard or CPU at this point...memory is a cheap RMA fix.

Oh and I plan on testing the good stick of memory in the last two slots on the motherboard just to be sure that it passes in both of those so I can rule out any bad memory slots on the board.

From there, I guess I'll see if the drive corrupts while I advance-RMA the memory, then run memtest on all of the replacement sticks to check for defects before I rely on them. At least another week of testing off and on. Any other ideas or possibilities can be thrown in for testing too, since I'd like to be sure the drive won't corrupt itself again.

I also read somewhere that bad memory can cause your machine to power down during bootup. Mine did that infrequently, maybe 25-40% of the time on a cold boot (first boot of the day), and it always threw video hardware errors in the Windows log. I had to RMA a video card for failure, as I said in the original post. So MAYBE the memory is the culprit....hopefully it is, since it can actually be tested.
 
Since I hate when these problem threads go without a final solution, here's what has happened so far. I THINK it's fixed. Only time will tell if it'll corrupt again due to another hardware issue.

So first, 3 of the 4 memory sticks in the computer were bad. I am still waiting on the RMA replacements and am using alternate memory in the meantime. One major issue that plagued me is that I had to reinstall Win7 probably 4-5 times, and Windows Update refuses to find updates after a certain point in updating. I had to just let it sit for hours while it downloaded the updates in the background and got them ready for installation on shutdown, and even then it was sketchy as to when it would happen and how fast.

So if this happens to anyone else, I recommend you slipstream a Win7 install using NTLite with SP1 and the May 2016 Microsoft convenience rollup update to speed things along. It's kind of a complicated process and none of the guides are completely up to date, so it's a mix of finding a guide to create a slipstream ISO and modifying it to include the newer updates. I don't think I even did it completely correctly, but it was enough to get me to where I could mostly use Windows Update to finish on my final reinstall.


What I believe happened is that the memory was corrupting the updates it got from Windows Update in the background, and when they were installed they corrupted more, etc etc. Eventually it corrupted the system partition when it installed an update to DLLs related to, I believe, BitLocker. This ALSO triggered my Asus board to think Windows was compromised because of Secure Boot (the "safe boot" setting), and I had to disable it to get into Windows once I had a non-corrupted, updated Windows install. So I can only speculate what that corruption plus safe boot ended up doing to the system, but it for sure was stopping me from seeing the message the boot process puts out after the BIOS splash screen.
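
One rough way to test the "corrupted download" theory on a file you fetched by hand (for example, an update package downloaded manually) is to hash it and compare against a hash taken on a known-good machine. The sketch below uses only Python's standard `hashlib`. The helper names `sha256_of` and `matches_known_good` are made up for illustration, and the known-good hash is something you would have to obtain yourself.

```python
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_known_good(path, expected_hex):
    """Compare a file's digest with a hash taken on a trusted machine.

    Hypothetical helper: expected_hex is assumed to be a SHA-256 hex
    string, with case and surrounding whitespace ignored.
    """
    return sha256_of(path) == expected_hex.strip().lower()
```

Keep in mind that a machine with bad RAM can also get the hash itself wrong, so a mismatch is a hint to retest on other hardware rather than proof that the file on disk is damaged.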

Overall this problem was not very bad in terms of hardware failure, but it was time consuming as hell with all the memtest hours, reinstall hours, slipstream creation time, and waiting on Windows Update to do its thing as slow as it is.

Now only time will tell if the system is stable and happy, and hopefully my replacement memory will be OK after another many hours of memtest run on the sticks before they ever get to the OS.

I hope this helps someone, or at least alerts them to a few possibilities to check. Asus "Safe Boot" (Secure Boot) is ultimately pointless on Win7, since Win7 updates files in a way that causes it to freak out. And memtest your memory: any error pretty much means you need to replace it. Corsair is pretty good about RMAs. Dealing with Asus on another computer's issues (believed motherboard failure) has made me appreciate Corsair's painless and fast RMA process. Hopefully other memory vendors are as painless.
 