My 2nd folding rig is not starting up at all. Just assembled.

Symptoms:
No video
No sound from speaker during POST
6128's, 16 x 1 GB
Power switch and reset switch work normally
Fans all power up
DP1 LED comes on solid, stays on for 30 seconds, then fast blink
DP3 power LED never lights
Monitor was hooked up to first machine along with KB and mouse
New modular power supply

I need some direction to help me solve this problem; all equipment was bought from another folder on Team 33. Any help would be most appreciated.

Best regards, Charlie
When building such a big system it's always good practice to make it an incremental process... so in situations like yours you don't go nuts when someone advises to...

...remove all CPUs and RAM except for the first CPU and its RAM.

Ok, I'm not advising it just *yet*... it may come to that though.

Also, very important, don't let the board rest on an ESD bag. Make sure it's lifted a bit. A thick magazine works short-term, too; not sure if this applies to you.

Make sure that JPW2 and JPW3 are connected.

Reset CMOS (per the manual). With power completely removed from the board (PSU unplugged), short the JBT1 contacts for 30 seconds (to be certain).
Removed power supply plug, waited 2 min, removed battery, checked it (measures 3.03 V), shorted JBT1 pads for 45 seconds, put it back together, and nothing happened until I pushed the power switch; unit came on with no change. Measured power supply, all outputs are normal: +12.14, -11.96, +3.41, +5.04. 750 watt single rail modular.

Best, Charlie
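For reference, those readings can be compared against the usual ATX tolerance bands (±5% on +3.3 V, +5 V and +12 V, ±10% on -12 V). A rough sketch only: `check_rail` is a made-up helper name, and a multimeter reading with no load does not say how the rails hold up once all four CPUs are drawing power.

```shell
# check_rail NOMINAL MEASURED TOLERANCE_PERCENT
# Prints "ok" or "out of range" for one supply rail.
# Hypothetical helper for illustration, not part of any existing tool.
check_rail() {
    awk -v nom="$1" -v meas="$2" -v tol="$3" 'BEGIN {
        # Compare magnitudes so negative rails work the same way.
        if (nom < 0) { nom = -nom; meas = -meas }
        lo = nom * (1 - tol / 100)
        hi = nom * (1 + tol / 100)
        result = (meas >= lo && meas <= hi) ? "ok" : "out of range"
        print result
    }'
}

# Readings from the post above.
check_rail 12   12.14  5
check_rail -12  -11.96 10
check_rail 3.3  3.41   5
check_rail 5    5.04   5
```

All four readings fall inside their bands, which agrees with "all outputs are normal".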
tear, I forgot to answer about the mounting: I installed it on a custom wire frame with 1" standoffs. JPW2 and JPW3 are connected; the power supply has dual CPU sockets.

Thanks, Charlie

So should I remove the CPUs and memory in reverse order (4, 3, 2), or should I remove everything but CPU 1 and its memory (4 sticks in the blue sockets)?

I used musky's mod for the fans, but I also used 1-1/8" socket head cap screws with a 7/64 ball driver for attachment; worked great for me and made it easy to install.

Best, Charlie
No other ideas than to boot only with CPU1 and its RAM... I'm a bit tired though so I may not be thinking clearly... luckily, we have many 4P users
tear, I removed the #4 CPU and the board powered up and loaded Ubuntu off the HDD. How do you recommend that I proceed? I have another set of 6128's that I could substitute for #4. I assume that if another CPU works in #4, the original CPU is bad; alternately, if no other CPU works in #4, then the socket is bad? Thanks for all your help.

Best regards, Charlie
I believe I remember some other threads about some troubleshooting steps, so I'm running off memory. As I understand it, the motherboard should be able to boot normally with just one CPU and one stick of RAM. I'd suggest going to the minimum until you have something to build on. If it can't POST, swap to another CPU, then a different stick of memory. I would print out the manual and check that all the jumpers are set to the default positions. Best of luck.
Good to hear you have some progress. I'd isolate the CPU and see if you can boot: remove all the CPUs and put the CPU in question in socket #1 with memory, then see if it boots.
Sometimes a CPU will not boot in one socket but will in another, but since you have extra CPUs lying around I would just put one of the others in that socket and see if it works. Also, you may have a bad stick of memory. You could remove all of the memory except 1 stick at socket #4 and try to boot it. If it does not boot, try a different stick, etc.
Sounds like a great plan of action, system is running as of now with 3 CPU's, but is reporting only 10 GB of memory, should be 12 GB. Thanks everyone for all the help. Best regards, Charlie
TIM residue on CPU pads or (worse) in a socket could be causing both your issues. I've seen cases of TIM "dropping" from the CPU or LGA load plate into the socket; always carefully check CPU pads and sockets w/used hardware. Bent socket pins are another (though unwelcome) possibility.

I think thorough visual inspection is in order. You could also speculatively clean CPU pads with Q-tips + alcohol...

Grandpa's suggestion of retrying CPU4 w/o its RAM is also worth trying (after visual inspection of CPU pads and the socket, ofc).

________________________________________________________________________

TPC should help you identify CPU w/failing DIMM: http://turionpowercontrol.googlecode.com/files/tpc-0.44-rc1-src.tar.gz

Install it:
Code:
cd ~
wget http://turionpowercontrol.googlecode.com/files/tpc-0.44-rc1-src.tar.gz
tar -xzf tpc-0.44-rc1-src.tar.gz
cd tpc-0.44-rc1-src
make
sudo make install
sudo cp /usr/bin/TurionPowerControl /usr/bin/tpc

Run it:
Code:
sudo modprobe msr
sudo modprobe cpuid
sudo tpc -dram

Look for Node + DCT that has a missing or failed LDIMM. Paste complete output when in doubt.

Node/DCT map (assuming all CPUs are populated) below:
tear, it started to install and got down to the make: g++: Command not found. I am not sure what this means, I'm only starting to learn Linux.

thinklet@SamsungSSD:~$ wget http://darkswarm.org/tpc-svn64-tear2.tar.gz
--2012-07-22 06:43:12-- http://darkswarm.org/tpc-svn64-tear2.tar.gz
Resolving darkswarm.org... 85.11.66.60
Connecting to darkswarm.org|85.11.66.60|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 102258 (100K) [application/x-gzip]
Saving to: `tpc-svn64-tear2.tar.gz'
100%[======================================>] 102,258 121K/s in 0.8s
2012-07-22 06:43:14 (121 KB/s) - `tpc-svn64-tear2.tar.gz' saved [102258/102258]
thinklet@SamsungSSD:~$ tar -xzf tpc-svn64-tear2.tar.gz
thinklet@SamsungSSD:~$ cd tpc-svn64-tear2
thinklet@SamsungSSD:~/tpc-svn64-tear2$ make
mkdir -p obj/x86_64
g++ -O2 -MMD -MF obj/x86_64/.TurionPowerControl.d -MT obj/x86_64/TurionPowerControl.o -c -o obj/x86_64/TurionPowerControl.o TurionPowerControl.cpp
make: g++: Command not found
make: *** [obj/x86_64/TurionPowerControl.o] Error 127
thinklet@SamsungSSD:~/tpc-svn64-tear2$ sudo make install
[sudo] password for thinklet:

Best, Charlie
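Error 127 from make here just means the shell could not find g++, i.e. no C++ compiler is installed yet. A quick way to see which build tools are missing before running make is sketched below; `missing_tools` is a made-up helper name, not something shipped with tpc or Ubuntu.

```shell
# missing_tools TOOL...
# Prints the name of every listed command that is not found on PATH.
# Hypothetical helper for illustration only.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# The tpc build needs at least these two.
missing_tools make g++
```

If g++ is printed, installing it (on Ubuntu, sudo apt-get install g++) and re-running make before sudo make install should get past this error.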
thinklet@SamsungSSD:~/fah$ Run: sudo apt-get install g++
Run:: command not found
thinklet@SamsungSSD:~/fah$
thinklet@SamsungSSD:~/fah/tpc-svn64-tear2$ sudo make install
install -ps TurionPowerControl /usr/bin
thinklet@SamsungSSD:~/fah/tpc-svn64-tear2$ sudo mv /usr/bin/TurionPowerControl /usr/bin/tpcsudo modprobe msr
mv: target `msr' is not a directory
thinklet@SamsungSSD:~/fah/tpc-svn64-tear2$ sudo modprobe cpuid
thinklet@SamsungSSD:~/fah/tpc-svn64-tear2$ sudo tpc -dram
sudo: tpc: command not found
sudo mv /usr/bin/TurionPowerControl /usr/bin/tpcsudo modprobe msr
^^ two commands got combined here

Redo them please:
Code:
sudo mv /usr/bin/TurionPowerControl /usr/bin/tpc
Code:
sudo modprobe msr
Turion Power States Optimization and Control - by blackshard - v0.43
DRAM Configuration Status

Node 0 ---
DCT0: memory type: DDR3 frequency: 1332 MHz
Tcl=9 Trcd=9 Trp=9 Tras=24 Access Mode:1T Trtp=5 Trc=33 Twr=10 Trrd=4 Tcwl=7 Tfaw=20
TrwtWB=8 TrwtTO=7 Twtr=5 Twrrd=2 Twrwr=4 Trdrd=3 Tref=2 Trfc0=0 Trfc1=2 Trfc2=0 Trfc3=0
MaxRdLatency=52
LDIMM0=EMPTY/EMPTY LDIMM1=OK/EMPTY LDIMM2=EMPTY/EMPTY LDIMM3=EMPTY/EMPTY
DCT1: memory type: DDR3 frequency: 1332 MHz
Tcl=9 Trcd=9 Trp=9 Tras=24 Access Mode:1T Trtp=5 Trc=33 Twr=10 Trrd=4 Tcwl=7 Tfaw=20
TrwtWB=8 TrwtTO=7 Twtr=5 Twrrd=2 Twrwr=4 Trdrd=3 Tref=2 Trfc0=0 Trfc1=2 Trfc2=0 Trfc3=0
MaxRdLatency=52
LDIMM0=EMPTY/EMPTY LDIMM1=OK/EMPTY LDIMM2=EMPTY/EMPTY LDIMM3=EMPTY/EMPTY

Node 1 ---
DCT0: memory type: DDR3 frequency: 1332 MHz
Tcl=9 Trcd=9 Trp=9 Tras=24 Access Mode:1T Trtp=5 Trc=33 Twr=10 Trrd=4 Tcwl=7 Tfaw=20
TrwtWB=8 TrwtTO=7 Twtr=5 Twrrd=2 Twrwr=4 Trdrd=3 Tref=2 Trfc0=0 Trfc1=2 Trfc2=0 Trfc3=0
MaxRdLatency=51
LDIMM0=EMPTY/EMPTY LDIMM1=OK/EMPTY LDIMM2=EMPTY/EMPTY LDIMM3=EMPTY/EMPTY
DCT1: memory type: DDR3 frequency: 1332 MHz
Tcl=9 Trcd=9 Trp=9 Tras=24 Access Mode:1T Trtp=5 Trc=33 Twr=10 Trrd=4 Tcwl=7 Tfaw=20
TrwtWB=8 TrwtTO=7 Twtr=5 Twrrd=2 Twrwr=4 Trdrd=3 Tref=2 Trfc0=0 Trfc1=2 Trfc2=0 Trfc3=0
MaxRdLatency=51
LDIMM0=EMPTY/EMPTY LDIMM1=OK/EMPTY LDIMM2=EMPTY/EMPTY LDIMM3=EMPTY/EMPTY

Node 2 ---
DCT0: memory type: DDR3 frequency: 1332 MHz
Tcl=9 Trcd=9 Trp=9 Tras=24 Access Mode:1T Trtp=5 Trc=33 Twr=10 Trrd=4 Tcwl=7 Tfaw=20
TrwtWB=6 TrwtTO=5 Twtr=5 Twrrd=2 Twrwr=4 Trdrd=3 Tref=2 Trfc0=0 Trfc1=2 Trfc2=0 Trfc3=0
MaxRdLatency=64
LDIMM0=EMPTY/EMPTY LDIMM1=FAILED/EMPTY LDIMM2=EMPTY/EMPTY LDIMM3=EMPTY/EMPTY
DCT1: memory type: DDR3 frequency: 1332 MHz
Tcl=9 Trcd=9 Trp=9 Tras=24 Access Mode:1T Trtp=5 Trc=33 Twr=10 Trrd=4 Tcwl=7 Tfaw=20
TrwtWB=8 TrwtTO=7 Twtr=5 Twrrd=2 Twrwr=4 Trdrd=3 Tref=2 Trfc0=0 Trfc1=2 Trfc2=0 Trfc3=0
MaxRdLatency=51
LDIMM0=EMPTY/EMPTY LDIMM1=OK/EMPTY LDIMM2=EMPTY/EMPTY LDIMM3=EMPTY/EMPTY

Node 3 ---
DCT0: memory type: DDR3 frequency: 1332 MHz
Tcl=9 Trcd=9 Trp=9 Tras=24 Access Mode:1T Trtp=5 Trc=33 Twr=10 Trrd=4 Tcwl=7 Tfaw=20
TrwtWB=8 TrwtTO=7 Twtr=5 Twrrd=2 Twrwr=4 Trdrd=3 Tref=2 Trfc0=0 Trfc1=2 Trfc2=0 Trfc3=0
MaxRdLatency=51
LDIMM0=EMPTY/EMPTY LDIMM1=OK/EMPTY LDIMM2=EMPTY/EMPTY LDIMM3=EMPTY/EMPTY
DCT1: memory type: DDR3 frequency: 1332 MHz
Tcl=9 Trcd=9 Trp=9 Tras=24 Access Mode:1T Trtp=5 Trc=33 Twr=10 Trrd=4 Tcwl=7 Tfaw=20
TrwtWB=8 TrwtTO=7 Twtr=5 Twrrd=2 Twrwr=4 Trdrd=3 Tref=2 Trfc0=0 Trfc1=2 Trfc2=0 Trfc3=0
MaxRdLatency=50
LDIMM0=EMPTY/EMPTY LDIMM1=OK/EMPTY LDIMM2=EMPTY/EMPTY LDIMM3=EMPTY/EMPTY

Node 4 ---
DCT0: memory type: DDR3 frequency: 1332 MHz
Tcl=9 Trcd=9 Trp=9 Tras=24 Access Mode:1T Trtp=5 Trc=33 Twr=10 Trrd=4 Tcwl=7 Tfaw=20
TrwtWB=6 TrwtTO=5 Twtr=5 Twrrd=2 Twrwr=4 Trdrd=3 Tref=2 Trfc0=0 Trfc1=2 Trfc2=0 Trfc3=0
MaxRdLatency=60
LDIMM0=EMPTY/EMPTY LDIMM1=FAILED/EMPTY LDIMM2=EMPTY/EMPTY LDIMM3=EMPTY/EMPTY
DCT1: memory type: DDR3 frequency: 1332 MHz
Tcl=9 Trcd=9 Trp=9 Tras=24 Access Mode:1T Trtp=5 Trc=33 Twr=10 Trrd=4 Tcwl=7 Tfaw=20
TrwtWB=8 TrwtTO=7 Twtr=5 Twrrd=2 Twrwr=4 Trdrd=3 Tref=2 Trfc0=0 Trfc1=2 Trfc2=0 Trfc3=0
MaxRdLatency=51
LDIMM0=EMPTY/EMPTY LDIMM1=OK/EMPTY LDIMM2=EMPTY/EMPTY LDIMM3=EMPTY/EMPTY

Node 5 ---
DCT0: memory type: DDR3 frequency: 1332 MHz
Tcl=9 Trcd=9 Trp=9 Tras=24 Access Mode:1T Trtp=5 Trc=33 Twr=10 Trrd=4 Tcwl=7 Tfaw=20
TrwtWB=8 TrwtTO=7 Twtr=5 Twrrd=2 Twrwr=4 Trdrd=3 Tref=2 Trfc0=0 Trfc1=2 Trfc2=0 Trfc3=0
MaxRdLatency=51
LDIMM0=EMPTY/EMPTY LDIMM1=OK/EMPTY LDIMM2=EMPTY/EMPTY LDIMM3=EMPTY/EMPTY
DCT1: memory type: DDR3 frequency: 1332 MHz
Tcl=9 Trcd=9 Trp=9 Tras=24 Access Mode:1T Trtp=5 Trc=33 Twr=10 Trrd=4 Tcwl=7 Tfaw=20
TrwtWB=8 TrwtTO=7 Twtr=5 Twrrd=2 Twrwr=4 Trdrd=3 Tref=2 Trfc0=0 Trfc1=2 Trfc2=0 Trfc3=0
MaxRdLatency=51
LDIMM0=EMPTY/EMPTY LDIMM1=OK/EMPTY LDIMM2=EMPTY/EMPTY LDIMM3=EMPTY/EMPTY

Done.
thinklet@SamsungSSD:~/fah/tpc-svn64-tear2$
Node 2 ---
DCT0: memory type: DDR3 frequency: 1332 MHz
Tcl=9 Trcd=9 Trp=9 Tras=24 Access Mode:1T Trtp=5 Trc=33 Twr=10 Trrd=4 Tcwl=7 Tfaw=20
TrwtWB=6 TrwtTO=5 Twtr=5 Twrrd=2 Twrwr=4 Trdrd=3 Tref=2 Trfc0=0 Trfc1=2 Trfc2=0 Trfc3=0
MaxRdLatency=64
LDIMM0=EMPTY/EMPTY LDIMM1=FAILED/EMPTY LDIMM2=EMPTY/EMPTY LDIMM3=EMPTY/EMPTY

Node 4 ---
DCT0: memory type: DDR3 frequency: 1332 MHz
Tcl=9 Trcd=9 Trp=9 Tras=24 Access Mode:1T Trtp=5 Trc=33 Twr=10 Trrd=4 Tcwl=7 Tfaw=20
TrwtWB=6 TrwtTO=5 Twtr=5 Twrrd=2 Twrwr=4 Trdrd=3 Tref=2 Trfc0=0 Trfc1=2 Trfc2=0 Trfc3=0
MaxRdLatency=60
LDIMM0=EMPTY/EMPTY LDIMM1=FAILED/EMPTY LDIMM2=EMPTY/EMPTY LDIMM3=EMPTY/EMPTY
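To pull just the failing entries out of a long tpc -dram dump, something like the following could be used. It is only a sketch: `failed_dimms` is a hypothetical helper, and it assumes the output keeps the "Node N", "DCTn:" and "LDIMMn=..." tokens seen in the paste above.

```shell
# failed_dimms: read `tpc -dram` output on stdin and print one line per
# LDIMM entry containing FAILED, tagged with the Node and DCT it was
# listed under. Hypothetical helper for illustration only.
failed_dimms() {
    # Split the text into one token per line so that the result does not
    # depend on how the original output was wrapped.
    tr -s ' \t' '\n\n' | awk '
        prev == "Node"   { node = $0 }                            # token after "Node" is its number
        /^DCT[0-9]+:$/   { dct = substr($0, 1, length($0) - 1) }  # drop the trailing colon
        /^LDIMM[0-9]+=/ && /FAILED/ { print "Node " node " " dct " " $0 }
        { prev = $0 }
    '
}
```

Usage would be along the lines of: sudo tpc -dram | failed_dimms. For the dump posted above this should leave only the LDIMM1 entries under Node 2 DCT0 and Node 4 DCT0.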
tear, I am assuming that since it shows nodes 0-5 this equals 3 CPUs. Does this mean the second memory slot has failed on CPU #2 and #3?
Ow, that hurts. These correspond to P2_DIMM1A and P3_DIMM1A. You could try and reseat them or otherwise check for proper contact (foreign bodies in the slot or whatnot)... Is this board running [H] OCNG BIOS?
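On the node numbering: each Opteron 6100-series package contains two dies, and each die shows up as its own node, so six nodes means three CPUs. With every lower-numbered socket populated, node N sits in CPU N/2 + 1 (integer division), which matches nodes 2 and 4 landing on P2 and P3 here. A tiny sketch, with `node_to_cpu` being a made-up name:

```shell
# node_to_cpu NODE
# Assumptions: two consecutive nodes per CPU package, and every
# lower-numbered socket is populated. Hypothetical helper.
node_to_cpu() {
    echo "P$(( $1 / 2 + 1 ))"
}

node_to_cpu 2
node_to_cpu 4
```

This only gives the CPU socket; which DIMM slot a given DCT/LDIMM pair corresponds to still has to come from the board's own map.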
Yes, load optimal defaults in the BIOS, then reboot. If you see [H] logo on the screen, you're using custom BIOS.
I have a meeting that will last about 2 hours, the current fold will be finished also. If I reseat / replace chips in those two locations and it does not change does that mean next step clean pads and thoroughly check sockets? I think I was told that it has the stock Bios. What should I look for? Thanks again for all the help! Best regards, Charlie
I did not realise that you had sent the last message when I sent mine. Is there any way for the thread to update without refreshing?
tear,

Sequence of events:
At present the system is operating with 3 CPUs and can see 12 GB of memory, able to boot 10.10 and run folding.
I cleaned everything.
Booted with #1 CPU and 4 GB: works.
Found TIM on pads of #3 CPU, cleaned off with swab and alcohol.
Moved #4 CPU (original would not boot) to socket 2: works, reads 5 GB; added 3 GB, reads 8.
Moved CPU #3 back to #3 socket with 1 GB: booted but only read 8, not 9 GB.
Moved #2 CPU to #3 socket with 1 GB: booted, reads 9 GB; added 3 GB, reads 12.
Moved #3 CPU to socket #4: failed, would not boot. It appears there might be a pin in socket #4 that looks strange, but only because it does not quite look like the others. Old eyes!
Moved a different 6128 to socket #4: failed, no boot.
Moved a 2nd 6128 to socket #4: failed, no boot.

At this point it would appear that the board has a flaw in socket #4. The BIOS is dated 10/28/2011 and appears to be original? Does SuperMicro repair these, and is it cost effective? I do not know when this board was originally purchased. What would you suggest at this point? The board will limp along on 3 CPUs and 12 GB; what type of folding would it be capable of? Will the OC mod still work? I need to explore all options.

Thanks again for all your help,
Best regards, Charlie

Wow, what a learning experience!
Request an RMA from SM. They will get back to you on whether the board is still in warranty and whether it can be fixed. I have had very good experiences with SM customer service.
Thanks, I have contacted SuperMicro about the procedure to check out the board, seemed like real nice folks to deal with. Since I live on the Central Coast of California, shipping to their office should be easy as they are in San Jose, Ca. At this point the board is functional but wounded. Best regards, Charlie