Author |
Message |
LIsLinuxIsSogood Veteran
Joined: 13 Feb 2016 Posts: 1179
|
Posted: Fri Jan 18, 2019 8:19 am Post subject: Nothing to do with Gentoo (My PSU is failing) |
|
|
Hi,
First off, I apologize for a question that really has nothing to do with Gentoo. It is equipment related, I don't have another convenient place to go with it, and I've seen similar issues handled smoothly on this forum.
The issue I'm having is a power issue at the main board, which as everyone knows involves at least the two central components: the motherboard and the PSU.
A few years back I bought both of these second hand from someone who warned me about some potential issues. Too bad I don't recall which of the two the warning was about! Knowing that now might save time. I'm guessing it is the PSU, since those tend to fail, and I would like to confirm that it did not also damage the motherboard in any way.
So far the questionable part has been the power connection. Every so often, over the course of several years, the PC has refused to power on, or has power cycled (rebooting in a loop), and most of the time this is resolved by physically manipulating the cable at the connector. Once the PC is on it almost always keeps working, except for at least several times when I bumped into it hard enough to perhaps cause a similar connection issue. (This is not a production host or server.) Since the machine works most of the time, I went ahead and tested the PSU, following a tutorial I found on YouTube. It involved turning on the PSU by shorting the green and black wires on the 24-pin connector. I got it running and took voltage readings with a digital multimeter. The results looked questionable: the blue wire read 11.31 V, which is well off 12 V and more than 5% out, if 5% is what is considered acceptable (I don't know whether that is the case). That is why I suspect a failed PSU.
So I would like help confirming whether my tests are accurate and what they mean in terms of damage. If there is good reason to replace the PSU, I am fine doing that, but should I also consider buying a new board? In other words, could the motherboard have been damaged at some point, and how would I know? |
|
saboya Guru
Joined: 28 Nov 2006 Posts: 552 Location: Brazil
|
Posted: Fri Jan 18, 2019 12:54 pm Post subject: |
|
|
Not sure what the blue wire is for, but the 12v cables that actually power your devices are the yellow ones. |
|
Jaglover Watchman
Joined: 29 May 2005 Posts: 8291 Location: Saint Amant, Acadiana
|
Posted: Fri Jan 18, 2019 2:28 pm Post subject: |
|
|
Generally, a PSU should be tested under load. In the simplest case, hook it up to the motherboard, start the computer and measure the voltages. The blue wire is unlikely to be used by your computer; modern motherboards have no use for -12 V. In any case, -12 V has an allowed tolerance of 10%, so yours is within limits.
_________________
My Gentoo installation notes.
Please learn how to denote units correctly! |
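The tolerance question in the two posts above comes down to simple arithmetic, which can be sketched in shell. This is only an illustration: the function names `deviation_pct` and `within_tolerance` are made up here, the 11.31 V reading and the 10% figure come from the posts, and signs are ignored on the assumption that the -12 V rail was read as a positive number.

```shell
# Percent deviation of a reading from its nominal value.
# Signs are ignored, so -11.31 against -12 gives the same answer
# as 11.31 against 12.
deviation_pct() {
    awk -v m="$1" -v n="$2" 'BEGIN {
        if (m < 0) m = -m
        if (n < 0) n = -n
        d = m - n; if (d < 0) d = -d
        printf "%.2f\n", d / n * 100
    }'
}

# Exit status 0 if the reading is within +/- TOL percent of nominal.
within_tolerance() {
    awk -v m="$1" -v n="$2" -v tol="$3" 'BEGIN {
        if (m < 0) m = -m
        if (n < 0) n = -n
        d = m - n; if (d < 0) d = -d
        exit (d / n * 100 <= tol) ? 0 : 1
    }'
}

deviation_pct 11.31 12
within_tolerance 11.31 12 10 && echo "inside a 10% band"
```

The reading works out to 5.75% low, so it would fail a 5% limit but passes the 10% limit quoted for -12 V.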
|
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54237 Location: 56N 3W
|
Posted: Fri Jan 18, 2019 6:58 pm Post subject: |
|
|
LIsLinuxIsSogood,
PC power supplies are 'switched mode' power supplies. That's good for size and efficiency.
Switched mode power supplies should not normally be operated at no load. A few have a tendency to self destruct when you do that.
All switched mode power supplies depend on some minimum load to regulate.
A PC power supply provides, +12v, +5v, +3.3v, 0v, -5v and -12v, with the negative voltages being optional on newer revisions of the ATX specification.
Only one output is actually regulated, either the +5v or +3.3v.
The +12v operates the HDD spin motors, which have servo controls anyway and powers the CPU core voltage regulator on the motherboard.
The +5v operates the HDD electronics and odds and ends on the motherboard.
The +3.3v operates the rest of the motherboard.
If you want to measure the PSU output voltages, do it carefully, while the PC operates.
There are several failure modes to look for but PSU (metal box) failures are usually rare, total and spectacular. You won't overlook it.
So your PSU is probably good.
Switch the PC off and remove the cover. You will want a good inspection lamp.
Unplug the auxiliary 12v connector at the motherboard. It has only Black (0v) and Yellow (+12v) wires.
Have a good look at the plastic parts and the pins. There should be no charring of the plastic on either half and the contacts should be bright Yellow. That's a very thin layer of gold.
Charring of the plastic indicates the connector has been getting hot. If that's present, reconnect the connector, taking care it's the right way round, and 'waggle' the joined connector in an attempt to reduce the contact resistance. There is about 10 A or so carried by that connector, so the contact resistance must be low to avoid heating.
This sort of problem shows up mostly under high CPU loads as the actual current is related to the CPU load.
Look at the region around the CPU. You may see 10 to 20 cylindrical objects. They should all be the same, fitted flat to the motherboard, with no jelly leaking out.
Do not touch the jelly if it's there. It may be one of several unpleasant materials. Bulging tops and/or leaking contents show that the CPU core voltage regulator has failed.
These things can be replaced, a) if you can get them and b) if you are moderately skilled in the use of a soldering iron.
They all need to be changed if even one has failed. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
|
Goverp Advocate
Joined: 07 Mar 2007 Posts: 2007
|
Posted: Sat Jan 19, 2019 11:56 am Post subject: |
|
|
I disagree that PSU failures are rare. Old PSUs seem to lose the ability to provide all the stated current. I've had two fail gracefully.
I've just replaced the one that came with my 10-year old desktop. Admittedly, it was rated at about 215W and with all the disks it was drawing 205W (or something similar, I forget). The desktop's symptoms were (a) the disks took about 30 seconds to spin up - if I hit enter on the Grub boot menu too soon, Grub failed to find stage 2, and (b) once booted, plugging a drive into my USB-3 hub caused a click as one drive reset itself, complete with messages in syslog. A new 750W power supply (they were out of the cheapest 500W ones) cured it. (and meant I could install a gee whiz graphics card with more compute power than the desktop it's attached to!) _________________ Greybeard |
|
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54237 Location: 56N 3W
|
Posted: Sat Jan 19, 2019 3:46 pm Post subject: |
|
|
Goverp,
In the interests of keeping things simple, I glossed over rated power output and useful power output.
There are limits on the total output power and separately, on some combinations of output voltages. You need to stop before you hit the first limit.
I've glossed over HDD stalled motor currents too. The kernel SCSI stack has had a feature for a long time to avoid all the HDDs spinning up at the same time and embarrassing the PSU with the spin up current demand.
What you describe sounds like expected behaviour from an intermittently overloaded PSU.
I agree that PSU output quality gets worse as a PSU ages. Ripple in particular gets measurably worse. I've not had any like that interfere with normal operation, but I tend to derate my PSUs when new, as I know I will connect more stuff throughout the life of the system. |
|
LIsLinuxIsSogood Veteran
Joined: 13 Feb 2016 Posts: 1179
|
Posted: Sun Jan 20, 2019 2:41 am Post subject: |
|
|
So unfortunately I am having a hard time determining the course of action. I will start by replacing the PSU. But then does it make sense to wait to test the motherboard until I've put in the new PSU? With all the detailed information provided in the thread, I don't think anybody said that I have to purchase a new PSU, which is what I'm asking now, of course. |
|
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54237 Location: 56N 3W
|
Posted: Sun Jan 20, 2019 12:57 pm Post subject: |
|
|
LIsLinuxIsSogood,
If you have, or can borrow, a PSU to test with, do it. Don't spend money yet.
Look at the connectors and Vcore regulator. That's only a visual exam.
Post images if you want a second opinion. |
|
Ashie n00b
Joined: 09 Apr 2016 Posts: 54
|
Posted: Mon Jan 21, 2019 7:51 pm Post subject: |
|
|
Do inspect all capacitors on the mainboard and inside the PSU for signs of bulging or leaking. If you find any, there is your problem.
Caution with the PSU: it contains one or two large capacitors on its primary side, which are charged to as much as 320 V DC (in PSUs without PFC) or 450 V DC (in PSUs with PFC) whenever the PSU is plugged into the wall, and they can stay charged for long periods after it is unplugged. They can give quite a big electric shock if you touch any internal conductors on the PSU's primary side (component leads and the underside of the PCB). There is no danger in just removing the cover to take a look, but don't poke a finger in there.
Many modern mainboards (roughly Core ix era, and some of the last Core 2 boards too) have solid aluminum electrolytic capacitors. Those don't tend to fail as often, and if you have those, they are most likely intact. They can be identified by the lack of rupture lines (shaped like "X", "Y", "K", etc.) pressed into their tops. However, it seems to me that modern mainboards are plagued with other reliability problems, not capacitor related, which were not as common on older boards.
The output voltages of a PC power supply can go wrong not only in terms of presence/absence or measured average value, but also in terms of ripple. An intact power supply keeps ripple to a minimum. A failing power supply can put out excessive ripple, which can cause anything from random failures to no POST to hardware damage, while still being invisible to a multimeter measurement of DC voltage.
Some power supplies put out high ripple just because of poor quality of the power supply itself, even when brand new, and they get worse as they age. Those power supplies tend to be included for free with PC cases or sold at PC shops as the cheapest option. If you post a picture of your power supply with its cover off, it will be possible to tell if it's one of them.
A failing ATX12V connector won't necessarily have an effect on performance. The thing is, it feeds the CPU VRM on the mainboard, a buck converter that steps the 12 V at approx. 10 A down to some 1.2 V at approx. 100 A. A buck converter can keep working happily even when its input voltage is a little low, unless some extra effort is made to detect this condition.
A couple of watts of resistive loss is enough to melt the connector. That would be approx. 0.2 V at 10 A. 11.8 V is not only sufficient for the VRM to work, it's even still within ATX spec. That is, the VRM should not detect this condition as a fault at all, even if all protections are in place.
Twenty watts of resistive loss is probably enough to set the connector on fire. That would be approx. 1.6 V at 12 A (higher current draw, since the buck converter tries to make up for the lower input voltage). 10.4 V is out of spec and ideally there should be undervoltage protection to shut down the VRM, but if there isn't, the VRM itself is capable of continuing to work just fine even in this condition.
Just look for any signs of heating on the connector. It's not hard to spot if it's been going on for a while.
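The connector figures above follow from P = V x I and R = V / I, and can be reproduced with a small shell sketch. The function name `contact_loss` is made up for this illustration, and it assumes a supply sitting at exactly 12 V; the voltage drops and currents are the ones given above.

```shell
# contact_loss DROP_VOLTS AMPS
# Prints the heat dissipated in the connector (P = V * I), the contact
# resistance that would cause that drop (R = V / I), and the voltage
# left at the board, assuming the PSU delivers exactly 12 V.
contact_loss() {
    awk -v drop="$1" -v amps="$2" 'BEGIN {
        printf "%.1f W, %.3f ohm, %.1f V at the board\n",
               drop * amps, drop / amps, 12 - drop
    }'
}

contact_loss 0.2 10   # the "couple of watts" case
contact_loss 1.6 12   # the "twenty watts" case
```

The second case comes to 19.2 W across roughly 0.13 ohm of contact resistance, so a fraction of an ohm at the connector is already enough to cause serious heating.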
As you mentioned having to play with the power connector to get the PC going, I think you might be having an entirely different problem.
The big chips of the mainboard chipset (MCH, ICH) are soldered to the board with an array of tin balls (BGA soldering). Those joints tend to fail from a combination of mechanical stress on the board (flexing when inserting RAM sticks or connectors, pressure applied by the CPU cooler, etc.), heating/cooling cycles (especially with overheating), which again translate into mechanical stress, and sometimes iffy soldering quality. (Lead-free solder, while not the root cause of the problem, is much more susceptible to failure from all those causes than leaded solder, so it can be considered a contributing factor.)
When a solder point in a BGA has failed, it intermittently loses contact. It might lose contact or make contact again whenever it heats above a certain temperature, whenever the board is flexed, and so on. It is possible that you have a failing solder point on the mainboard, which you happen to get to make contact again every time you flex the board by playing with the power connectors. It will progressively fail until this doesn't help anymore.
Such a fault is repairable, but it requires some moderate or serious work on the board (repair by reflow or complete replacement of the solder balls, respectively). |
|
LIsLinuxIsSogood Veteran
Joined: 13 Feb 2016 Posts: 1179
|
Posted: Tue Jan 22, 2019 9:03 am Post subject: |
|
|
https://imgur.com/a/kpO9s2l
These are the specs that are written on the back of the PSU. Taking it apart seems like a last resort (I feel).
The connector appears in the images as well, but I can't see any problems with it.
As for the motherboard and the explanation provided, what ways are there to test the PSU, if any? Or should I just replace the PSU and see whether the problem persists, thereby determining whether something "bigger" could be wrong with the board?
UPDATE:
I have more to report since I decided to boot the machine again with the same components. The BIOS settings were reset (no biggie there), but some noise seems to start at the very moment power is applied to the board and peripherals. It sounds like it could be specific to one of the drives, so I could disconnect them one at a time to find out which one is causing it. I would really prefer that the drives not fail before I've been able to copy the data off them, if they still work. What is the preferred strategy for testing a drive? Could testing itself cause it to fail, given that I assume any activity involving reading or writing may cause damage? Also, a faint beep is heard after the loud scratchy/crashy noise of the disk. What does that mean?
REUPDATE:
Never mind about the loud noise. I totally forgot that I had placed another drive in there that never belonged there; when I went to check what was causing the noise, it was the first one I disconnected. |
|
Ashie n00b
Joined: 09 Apr 2016 Posts: 54
|
Posted: Tue Jan 22, 2019 10:09 am Post subject: |
|
|
This is a good quality PSU. I think as long as it powers on at all, provides all the right voltages (in their average values, i.e. what you can measure on a multimeter), and there aren't any puffed capacitors, it is unlikely to have other problems.
To test whether it powers on (if it doesn't with the mainboard), connect the AC input, and short the green PS-ON wire in the 24-pin connector (4th from the side: Orange-Blue-Black-Green) to one of the black ground wires. The fan has to spin, and you have to be able to measure the right voltages on the outputs. (This does not show that the PS is fully intact, but it does show that all the switching and rectification components are working.)
+12V (Yellow)
+5V (Red)
+3.3V (Orange)
-12V (Blue... This voltage is not essential for PC power up)
+5V standby (Violet, gotta be 5V whether PS is on or off)
PowerGood (Grey, gotta be at 5V when PS is on. This is an internal "PS is OK" signal)
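To turn the list above into numbers to compare multimeter readings against, here is a rough shell sketch that prints an acceptable band for each rail. The function name `rail_limits` is invented for the example, and the tolerances in the table are assumptions (5% on the positive rails and standby, 10% on -12 V as quoted earlier in the thread); check them against the ATX specification before relying on them.

```shell
# rail_limits NOMINAL_VOLTS PERCENT
# Prints the lowest and highest acceptable magnitude for a rail.
rail_limits() {
    awk -v nominal="$1" -v pct="$2" 'BEGIN {
        printf "%.3f %.3f\n", nominal * (1 - pct / 100), nominal * (1 + pct / 100)
    }'
}

# Columns: rail name, nominal magnitude in volts, tolerance in percent.
# The tolerances are assumptions for illustration, not measured values.
while read -r name nominal pct; do
    printf '%-6s %s V\n' "$name" "$(rail_limits "$nominal" "$pct")"
done <<'EOF'
+12V  12  5
+5V   5   5
+3.3V 3.3 5
-12V  12  10
+5VSB 5   5
EOF
```

For the -12 V rail the band is printed as magnitudes, so a reading of -11.31 V is compared as 11.31 V.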
To test for whether capacitors are intact, open and inspect (can't do much else without having loads and oscilloscope to test it electrically). It is not perfectly accurate, but good indication (capacitors can fail without visible damage, but it is rare for the type of capacitors used here)
No signs of heating means your connector is ok
This is as much as you can test without replacing a PS and without having more specialized instruments to test this one. If you take another PS for the test, make sure that other one is not failing either.. |
|
Ashie n00b
Joined: 09 Apr 2016 Posts: 54
|
Posted: Tue Jan 22, 2019 10:39 am Post subject: |
|
|
If you have failing hard drives, leave them disconnected until you are ready to actually back them up. All the extra spin up/down cycles aren't doing them any good.
An intact hard drive will spin up in a single go, then make some more sounds of the head actuator moving to different places for a few seconds, and then stay quiet (just spinning) until accessed by the OS. Some drives have an actuator locking mechanism that makes an audible click when it releases; this happens at the same time as spin up. Some hard drives have a loud-ish spin up, but it's hard to tell if it is a bad sound without hearing it myself.
Sounds of a bad hard drive are - no sound at all (no spin up), trying to spin up more than once (spin up, slow down or stop, spin up again), repeated clicks, or continued head actuator repositioning sounds for long after the first few seconds while still not being accessed by OS (for example, if you stay in the BIOS setup screen)
The best strategy for testing a hard drive is to back it up right away
You can use dd to dump the entire drive to a file on another (bigger) drive :
Code: | dd if=/dev/sdb of=/home/ash/entire_drive_backup |
or backup a partition of interest :
Code: | dd if=/dev/sdb1 of=/home/ash/partition_backup |
You can also copy files out the normal way and, if that fails, restart the copy while excluding the directory in which it hit bad blocks. (Return there later to try to copy more out of it; normally only a few single files will be lost, even if the drive has bad blocks that couldn't be remapped.)
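One caveat with the plain dd commands above is that dd stops at the first read error. A hedged variant, as a sketch only: the wrapper name `image_drive` is made up, and the device and output paths are the same placeholders used above. `conv=noerror,sync` is a standard dd option pair.

```shell
# image_drive SOURCE DEST: copy SOURCE to DEST block by block.
image_drive() {
    src=$1
    dst=$2
    # bs=65536     : 64 KiB blocks instead of dd's 512-byte default
    # conv=noerror : keep going after a read error instead of aborting
    # conv=sync    : pad short or failed reads with zeros, so the data
    #                after a bad spot stays at the right offset
    dd if="$src" of="$dst" bs=65536 conv=noerror,sync
}

# Example (placeholders, run as root):
# image_drive /dev/sdb /home/ash/entire_drive_backup
```

Because of `sync`, the image can end up slightly larger than the source when the source size is not a multiple of the block size, since the last block is padded with zeros. For a drive that is known to have bad sectors, GNU ddrescue (a separate package) is built for this job and is probably the better tool.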
For a quick estimate of the drive's health before putting effort into a backup (and without making the drive work hard with some "manufacturer's test tools", which could push a failing drive over the edge), use smartctl (from the smartmontools package). Here is an example from a not-so-good drive in one of my boxes:
Code: |
# smartctl --all -s on /dev/sda
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.14.61-gentoo] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Western Digital Caviar Blue Serial ATA
Device Model: WDC WD800AAJS-55PSA0
Serial Number: WD-WMAP92157008
LU WWN Device Id: 5 0014ee 0556cea9b
Firmware Version: 05.06H05
User Capacity: 80,025,280,000 bytes [80.0 GB]
Sector Size: 512 bytes logical/physical
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA/ATAPI-7 (minor revision not indicated)
Local Time is: Tue Jan 22 12:28:36 2019 IST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF ENABLE/DISABLE COMMANDS SECTION ===
SMART Enabled.
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x84) Offline data collection activity
was suspended by an interrupting command from host.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 1860) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 28) minutes.
Conveyance self-test routine
recommended polling time: ( 6) minutes.
SCT capabilities: (0x103f) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 200 200 051 Pre-fail Always - 36731
3 Spin_Up_Time 0x0003 162 158 021 Pre-fail Always - 2883
4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 705
5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
7 Seek_Error_Rate 0x000e 200 200 051 Old_age Always - 0
9 Power_On_Hours 0x0032 044 044 000 Old_age Always - 41083
10 Spin_Retry_Count 0x0012 100 100 051 Old_age Always - 0
11 Calibration_Retry_Count 0x0012 100 100 051 Old_age Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 626
192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 174
193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 878
194 Temperature_Celsius 0x0022 111 093 000 Old_age Always - 32
196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0
197 Current_Pending_Sector 0x0012 200 196 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 200 196 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x0008 200 199 051 Old_age Offline - 0
SMART Error Log Version: 1
ATA Error Count: 6831 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
Error 6831 occurred at disk power-on lifetime: 38934 hours (1622 days + 6 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 04 e3 ad ad e0 Error: UNC at LBA = 0x00adade3 = 11382243
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
40 00 04 e3 ad ad 00 00 00:05:33.373 READ VERIFY SECTOR(S)
c8 00 01 00 00 00 00 00 00:05:33.373 READ DMA
40 00 04 df ad ad 00 00 00:05:30.434 READ VERIFY SECTOR(S)
40 00 04 db ad ad 00 00 00:05:27.350 READ VERIFY SECTOR(S)
c8 00 01 f0 62 a9 03 00 00:05:27.350 READ DMA
Error 6830 occurred at disk power-on lifetime: 38934 hours (1622 days + 6 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 04 df ad ad e0 Error: UNC at LBA = 0x00adaddf = 11382239
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
40 00 04 df ad ad 00 00 00:05:30.434 READ VERIFY SECTOR(S)
40 00 04 db ad ad 00 00 00:05:27.350 READ VERIFY SECTOR(S)
c8 00 01 f0 62 a9 03 00 00:05:27.350 READ DMA
40 00 04 d7 ad ad 00 00 00:05:24.694 READ VERIFY SECTOR(S)
c8 00 01 00 00 00 00 00 00:05:24.694 READ DMA
Error 6829 occurred at disk power-on lifetime: 38934 hours (1622 days + 6 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 04 db ad ad e0 Error: UNC at LBA = 0x00adaddb = 11382235
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
40 00 04 db ad ad 00 00 00:05:27.350 READ VERIFY SECTOR(S)
c8 00 01 f0 62 a9 03 00 00:05:27.350 READ DMA
40 00 04 d7 ad ad 00 00 00:05:24.694 READ VERIFY SECTOR(S)
c8 00 01 00 00 00 00 00 00:05:24.694 READ DMA
40 00 04 d3 ad ad 00 00 00:05:22.072 READ VERIFY SECTOR(S)
Error 6828 occurred at disk power-on lifetime: 38934 hours (1622 days + 6 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 04 d7 ad ad e0 Error: UNC at LBA = 0x00adadd7 = 11382231
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
40 00 04 d7 ad ad 00 00 00:05:24.694 READ VERIFY SECTOR(S)
c8 00 01 00 00 00 00 00 00:05:24.694 READ DMA
40 00 04 d3 ad ad 00 00 00:05:22.072 READ VERIFY SECTOR(S)
40 00 04 cf ad ad 00 00 00:05:19.433 READ VERIFY SECTOR(S)
c8 00 01 00 00 00 00 00 00:05:19.433 READ DMA
Error 6827 occurred at disk power-on lifetime: 38934 hours (1622 days + 6 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 04 d3 ad ad e0 Error: UNC at LBA = 0x00adadd3 = 11382227
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
40 00 04 d3 ad ad 00 00 00:05:22.072 READ VERIFY SECTOR(S)
40 00 04 cf ad ad 00 00 00:05:19.433 READ VERIFY SECTOR(S)
c8 00 01 00 00 00 00 00 00:05:19.433 READ DMA
40 00 08 e7 ad ad 00 00 00:05:16.638 READ VERIFY SECTOR(S)
c8 00 01 00 00 00 00 00 00:05:16.638 READ DMA
SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
|
Look in the table at the following attributes and RAW_VALUE column :
1 Raw_Read_Error_Rate
5 Reallocated_Sector_Ct
7 Seek_Error_Rate
10 Spin_Retry_Count
11 Calibration_Retry_Count
196 Reallocated_Event_Count
197 Current_Pending_Sector
198 Offline_Uncorrectable
199 UDMA_CRC_Error_Count
200 Multi_Zone_Error_Rate
(some may be or not be present depending on drive manufacturer)
The most critical are 5, 196, 197
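To avoid reading the whole table by eye, the attributes listed above can be pulled out with a short awk filter. This is a sketch under some assumptions: the function name `smart_summary` is made up, it expects the ten-column attribute table layout shown in the output above, and it treats any non-zero raw value on 5, 196 or 197 as worth flagging.

```shell
# Print ID, name and raw value for the attributes listed above, and flag
# a non-zero raw value on the three most critical ones (5, 196, 197).
smart_summary() {
    awk '
        NF >= 10 && $1 ~ /^(1|5|7|10|11|196|197|198|199|200)$/ {
            flag = ""
            if (($1 == 5 || $1 == 196 || $1 == 197) && $10 + 0 > 0)
                flag = "  <-- non-zero"
            printf "%3d %-24s %s%s\n", $1, $2, $10, flag
        }
    '
}

# Usage (run as root; -A prints only the attribute table):
# smartctl -A /dev/sda | smart_summary
```

Feeding it `smartctl -A` rather than `--all` keeps the error log and self-test sections out of the input.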
The "Error 6831 occurred at...." blocks below the table are there only for a drive that has had some errors happen. For a perfectly good drive (one that hasn't failed itself and hasn't seen errors caused by a failing mainboard etc. either), there won't be any of that in the output. |
|
LIsLinuxIsSogood Veteran
Joined: 13 Feb 2016 Posts: 1179
|
Posted: Tue Jan 22, 2019 10:54 am Post subject: |
|
|
Thanks for the info about smartctl. It was already installed, so I just ran it on three drives and all seem healthy.
Other than the potential for continued PSU/motherboard failures (which are intermittent), I guess I will go about my business as usual in the hope that no damage occurs to the machine. It is really only the data on disk that I care about, which is why I will now prioritize backups and store them elsewhere, such as on an external HD. Thanks to everyone for the suggestions. I will be sure to look for a good PSU to replace this one, although as Ashie says, I'm not sure that will actually fix the problem, since it could be electrical issues in the board that are causing the malfunction. |
|
Ashie n00b
Joined: 09 Apr 2016 Posts: 54
|
Posted: Tue Jan 22, 2019 11:30 am Post subject: |
|
|
If the PSU is cleared by further testing, I'd not consider it bad or in need of replacement, at least for a non-mission-critical machine.
PS. Have you tried wiggling the RAM and the video card, and reseating the CPU in its socket (also look for bent pins in the socket)? Beware that if it really is failing solder points, the stress from taking off and reinstalling the CPU heatsink can push the board over the edge. |
|
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54237 Location: 56N 3W
|
Posted: Tue Jan 22, 2019 12:00 pm Post subject: |
|
|
LIsLinuxIsSogood,
That's a good choice of PSU. The specification says it has a 1 x 4+4pin CPU +12V power connector, not shown in your images.
That's the important one as it supplies the power to the CPU, RAM and so on, via the regulator on the motherboard.
As your PSU generates two separate +12v supplies, one will be used for the CPU and one for everything else that needs 12v.
You can probably see enough of the insides to spot failing capacitors without taking the cover off, so don't do that. |
|