whiskeypriest Tux's lil' helper
Joined: 05 Feb 2004 Posts: 91
Posted: Tue Oct 12, 2004 5:17 am
First things first: I apologize for not getting back to this sooner. The weekend got away from me, and it wasn't until I read the latest post that I realized I'd forgotten to check my own system for this problem.
Secondly, I'd like to thank you (ectospasm) for bringing this to my attention. Not only does this seem to be a problem for apcupsd-3.10.15, it also seems to be a problem for all Gentoo versions back through 3.10.10-r2. As I previously mentioned, when I test new iterations of apcupsd I generally set the TIMEOUT value to 30, and if this abbreviated power failure routine ran according to expectations, I considered all to be well. My fault for not running a real simulation; that was extremely sloppy.
As you may have already gathered, I've run a series of full power failure tests this evening. Without exception, apcupsd-3.10.13 and apcupsd-3.10.10-r2 both initiated a shutdown less than thirty seconds after I pulled the plug, well before the battery had been exhausted, though both listed battery exhausted in apcupsd.events as the reason for shutdown. From what you've already reported, I'll go out on a limb and say that apcupsd-3.10.15 performs similarly.
For the time being, apcupsd-3.10.10-r1 is the latest version which seems to handle power failures as expected. I just completed a full power failure test using this iteration while keeping a close eye on my logs and the reports from my central monitoring station. Both the battery charge and the remaining runtime decreased/increased as expected while the battery was under load and I varied the strain on the system. At precisely three-minutes-remaining on my central monitor, the system initiated a successful shutdown. In short, apcupsd-3.10.10-r1 did what it was supposed to.
ectospasm, can I ask you to downgrade your installation to 3.10.10-r1 in order to confirm that this version operates as it should for your system parameters? So far as I'm aware, there should be no major security/functionality issues involved in downgrading (he said, crossing his fingers).
In the meantime, I'd encourage anyone following along at home to run a full power failure test and post back here if they're experiencing the problems detailed above. I'd like to get this issue confirmed before going to the apcupsd-users list or updating the main document.
I realize this doesn't fix the current problem, but to be honest, my first concern is keeping everyone on their proverbial feet and confirming the universality of this issue before troubleshooting what's happened in the last few iterations.
Thanks again for your patience and vigilance.

whiskeypriest Tux's lil' helper
Joined: 05 Feb 2004 Posts: 91
Posted: Wed Oct 13, 2004 12:10 am
I remember this being an issue back in April when 3.10.10-r1 first hit, but I assume you've synced since then (i.e. I thought the problem with the ebuild had been corrected).
Looked at the old thread, didn't find much useful.
My procedure for downgrading was as follows: I backed up apcupsd.conf and hosts.conf, then unmerged 3.10.13 (my version). Removed the entire /etc/apcupsd directory. Masked all versions beyond 3.10.10-r1, then emerged same. Replaced both .conf files, started the daemon, and everything ran fine.
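For anyone who wants to repeat the downgrade described above, it could be scripted roughly as follows. This is only a sketch, not the exact commands that were run: the backup directory, the `sys-apps/apcupsd` atom, the `/etc/portage/package.mask` location, and the step that stops the daemon first are all assumptions. The helper only prints the commands unless `DRY_RUN=0` is set.

```shell
# Sketch of the downgrade procedure described above.
# Assumptions (not from the original post): the backup location, the
# sys-apps/apcupsd atom, /etc/portage/package.mask as the mask file,
# and stopping the daemon before unmerging.
# Commands are only printed unless DRY_RUN=0 is set in the environment.

BACKUP_DIR="${BACKUP_DIR:-/root/apcupsd-backup}"   # hypothetical path
ATOM="sys-apps/apcupsd"
KEEP_VERSION="3.10.10-r1"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

downgrade_apcupsd() {
    run mkdir -p "$BACKUP_DIR"
    run cp /etc/apcupsd/apcupsd.conf /etc/apcupsd/hosts.conf "$BACKUP_DIR/"
    run /etc/init.d/apcupsd stop
    run emerge --unmerge "$ATOM"
    run rm -rf /etc/apcupsd
    # Mask every version newer than the one to keep.
    run sh -c "echo '>${ATOM}-${KEEP_VERSION}' >> /etc/portage/package.mask"
    run emerge "=${ATOM}-${KEEP_VERSION}"
    run cp "$BACKUP_DIR/apcupsd.conf" "$BACKUP_DIR/hosts.conf" /etc/apcupsd/
    run /etc/init.d/apcupsd start
}
```

Calling `downgrade_apcupsd` prints the nine steps for review (the printed form drops shell quoting, so it is not meant for copy and paste); `DRY_RUN=0 downgrade_apcupsd` as root would actually run them.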
Any of this of use? Perhaps another unmerge/emerge is warranted.
Let me know how it goes.

ectospasm l33t
Joined: 19 Feb 2003 Posts: 711 Location: Mobile, AL, USA
Posted: Wed Oct 13, 2004 4:35 am
I tried emerging both available versions of 3.10.10 (-r1 and -r2) and neither of them built the binaries. There are some errors that don't seem related to the bugs mentioned in the HOWTO thread. I would prefer to have the latest stable release running anyway; re-emerging 3.10.15-r1 solved this latest problem.
I'll see if the problem of shutting down prematurely occurs in Windows. I hope I can keep some sort of log there with PowerChute.
_________________
Join the adopt an unanswered post initiative today
Join the EFF!
Join the Drug Policy Alliance!

whiskeypriest Tux's lil' helper
Joined: 05 Feb 2004 Posts: 91
Posted: Wed Oct 13, 2004 3:49 pm
Ye gods...what a mess. Apologies for making you jump through the emerge-hoops for nothing, but I completely understand: something (i.e. fifteen minutes of runtime plus graceful shutdown) is better than nothing.
For completeness' sake, I emerged 3.10.15-r1 this morning and ran a power failure test...the same premature shutdown results as with 3.10.13 and 3.10.10-r2. I'm now basically out of ideas on this one; it sounds like you've got enough to weather a brief outage, so I'll leave it to your discretion whether to wait for things to improve in the next version or to take your problem to the apcupsd-users list. Please let me know what you find/decide regardless.
Having heard nothing else about the premature shutdown issues from the community, I'm still undecided about advocating the downgrade (especially in light of your experiences). I need to do some further testing here to see if I can replicate the problem on my other systems (2.4 vs. 2.6 kernels, varying APC models). I'll post back with the results later today.

whiskeypriest Tux's lil' helper
Joined: 05 Feb 2004 Posts: 91
Posted: Thu Oct 14, 2004 4:29 am
So after upgrading my 2.6 kernels and lots of power failure testing, everything now performs as expected. 2.4 kernels never experienced any problems. The light at the end of the tunnel?
Here's what I'm running:
- P4 2.53GHz, SiS chipset, gentoo-dev-sources-2.6.8-r8, Back-UPS XS 800, apcupsd-3.10.13, standalone configuration: performs as expected.
- P3 600MHz, Intel chipset, gentoo-dev-sources-2.6.8-r8, Back-UPS XS 800, apcupsd-3.10.13, standalone configuration: performs as expected.
- Celeron 533MHz, VIA chipset, gentoo-sources-2.4.25_pre7-r11, Back-UPS RS 1000, apcupsd-3.10.13, netslave configuration: performs as expected.
- Celeron 733MHz, Intel chipset, gentoo-sources-2.4.25_pre7-r11, Back-UPS RS 1000, apcupsd-3.10.13, netmaster configuration: performs as expected.
Makes me wonder if the problem was with apcupsd to begin with. How are things in your world?

ectospasm l33t
Joined: 19 Feb 2003 Posts: 711 Location: Mobile, AL, USA
Posted: Thu Oct 14, 2004 4:00 pm
I'm upgrading to gentoo-dev-sources-2.6.8-r9... I've been running 2.6.7-r11 since the VFAT support in every 2.6.8 kernel I've tried has been severely broken. Hopefully they've fixed the problem, and in the process fixed my premature shutdown problem.
I won't be able to test it until later this weekend. I've got a program due tomorrow and I need to get it done.

whiskeypriest Tux's lil' helper
Joined: 05 Feb 2004 Posts: 91
Posted: Thu Oct 14, 2004 5:27 pm
Roger that.
Best of luck with all of it...let me know how it turns out when you have the time.

whiskeypriest Tux's lil' helper
Joined: 05 Feb 2004 Posts: 91
Posted: Mon Oct 25, 2004 10:14 pm
Still out there, ectospasm? Anything improved with gentoo-dev-sources-2.6.9-r1/apcupsd-3.10.15-r1?

ectospasm l33t
Joined: 19 Feb 2003 Posts: 711 Location: Mobile, AL, USA
Posted: Tue Oct 26, 2004 7:39 pm
Not yet... I haven't been able to upgrade my kernel past 2.6.7-gentoo-r11 because of the VFAT support; I wish I didn't need it but I do. Anyway, a few days ago I tried a power fail test (unplugged the UPS from the wall) while I was in Windows, running APC PowerChute Personal Edition. Unfortunately that software doesn't allow for timestamps or logging, so I couldn't get an exact time, but it seemed to perform as expected; it gave me at least 45min if not an hour of uptime with the power off, and this is all on the same hardware. Trying the test back in Gentoo gave the following results:
Code:
Tue Oct 26 14:15:19 CDT 2004 Power failure.
Tue Oct 26 14:15:25 CDT 2004 Running on UPS batteries.
Tue Oct 26 14:23:49 CDT 2004 Reached remaining time percentage limit on batteries.
Tue Oct 26 14:23:49 CDT 2004 Initiating system shutdown!
Tue Oct 26 14:23:49 CDT 2004 User logins prohibited
Tue Oct 26 14:23:49 CDT 2004 BCHARGE : 091.0 Percent
Tue Oct 26 14:23:49 CDT 2004 TIMELEFT : 0.0 Minutes
Tue Oct 26 14:23:51 CDT 2004 apcupsd exiting, signal 15
Tue Oct 26 14:23:51 CDT 2004 apcupsd shutdown succeeded

As you can see it didn't even last 10min today. Hopefully this is kernel related, and updating to the newest kernel will solve all my problems.
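To put a number on "didn't even last 10min", the time on battery can be worked out from the first and last timestamps in the log above. A small sketch; `to_seconds` and `elapsed` are made-up helper names, and it assumes both timestamps fall on the same day:

```shell
# Convert an HH:MM:SS timestamp to seconds since midnight.
# awk is used so that values like "08" are read as decimal, not octal.
to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Print the time between two same-day timestamps as "<min>m <sec>s".
elapsed() {
    start=$(to_seconds "$1")
    end=$(to_seconds "$2")
    diff=$((end - start))
    echo "$((diff / 60))m $((diff % 60))s"
}

elapsed 14:15:19 14:23:49   # power failure to shutdown; prints: 8m 30s
```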

ectospasm l33t
Joined: 19 Feb 2003 Posts: 711 Location: Mobile, AL, USA
Posted: Wed Oct 27, 2004 4:24 am
I fixed my kernel problems, but it only marginally helped the apcupsd problem:
Code:
Tue Oct 26 22:12:16 CDT 2004 Power failure.
Tue Oct 26 22:12:22 CDT 2004 Running on UPS batteries.
Tue Oct 26 22:40:09 CDT 2004 Reached remaining time percentage limit on batteries.
Tue Oct 26 22:40:09 CDT 2004 Initiating system shutdown!
Tue Oct 26 22:40:09 CDT 2004 User logins prohibited
Tue Oct 26 22:40:09 CDT 2004 BCHARGE : 071.0 Percent
Tue Oct 26 22:40:09 CDT 2004 TIMELEFT : 0.0 Minutes
Tue Oct 26 22:40:11 CDT 2004 apcupsd exiting, signal 15
Tue Oct 26 22:40:11 CDT 2004 apcupsd shutdown succeeded

It's better than my last post, but still much worse than I expected. I'm about to go to apcupsd-users.

whiskeypriest Tux's lil' helper
Joined: 05 Feb 2004 Posts: 91
Posted: Wed Oct 27, 2004 5:20 am
Yeah...I was already grasping at straws, and the version increase was basically the last I had to offer on your problem.
I sincerely hope someone at apcupsd-users can get you an answer...please let me know what you find, and good luck with it.

ectospasm l33t
Joined: 19 Feb 2003 Posts: 711 Location: Mobile, AL, USA
Posted: Fri Oct 29, 2004 6:34 am
Here's the answer I got on apcupsd-users (from an apcupsd developer):
Adam Kropelin wrote:
I found my RS 1500 units needed a runtime calibration before the remaining time estimate was even close. Disable apcupsd, put a load of 50% or more on the UPS, and pull the plug. Let the batteries run completely dry, then plug it back in and let it recharge. After that the remaining time estimate should be much more accurate.
I suspect the reason you are seeing TIMELEFT at 0.0 is that it does not scale down nice and linearly, especially when the calibration is way off. It may drop from > 3.0 down to 0.0 in the course of a few seconds. I use a little script like this to watch the behavior of TIMELEFT and BCHARGE over time:
while [ 1 ] ; do
    /sbin/apcaccess status | egrep TIMELEFT\|BCHARGE ;
    sleep 1 ;
done
You can throw all that on one command line; I just wrapped it for the sake of fitting it in the email margins. It will give you an update of those variables each second, which is a good way to spot trends.

So, I need to calibrate it... I wonder if not letting it charge for the full eight hours before I started using it had anything to do with it not calibrating properly... I think I'll let it charge for eight before I plug everything back in.
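A possible variation on the quoted loop, for anyone who wants the readings timestamped and kept in a file so the drop in TIMELEFT can be studied after the test. This is a sketch rather than anything from the apcupsd list: `sample_ups`, `watch_ups`, and the default log path are invented for illustration, and the parsing assumes `apcaccess status` prints lines shaped like `BCHARGE : 091.0 Percent`, as in the event logs posted earlier.

```shell
APCACCESS="${APCACCESS:-/sbin/apcaccess}"   # path taken from the quoted script

# Print one line: "<timestamp> BCHARGE=<percent> TIMELEFT=<minutes>".
sample_ups() {
    "$APCACCESS" status | awk -v now="$(date '+%Y-%m-%d %H:%M:%S')" '
        $1 == "BCHARGE"  { charge = $3 }
        $1 == "TIMELEFT" { left = $3 }
        END { printf "%s BCHARGE=%s TIMELEFT=%s\n", now, charge, left }'
}

# Sample every <interval> seconds until interrupted, appending to <logfile>.
watch_ups() {
    interval="${1:-1}"
    logfile="${2:-/tmp/ups-trend.log}"   # hypothetical log location
    while true; do
        sample_ups | tee -a "$logfile"
        sleep "$interval"
    done
}
```

Something like `watch_ups 1 /var/log/ups-trend.log` in a spare terminal during a power failure test should leave a record of how quickly the two values fall.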

whiskeypriest Tux's lil' helper
Joined: 05 Feb 2004 Posts: 91
Posted: Fri Oct 29, 2004 2:12 pm
Interesting.
I vaguely remember reading something about proper calibration when I was first working with apcupsd, but I never thought it would lead to that wide of a discrepancy between reported and actual runtime. Then again, I don't own an RS 1500, and I did let all my units charge for a minimum of eight hours (no load) before pressing them into service.
I'll be curious to see how much of a difference this procedure makes; hopefully it'll resolve the issue for you, and thanks for the update.

ectospasm l33t
Joined: 19 Feb 2003 Posts: 711 Location: Mobile, AL, USA
Posted: Sat Oct 30, 2004 6:59 pm
This procedure made a HUGE difference. Here's the log of the last test:
Code:
Sat Oct 30 12:15:49 CDT 2004 Power failure.
Sat Oct 30 12:15:55 CDT 2004 Running on UPS batteries.
Sat Oct 30 13:30:31 CDT 2004 Battery power exhausted.
Sat Oct 30 13:30:31 CDT 2004 Initiating system shutdown!
Sat Oct 30 13:30:31 CDT 2004 User logins prohibited
Sat Oct 30 13:30:31 CDT 2004 Running on UPS batteries.
Sat Oct 30 13:30:31 CDT 2004 BCHARGE : 009.0 Percent
Sat Oct 30 13:30:31 CDT 2004 TIMELEFT : 4.3 Minutes
Sat Oct 30 13:30:34 CDT 2004 apcupsd exiting, signal 15
Sat Oct 30 13:30:34 CDT 2004 apcupsd shutdown succeeded

Now the TIMELEFT is not 0.0, which is really good. I'm very happy with this, but I'm still a bit concerned that the shutdown started while both values were still above their thresholds (BATTERYLEVEL 5 and MINUTES 3 in apcupsd.conf).
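That concern can be checked mechanically. The sketch below assumes, as I understand the apcupsd documentation, that a threshold counts as reached when the reported value is less than or equal to the configured one; `check_thresholds` is a made-up helper, not part of apcupsd.

```shell
# Report which configured thresholds a sample has reached.
# Arguments: BCHARGE (percent), TIMELEFT (minutes), BATTERYLEVEL, MINUTES.
check_thresholds() {
    awk -v charge="$1" -v left="$2" -v level="$3" -v minutes="$4" 'BEGIN {
        hit = 0
        if (charge + 0 <= level + 0)  { print "BATTERYLEVEL reached"; hit = 1 }
        if (left + 0 <= minutes + 0)  { print "MINUTES reached"; hit = 1 }
        if (!hit) print "no threshold reached"
    }'
}

check_thresholds 009.0 4.3 5 3   # values from the log above; prints: no threshold reached
```

With the pre-calibration readings (`check_thresholds 091.0 0.0 5 3`) the same check reports `MINUTES reached`, which matches the "remaining time" reason logged in those earlier tests.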
One thing that I did notice was that 10 hrs of loadless charging did not give me a 100% charge. Doesn't seem to have hurt the calibration at all.
I'm considering my problem solved. You might want to put in the HOWTO the importance of calibrating, at least for the RS 1500...

whiskeypriest Tux's lil' helper
Joined: 05 Feb 2004 Posts: 91
Posted: Sat Oct 30, 2004 9:09 pm
Glad to hear it!
I'll make a note of the calibration issue in the main document within the next few hours...meanwhile, I hope it continues to run smoothly for you.
Also, thanks for all you've done to ferret this issue out and get it resolved.

whiskeypriest Tux's lil' helper
Joined: 05 Feb 2004 Posts: 91
Posted: Fri Dec 31, 2004 9:52 pm
Attention suhlhorn (and anyone else playing along at home):
mark_lagace has posted (what seems to be) a viable workaround for the UPS killpower issue that was discussed here back in September. Please have a look.
If there are any issues with this, feel free to discuss them here. Thanks again, Mark.

TheKat n00b
Joined: 24 Jan 2004 Posts: 49
Posted: Wed Jan 05, 2005 3:58 pm Post subject: The gotchas I experienced
I just got apcupsd working on my system (sys-apps/apcupsd-3.10.15-r1 with sys-kernel/nitro-sources-2.6.9-r4). Here are the gotchas I experienced.
My hardware setup (pre UPS) is as follows:
- 2 USB ports on back of system.
- One port occupied by USB hub built into keyboard (keyboard is PS2 keyboard with a USB hub in it)
- One port on keyboard occupied by USB trackball.
With this configuration, whenever I connected the UPS USB cable, the hub would shut down, disconnecting my mouse. After some testing I determined it was definitely the hub. If I connected the trackball directly to the port it didn't happen. Through multiple reboots (warm and cold) and re-insertions, this kept happening.
I booted Windows to check the behaviour there. Everything worked fine. I booted Linux again. Everything worked fine. It continues to work now. I have no idea what happened...
Once the USB problem was resolved, I continued with the HOWTO.
- During the Communication Test phase, apcupsd did not log anything when the USB cable was disconnected. However, when the cable was reconnected, it did log the reconnect event. I twiddled kernel modules, recompiled using the HOWTO module/builtin layout, and no change. After some pondering, I decided to go ahead with the other tests.
- When I rebooted the system with apcupsd in the default runlevel, the network failed to start. After some thought I realized that apcupsd was starting up before the coldplugging of PCI slots, so the network was not yet initialized. This was interfering with overall network initialization. Apparently the network start scripts would run before the NIC module was present. I manually added my NIC module to the module autoload file (/etc/modules.autoload.d/kernel-2.6), and everything started working great.
I'm not sure if the network issue is a bug with the apcupsd package, or a more general bug with the Gentoo startup mechanism. Normally the NIC driver is loaded by coldplug, which would also start the network, but as apcupsd started up before coldplug, things got loaded out of 'normal' sequence.
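The autoload workaround amounts to putting the NIC driver's module name on its own line in the autoload file. A minimal sketch, with `add_autoload_module` as an invented helper and `e100` purely as a placeholder for whatever driver your card actually uses:

```shell
# Append a module name to the autoload file unless it is already listed.
# Arguments: module name, optional path to the autoload file.
add_autoload_module() {
    module="$1"
    file="${2:-/etc/modules.autoload.d/kernel-2.6}"
    if grep -qxF "$module" "$file" 2>/dev/null; then
        echo "$module already listed in $file"
    else
        echo "$module" >> "$file"
        echo "added $module to $file"
    fi
}
```

Run as root, `add_autoload_module e100` would add the placeholder module once and do nothing on later runs.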
I don't have a good answer to that one, but everything is now working for me.

whiskeypriest Tux's lil' helper
Joined: 05 Feb 2004 Posts: 91
Posted: Thu Jan 06, 2005 4:39 pm
Thanks for posting.
All I can offer on the subject of your USB troubles is a congratulations that it's working now. The vagaries of USB hubs and peripherals are legion, and my advice is usually limited to a reiteration of APC's recommendation that the UPS not be plugged into a hub. Since it sounds as though you were following that advice to begin with, I'll reiterate my previous congratulations and move on.
The issue with your NIC is something I hadn't heard of or encountered before; again, glad it's working now. For the record, I avoid dealing with modules unless I have no other choice -- hence, my solution to that problem would have been a bit different.
My first stop would probably have been the /etc/init.d/apcupsd file:
Code:
#!/sbin/runscript
# Copyright 1999-2002 Gentoo Technologies, Inc.
# Distributed under the terms of the GNU General Public License, v2 or later
# $Header: /cvsroot/apcupsd/apcupsd/platforms/gentoo/apcupsd.in,v 1.1 2002/09/14 12:03:18 rfacchetti Exp $

APCPID=/var/run/apcupsd.pid
APCUPSD=/usr/sbin/apcupsd

depend() {
    after hotplug
    after usb
    after net
}

start() {
    rm -f /etc/apcupsd/powerfail
    ebegin "Starting APC UPS daemon"
    start-stop-daemon --start --quiet --exec $APCUPSD -- 1>&2
    eend $?
}

stop() {
    ebegin "Stopping APC UPS daemon"
    start-stop-daemon --stop --quiet --pidfile $APCPID
    eend $?
}
Since I believe this version of the script predates the hotplug-coldplug changeover, I might have tried adding...
...to apcupsd's dependencies. You're welcome to give this a shot and let me know if it proves to be a more flexible solution.
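The line that was to be added did not survive in this copy of the thread. Given the mention of the hotplug-coldplug changeover, one plausible reading, and it is only a guess, is an extra `after coldplug` entry in `depend()`:

```shell
# Hypothetical depend() for /etc/init.d/apcupsd. The "after coldplug"
# line is an assumption about what the missing snippet contained; the
# other three lines come from the init script quoted above.
depend() {
    after hotplug
    after coldplug
    after usb
    after net
}
```

If that guess is right, the init system would order apcupsd after coldplug instead of relying on the NIC module being autoloaded by hand.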
Then again, there's also the ain't-broke-don't-fix school of thought...if you're happy with it, outstanding.
Thanks again for posting, and I hope it continues to run well for you.

ectospasm l33t
Joined: 19 Feb 2003 Posts: 711 Location: Mobile, AL, USA
Posted: Fri Jan 28, 2005 6:05 pm
I'm having problems with mark_lagace's suggestion for shutting down the UPS and having it restart the computer when power comes back. I'm using apcupsd 3.10.16 (this may be a source of the problem, since the method might be different for that version).
When I disconnect the power to run a powerfail test (with TIMEOUT set to 30 to speed up the test), the computer begins to shut down, but never finishes. It just hangs with a blank screen, presumably at a point where shutdown is safe. I made sure to wait longer than the two minute sleep in /etc/init.d/halt.sh.
Here's the tail of my /etc/init.d/halt.sh:
Code:
# Attempting to add apcupsd's killpower function here
if [ -f /etc/apcupsd/powerfail ]; then
    ewarn "Power failure - shutting off UPS"
    /etc/apcupsd/apccontrol killpower
    # Give the UPS time to cut power before the script returns
    sleep 120
    exit 1
fi
I'm suspecting that "exit 1" could be causing a problem, since the halt script is exiting with a non-zero return. I'm going to do a test with a normal halt (or shutdown -h) command, to see if the computer is shutdown properly that way.
[edit] It seems there is a difference between the "halt" and "shutdown -h now" commands (I thought "halt", when called from the command line, simply calls "shutdown -h now"; I was wrong). When I issue the "shutdown -h now" command, my computer shuts down and powers off gracefully. If I issue the "halt" command, it seems to hang the way it did when testing the --killpower functionality above.

whiskeypriest Tux's lil' helper
Joined: 05 Feb 2004 Posts: 91
Posted: Tue Feb 01, 2005 1:18 am
Have you had any luck troubleshooting this?
For what it's worth, my system behaves in a similar fashion: with the X-server running at power failure, the screen images will "freeze" in place and remain until the system loses power (due to the UPS unit itself shutting down). From a straight console (no X-server) I get to see all the usual messages, including apcupsd communicating with the UPS through the killpower procedure...after which it hangs there (again, until the UPS powers itself down). All logs viewed on restoring power indicate filesystems cleanly unmounted, processes terminated, etc. -- it just doesn't look real pretty. This is with version 3.10.15-r1, doing both abbreviated and full power failures.
Does your UPS successfully power itself off using mark_lagace's procedures?
The other thing I'm curious about is this: it was my understanding that 3.10.16 was supposed to have this killpower functionality enabled for USB UPS units. Have you tried testing for this functionality as-is, rather than using mark_lagace's suggestion?

mmarkin n00b
Joined: 13 Dec 2003 Posts: 43
Posted: Mon Mar 14, 2005 3:42 pm
Hey everyone.
I have a weird situation with the multimon.cgi script running via Apache. The script works just fine when all machines specified in hosts.conf are up and running. However, when one (or more) of the listed machines is down, the script has a bad tendency to sit there and wait for quite a long time before deciding that a particular machine is unreachable. Does anyone know of a place where I can tweak the time-out value?
Thanks
Mikhail

whiskeypriest Tux's lil' helper
Joined: 05 Feb 2004 Posts: 91
Posted: Sun Mar 20, 2005 8:13 pm
Sorry for the delay in replying; been on the road the past week.
I'm also sorry to say that I haven't the foggiest idea where to begin with your problem -- I have a central "monitoring box" which keeps track of my four UPS-units, and have never experienced the behavior you describe. The machines being monitored are supported by UPS-units of various types in both standalone and master/slave configurations, and all update their status without issue.
I had a look around the apcupsd-users list, and this was the closest thing I could come up with...which doesn't really sound like your issue at all, but may be worth a look.
If anyone else has any ideas, please feel free to hold forth; Apache's not my strong-suit, and its interface with apcupsd has always "just worked" for me. Again, sorry I couldn't be of more help...please keep us posted if you find a solution.
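For anyone chasing the slow-page behavior described above, one way to find out which monitored machine is holding things up is to query each host with a hard time limit. This does not change any timeout inside multimon.cgi; it is only a diagnostic sketch. It assumes hosts.conf lines of the form `MONITOR <host> "<description>"`, the coreutils `timeout` command, and `apcaccess` on the PATH; `list_monitored_hosts` and `probe_hosts` are invented names.

```shell
# List the hosts named on MONITOR lines of a hosts.conf-style file.
# Assumes lines of the form:  MONITOR <host> "<description>"
list_monitored_hosts() {
    awk '$1 == "MONITOR" { print $2 }' "${1:-/etc/apcupsd/hosts.conf}"
}

# Query each host with a time limit and report the ones that fail.
# Arguments: optional hosts file, optional limit in seconds (default 3).
probe_hosts() {
    limit="${2:-3}"
    for host in $(list_monitored_hosts "$1"); do
        if timeout "$limit" apcaccess status "$host" >/dev/null 2>&1; then
            echo "$host ok"
        else
            echo "$host failed (no answer within ${limit}s, or another error)"
        fi
    done
}
```

Running `probe_hosts /etc/apcupsd/hosts.conf 3` should list each host on its own line, which at least narrows down where the CGI page is waiting.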