Increasing occurrence of "BBU disabled" messages

Solution 1

Well, the official Intel RAID Smart Battery AXXRSBBU3 technical product specification says “Intel recommends replacing the battery yearly”, so a battery going bad after a year is possible (especially if the battery is actually older and sat on the shelf for some time before the server was assembled; Li-Ion batteries lose capacity over time even when they are not in use).

You can try to get more information about the battery state: download the Command Line Tool appropriate for your OS from Intel Download Center, then run the following command:

CmdTool2 -AdpBbuCmd -aALL

It should output a lot of information about the battery state (the level of detail probably depends on the controller model). In addition to obvious things like the date of manufacture and “Full Charge Capacity” (measured during battery relearn cycles) compared to “Design Capacity” (which a brand-new battery should have), check the battery temperature: although the specified operating range is up to 45°C, running close to that maximum greatly shortens battery life.
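If you want to run these checks regularly, something like the following Python sketch could do the comparison for you. It is only an illustration: the field labels ("Temperature", "Full Charge Capacity", "Design Capacity") and their units are assumptions about what CmdTool2 prints on your controller, the sample text is made up, and the 80% capacity threshold and 5°C warning margin are arbitrary values chosen for the example. Only the 45°C limit comes from the specification mentioned above.

```python
import re

# ASSUMPTION: these labels and units match your CmdTool2 output; adjust if not.
FIELDS = {
    "temperature_c": r"^\s*Temperature\s*:\s*(\d+)\s*C",
    "full_charge_mah": r"^\s*Full Charge Capacity\s*:\s*(\d+)\s*mAh",
    "design_mah": r"^\s*Design Capacity\s*:\s*(\d+)\s*mAh",
}


def parse_bbu(text):
    """Extract the numeric fields listed in FIELDS; missing fields are skipped."""
    found = {}
    for name, pattern in FIELDS.items():
        match = re.search(pattern, text, re.MULTILINE)
        if match:
            found[name] = int(match.group(1))
    return found


def assess(info, max_temp_c=45, warn_margin_c=5, min_capacity_ratio=0.8):
    """Return a list of warning strings; thresholds are illustrative defaults."""
    warnings = []
    if "full_charge_mah" in info and info.get("design_mah"):
        ratio = info["full_charge_mah"] / info["design_mah"]
        if ratio < min_capacity_ratio:
            warnings.append(f"full charge capacity is {ratio:.0%} of design capacity")
    temp = info.get("temperature_c")
    if temp is not None and temp >= max_temp_c - warn_margin_c:
        warnings.append(f"temperature {temp} C is within {warn_margin_c} C of the {max_temp_c} C limit")
    return warnings


# Made-up sample text, not real controller output.
SAMPLE = """\
Temperature: 42 C
Full Charge Capacity: 900 mAh
Design Capacity: 1350 mAh
"""

for warning in assess(parse_bbu(SAMPLE)):
    print("WARNING:", warning)
```

In practice you would feed `parse_bbu` the captured output of the command above instead of `SAMPLE`.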

You may also be able to obtain at least parts of the detailed battery information from GUI management utilities you might have already installed on the server.

Solution 2

Doesn't sound real healthy. The monthly battery relearn cycles are probably OK (although a server slaughtering performance at inopportune moments isn't a real win), but if it's happening more often than that, it suggests the battery is getting flaky.
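To put a number on "more often", you could collect the timestamps of the "BBU disabled" events from the controller log and flag any that arrive well before the next monthly relearn is due. The Python sketch below assumes you have already extracted the timestamps yourself; the dates are hypothetical, and the 30-day period and 5-day tolerance are assumptions for illustration, not values taken from the controller.

```python
from datetime import datetime, timedelta


def short_gaps(events, expected=timedelta(days=30), tolerance=timedelta(days=5)):
    """Return (previous, current) pairs closer together than expected - tolerance."""
    ordered = sorted(events)
    threshold = expected - tolerance
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a < threshold]


# Hypothetical "BBU disabled" timestamps, not taken from a real log.
events = [
    datetime(2011, 1, 3, 14, 0),
    datetime(2011, 2, 2, 14, 0),
    datetime(2011, 2, 4, 14, 0),
    datetime(2011, 2, 7, 14, 0),
]

for previous, current in short_gaps(events):
    gap_days = (current - previous).days
    print(f"{current:%Y-%m-%d}: only {gap_days} days after the previous event")
```

With these sample dates, the first gap of 30 days passes as a normal monthly cycle, while the 2-day and 3-day gaps are reported.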

Author: Erwin Blonk

Updated on September 18, 2022

Comments

  • Erwin Blonk
    Erwin Blonk almost 2 years

    Until recently, my Intel RAID controller (SROMBSASMR) logged "BBU disabled: changing WB logical drives to WT" once a month, followed about 2 1/2 hours later by "Battery relearn complete".

    For a little over 2 weeks now, "BBU disabled" has also been appearing outside of this cycle, in a steady pattern every 2 or 3 days.*

    I'm wondering what this means. Should I replace the battery? Is the controller about to fail?

    For the record: I do know what the BBU disabled and relearn messages in themselves mean.

    *To be precise: 3 times, spaced 2 and 3 days apart, with this cycle in turn repeating every 8 days. I expect the next occurrence tomorrow in the early afternoon, at roughly 2 PM.

    • the-wabbit
      the-wabbit almost 12 years
      When does it re-enable? Does it re-enable automatically at all?
    • the-wabbit
      the-wabbit almost 12 years
      Oh, and take a look at the Intel RAID Web Console and check the AutoLearn Period and charge state values.
  • Erwin Blonk
    Erwin Blonk almost 12 years
    That is where my thoughts are going. The server and all its hardware are a little over a year old and have been in production for 9 months. I have no idea about the lifespan of the battery, and it could have been non-optimal to begin with. The thing is that this server cannot, for practical reasons, easily be scheduled for downtime. Replacing the battery just to be sure isn't an option, so I want to be as sure as possible that this is it.
  • Mat
    Mat almost 12 years
    @EBV2010: what do you prefer: planned downtime with no data loss (but a remote possibility that the problem is not fixed), or unplanned downtime with potential data loss?
  • womble
    womble almost 12 years
    @Mat: It's only a risk of unplanned downtime, but a guaranteed planned downtime. <grin>
  • the-wabbit
    the-wabbit almost 12 years
    @Mat: it is also not a question of downtime with data loss, only of reduced I/O performance due to disabled write-back caches.
  • Erwin Blonk
    Erwin Blonk almost 12 years
    All signs seem to be green: CmdTool2 basically says everything is OK, fully charged, 97% capacity. The manufacture date says 1/10/2011, but that could mean Jan. 10th or Oct. 1st. Given how long this server has been around (since before my time), I'd say Jan. 10th. Still, I'll keep an eye on it and schedule downtime at some point to deal with some other issues as well.