Friday, 3 January 2014

FMADM FAULTY FOR PROCESSOR

TEST # fmadm faulty
--------------- ------------------------------------  -------------- ---------
TIME            EVENT-ID                              MSG-ID         SEVERITY
--------------- ------------------------------------  -------------- ---------
Feb 20 14:41:13 350e799e-27c7-6621-d897-dbab2fbe0efb  SUN4U-8001-0J  Major
Host        : p10app04
Platform    : SUNW,Sun-Fire-V490        Chassis_id  :
Product_sn  :
Fault class : fault.cpu.ultraSPARC-IVplus.l3cachedata
Affects     : cpu:///cpuid=2/serial=80020228CE6C1347
              cpu:///cpuid=18/serial=80020228CE6C1347
                  faulted but still in service
FRU         : "Slot A" (hc://:product-id=SUNW,Sun-Fire-V490:server-id=p10app04/c
                  faulty
Description : The number of errors associated with this CPU has exceeded
              acceptable levels.  Refer to http://sun.com/msg/SUN4U-8001-0J for
              more information.
Response    : The fault manager will attempt to remove the affected CPU from
              service.
Impact      : System performance may be affected.
Action      : Schedule a repair procedure to replace the affected CPU, the
              identity of which can be determined using fmdump -v -u
              <EVENT_ID>.


TEST # fmdump -v -u 350e799e-27c7-6621-d897-dbab2fbe0efb
TIME                 UUID                                 SUNW-MSG-ID
Feb 20 14:41:14.2577 350e799e-27c7-6621-d897-dbab2fbe0efb SUN4U-8001-0J
  100%  fault.cpu.ultraSPARC-IVplus.l3cachedata
        Problem in: -
           Affects: cpu:///cpuid=2/serial=80020228CE6C1347
               FRU: hc://:product-id=SUNW,Sun-Fire-V490:server-id=p10app04/component=Slot A
          Location: -
  100%  fault.cpu.ultraSPARC-IVplus.l3cachedata
        Problem in: -
           Affects: cpu:///cpuid=18/serial=80020228CE6C1347
               FRU: hc://:product-id=SUNW,Sun-Fire-V490:server-id=p10app04/component=Slot A
          Location: -
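
The CPU IDs named on the "Affects:" lines are the ones passed to psradm later in the session. A minimal sketch for pulling them out of the fmdump -v output, assuming the "Affects: cpu:///cpuid=N/..." layout shown above; the function name affected_cpuids is hypothetical and not part of Solaris:

```shell
# Print the cpuid from every "Affects: cpu:///cpuid=N/..." line on stdin.
# affected_cpuids is an illustrative name, not a system command.
affected_cpuids() {
    sed -n 's|.*Affects: cpu:///cpuid=\([0-9][0-9]*\)/.*|\1|p'
}

# Usage on a live system:
#   fmdump -v -u <EVENT_ID> | affected_cpuids
```

Lines that do not carry an "Affects:" entry are ignored, so the full report can be piped in unchanged.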

TEST # psrinfo
0       on-line   since 02/20/2013 14:40:34
1       on-line   since 02/20/2013 14:40:34
2       faulted   since 02/20/2013 14:41:14
3       on-line   since 02/20/2013 14:40:32
16      on-line   since 02/20/2013 14:40:34
17      on-line   since 02/20/2013 14:40:34
18      faulted   since 02/20/2013 14:41:14
19      on-line   since 02/20/2013 14:40:34
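
The faulted processors can also be filtered out of the psrinfo listing rather than read off by eye. A minimal sketch, assuming the column layout shown above (processor ID, state, then "since ..."); the function name faulted_cpus is hypothetical:

```shell
# Print the IDs of processors whose psrinfo state column reads "faulted".
# Reads psrinfo-style output on stdin; faulted_cpus is an illustrative name.
faulted_cpus() {
    awk '$2 == "faulted" { print $1 }'
}

# Usage on a live system:
#   psrinfo | faulted_cpus
```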

TEST # psrinfo -p
4
TEST # psrinfo |wc -l
       8
TEST # psradm -f 2
psradm: processor 2 in faulted state; add -F option to force change

TEST # psradm -f -F 2
TEST # psradm -f -F 18
TEST # psrinfo
0       on-line   since 02/20/2013 14:40:34
1       on-line   since 02/20/2013 14:40:34
2       off-line  since 02/20/2013 19:54:54
3       on-line   since 02/20/2013 14:40:32
16      on-line   since 02/20/2013 14:40:34
17      on-line   since 02/20/2013 14:40:34
18      off-line  since 02/20/2013 19:55:18
19      on-line   since 02/20/2013 14:40:34
TEST # fmadm repaired 350e799e-27c7-6621-d897-dbab2fbe0efb
fmadm: failed to record repair to 350e799e-27c7-6621-d897-dbab2fbe0efb: specified resource is not known to be faulty
TEST # >errlog
TEST # >fltlog
TEST # cd rsrc
TEST # ls
479acae3-52ce-41be-92fb-ae8517bf4657  c784c664-1ea7-c332-814d-f76e939c2db3
TEST # file *
479acae3-52ce-41be-92fb-ae8517bf4657:  extended accounting file
c784c664-1ea7-c332-814d-f76e939c2db3:  extended accounting file
TEST # rm *
TEST # pwd
/var/fm/fmd/rsrc
TEST # cd ..
TEST # ls
errlog  fltlog  rsrc    xprt
TEST # svcadm restart fmd
TEST # fmadm faulty
TEST # psrinfo
0    on-line   since 02/20/2013 14:40:34
1    on-line   since 02/20/2013 14:40:34
2    on-line   since 02/20/2013 19:59:36
3    on-line   since 02/20/2013 14:40:32
16   on-line   since 02/20/2013 14:40:34
17   on-line   since 02/20/2013 14:40:34
18   on-line   since 02/20/2013 19:59:54
19   on-line   since 02/20/2013 14:40:34
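
The session above boils down to a short command sequence: force the faulted CPUs off-line, empty the fmd logs and resource cache under /var/fm/fmd, and restart fmd. The sketch below only prints that sequence for review and does not run anything. The function name print_fault_clear_plan is hypothetical, and the absolute paths are assumed from the pwd and ls output in the transcript. The fault report itself still recommends replacing the affected CPU; emptying the logs removes only the record of the fault.

```shell
# Print (do not execute) the commands used in the session above
# for the CPU IDs given as arguments.
# print_fault_clear_plan is an illustrative name, not a system command.
print_fault_clear_plan() {
    fmd_dir=/var/fm/fmd   # assumed from the pwd/ls output in the transcript
    for cpu in "$@"; do
        echo "psradm -f -F $cpu"      # force the faulted CPU off-line
    done
    echo "> $fmd_dir/errlog"          # truncate the error log
    echo "> $fmd_dir/fltlog"          # truncate the fault log
    echo "rm $fmd_dir/rsrc/*"         # drop the cached resource state
    echo "svcadm restart fmd"         # restart the fault manager daemon
}

# Usage:
#   print_fault_clear_plan 2 18
```

In the transcript, fmadm faulty returned nothing after the restart and both CPUs were reported on-line again.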
