OT: Linux system admins

I am running CentOS 5.3.

I am trying to figure out why my system restarts itself on random days.

Here’s something from the messages log:

Sep  6 04:03:02 GEODEV syslogd 1.4.1: restart.
Sep  6 10:15:55 GEODEV dhclient: DHCPREQUEST on eth0 to 192.168.1.1 port 67
Sep  6 10:15:55 GEODEV dhclient: DHCPACK from 192.168.1.1
Sep  6 10:15:55 GEODEV dhclient: bound to 192.168.1.121 -- renewal in 247577 seconds.
Sep  7 03:16:15 GEODEV avahi-daemon[2485]: Invalid query packet.
Sep  7 03:40:04 GEODEV last message repeated 3 times
Sep  7 06:41:47 GEODEV last message repeated 3 times
Sep  7 08:51:33 GEODEV last message repeated 3 times
Sep  7 09:22:38 GEODEV last message repeated 3 times
Sep  7 10:44:46 GEODEV last message repeated 3 times
Sep  7 11:20:51 GEODEV last message repeated 3 times
Sep  7 12:19:49 GEODEV last message repeated 3 times
Sep  8 01:23:11 GEODEV last message repeated 3 times
Sep  8 02:39:42 GEODEV last message repeated 3 times
Sep  8 03:09:47 GEODEV last message repeated 3 times
Sep  8 03:11:14 GEODEV last message repeated 3 times
Sep  8 03:41:27 GEODEV last message repeated 3 times
Sep  8 04:11:32 GEODEV last message repeated 3 times
Sep  8 04:41:37 GEODEV last message repeated 3 times
Sep  8 05:11:42 GEODEV last message repeated 2 times
Sep  8 05:19:07 GEODEV last message repeated 3 times
Sep  8 05:32:54 GEODEV last message repeated 3 times
Sep  8 06:11:22 GEODEV last message repeated 3 times
Sep  8 07:09:05 GEODEV last message repeated 3 times
Sep  8 07:36:13 GEODEV last message repeated 3 times
Sep  8 07:41:03 GEODEV last message repeated 3 times
Sep  8 08:13:21 GEODEV last message repeated 3 times
Sep  8 08:43:31 GEODEV last message repeated 3 times
Sep  8 09:10:23 GEODEV last message repeated 3 times
Sep  8 09:25:38 GEODEV last message repeated 3 times
Sep  8 10:58:03 GEODEV last message repeated 3 times
Sep  8 11:21:12 GEODEV last message repeated 3 times
Sep  8 11:49:26 GEODEV last message repeated 3 times
Sep  8 12:03:14 GEODEV last message repeated 3 times
Sep  8 12:03:14 GEODEV last message repeated 2 times
Sep  9 07:02:12 GEODEV dhclient: DHCPREQUEST on eth0 to 192.168.1.1 port 67
Sep  9 07:02:12 GEODEV dhclient: DHCPACK from 192.168.1.1
Sep  9 07:02:12 GEODEV dhclient: bound to 192.168.1.121 -- renewal in 289088 seconds.
Sep 10 21:04:10 GEODEV syslogd 1.4.1: restart.
Sep 10 21:04:10 GEODEV kernel: klogd 1.4.1, log source = /proc/kmsg started.
Sep 10 21:04:10 GEODEV kernel: Linux version 2.6.18-128.el5 () (gcc version 4.1.2 20080704 (Red Hat 4.1.2-44)) #1 SMP Wed Jan 21 10:44:23 EST 2009

Why do I see these?

Sep 10 21:04:10 GEODEV syslogd 1.4.1: restart.

I know for a fact no one has restarted this server; it’s restarting by itself. It’s not overheating. Does it restart when it fails to retrieve a packet or something?

Nothing suspicious in there imo. You could try disabling the avahi shit just for fun, unless you use it.

Server is rotating your logs. Upon rotation syslogd is restarted. Nothing to worry about.

man logrotate
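
One rough way to tell the two cases apart in /var/log/messages: a syslogd restart caused by log rotation stands on its own, while a real boot is followed right away by kernel startup lines, like the klogd and "Linux version" lines in the excerpt above. A minimal sketch, assuming the sysklogd message format shown in this thread; the function name classify_restarts is made up for illustration:

```shell
# classify_restarts: read syslog lines on stdin and label each
# "syslogd ... restart." line as BOOT or ROTATE.
# Assumption: a real boot logs a "kernel:" line directly after the restart
# line, as in the excerpt above; anything else is treated as a rotation.
classify_restarts() {
    awk '
        pending != "" {
            # Decide the previous restart line using the line that follows it.
            if ($0 ~ / kernel: /) print "BOOT   " pending
            else                  print "ROTATE " pending
            pending = ""
        }
        / syslogd [0-9.]+: restart\./ { pending = $1 " " $2 " " $3 }
        END { if (pending != "") print "ROTATE " pending }
    '
}
```

Something like "classify_restarts < /var/log/messages" would then list each restart line with a guess at its cause. It is only a heuristic, so treat a ROTATE label as "probably", not "certainly".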

Yeah, a real system restart has a lot more shit in it.. hardware detection, all your other services starting, etc.

DHCP timeout and refresh is normal too.

Well, I left the hardware detection out; I figured it wasn’t needed. Does CentOS do automatic system updates?

Depends on how you have your system set up. By default, no *nix system will update anything automatically.

Use yum.

Where would I look for failures? I tried opening the faillog, but I get a bunch of "@" symbols.

Check your uptime and see if it actually is rebooting or if the log is just rotating. It is a strange time for logs to rotate, though.

CentOS doesn’t reboot on its own, no matter what you have going on, unless you tell it to. From the command line, type

last

and see how many times it’s actually rebooting. Then type dmesg | more and look through the output to see if you have any startup errors.
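
For a quick tally instead of eyeballing the list: in "last" output, a session that ends in "down" was closed by a clean shutdown, while one that ends in "crash" was cut off with no logout record before the next boot. A small sketch that counts both, assuming the column layout of the sysvinit "last" shipped with CentOS 5; summarize_last is a hypothetical helper name:

```shell
# summarize_last: read `last` output on stdin and count boots and how
# login sessions ended.
# Assumption: lines look like the sysvinit `last` format, e.g.
#   "user  pts/1  host  Thu Sep  3 12:00 - crash (7+09:04)".
summarize_last() {
    awk '
        $1 == "reboot" && $2 == "system" { boots++ }
        / - crash / { crash++ }   # session cut off without a logout record
        / - down /  { down++ }    # session ended by a clean shutdown
        END { printf "boots=%d crash=%d down=%d\n", boots, crash, down }
    '
}
```

Run it as "last | summarize_last". A high crash count relative to boots points at unclean restarts (power loss, hang, panic) rather than someone typing reboot.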

Last login: Tue Sep 15 08:14:36 2009 from ge0laptop
[ge0@GEODEV ~]$ last
ge0   pts/1        ge0laptop     Tue Sep 15 19:38   still logged in
ge0   pts/1        ge0laptop     Tue Sep 15 08:14 - 09:10  (00:56)
ge0   pts/1        ge0laptop     Mon Sep 14 09:10 - 09:39  (00:29)
ge0   :0                            Thu Sep 10 21:07   still logged in
ge0   :0                            Thu Sep 10 21:07 - 21:07  (00:00)
reboot   system boot  2.6.18-128.el5   Thu Sep 10 21:04         (4+22:34)
ge0   pts/2        ge0laptop     Thu Sep 10 08:52 - 15:09  (06:16)
ge0   pts/2        ge0laptop     Tue Sep  8 20:01 - 20:02  (00:01)
ge0   pts/2        ge0laptop     Mon Sep  7 13:18 - 15:30  (02:11)
ge0   pts/2        geomachine       Mon Sep  7 10:29 - 10:29  (00:00)
ge0   pts/2        geomachine       Sun Sep  6 12:08 - 12:09  (00:00)
ge0   pts/2        :0.0             Sat Sep  5 08:11 - 08:11  (00:00)
ge0   pts/2        geomachine       Fri Sep  4 10:06 - 10:08  (00:01)
ge0   pts/1        :0.0             Thu Sep  3 12:00 - crash (7+09:04)
ge0   :0                            Thu Sep  3 11:54 - crash (7+09:09)
ge0   :0                            Thu Sep  3 11:54 - 11:54  (00:00)
reboot   system boot  2.6.18-128.el5   Thu Sep  3 11:50         (12+07:47)
ge0   pts/1        geomachine       Sun Aug 16 17:28 - down   (00:30)
ge0   :0                            Sun Aug 16 17:24 - down   (00:34)
ge0   :0                            Sun Aug 16 17:24 - 17:24  (00:00)
reboot   system boot  2.6.18-128.el5   Sun Aug 16 17:20          (00:38)
ge0   pts/1        192.168.1.139    Sun Aug 16 10:08 - crash  (07:12)
ge0   pts/1        192.168.1.139    Sat Aug 15 11:54 - 12:04  (00:10)
ge0   :0                            Sat Aug 15 11:53 - crash (1+05:27)
ge0   :0                            Sat Aug 15 11:53 - 11:53  (00:00)
reboot   system boot  2.6.18-128.el5   Sat Aug 15 11:53         (1+06:05)
ge0   pts/2        192.168.1.139    Sat Aug 15 12:03 - down   (00:58)
ge0   pts/1        192.168.1.139    Sat Aug 15 11:41 - down   (01:20)
ge0   :0                            Sat Aug 15 11:02 - down   (01:59)
ge0   :0                            Sat Aug 15 11:02 - 11:02  (00:00)
reboot   system boot  2.6.18-128.el5   Sat Aug 15 11:01          (02:00)
ge0   pts/1        192.168.1.139    Thu Aug 13 21:52 - crash (1+13:08)
ge0   :0                            Thu Aug 13 19:19 - crash (1+15:42)
ge0   :0                            Thu Aug 13 19:19 - 19:19  (00:00)
reboot   system boot  2.6.18-128.el5   Thu Aug 13 19:18         (1+17:42)
ge0   pts/1        192.168.1.139    Wed Aug 12 23:24 - 00:05  (00:40)
ge0   :0                            Wed Aug 12 23:20 - crash  (19:58)
ge0   :0                            Wed Aug 12 23:20 - 23:20  (00:00)
reboot   system boot  2.6.18-128.el5   Wed Aug 12 23:19         (2+13:41)
ge0   pts/2        192.168.1.139    Wed Aug 12 23:12 - down   (00:06)
ge0   pts/1        192.168.1.139    Wed Aug 12 22:52 - down   (00:25)
ge0   pts/1        192.168.1.139    Wed Aug 12 22:15 - 22:52  (00:37)
ge0   pts/1        192.168.1.139    Wed Aug 12 20:10 - 20:32  (00:22)
ge0   :0                            Wed Aug 12 20:07 - down   (03:11)
ge0   :0                            Wed Aug 12 20:07 - 20:07  (00:00)
reboot   system boot  2.6.18-128.el5   Wed Aug 12 20:06          (03:11)
ge0   pts/1        192.168.1.139    Mon Aug 10 20:23 - 20:23  (00:00)
ge0   pts/1        192.168.1.139    Sat Aug  8 15:20 - 18:31  (03:11)
ge0   :0                            Sat Aug  8 14:57 - crash (4+05:09)
ge0   :0                            Sat Aug  8 14:57 - 14:57  (00:00)
reboot   system boot  2.6.18-128.el5   Sat Aug  8 14:57         (4+08:21)
ge0   pts/1        192.168.1.139    Wed Aug  5 20:06 - crash (2+18:50)
ge0   :0                            Tue Aug  4 19:40 - crash (3+19:16)
ge0   :0                            Tue Aug  4 19:40 - 19:40  (00:00)
reboot   system boot  2.6.18-128.el5   Tue Aug  4 19:39         (8+03:38)
ge0   pts/2        192.168.1.129    Tue Aug  4 19:32 - 19:38  (00:06)
ge0   pts/2        192.168.1.129    Tue Aug  4 19:23 - 19:24  (00:00)
ge0   pts/1        192.168.1.139    Mon Aug  3 00:32 - down  (1+19:05)
ge0   :0                            Mon Aug  3 00:28 - down  (1+19:09)
ge0   :0                            Mon Aug  3 00:28 - 00:28  (00:00)
reboot   system boot  2.6.18-128.el5   Mon Aug  3 00:28         (1+19:10)
ge0   pts/1        192.168.1.139    Sun Aug  2 15:38 - down   (00:02)
ge0   pts/2        :0.0             Sun Aug  2 15:36 - 15:38  (00:01)
ge0   pts/1        :0.0             Sat Aug  1 22:28 - 15:38  (17:09)
ge0   :0                            Sat Aug  1 21:10 - down   (18:30)
ge0   :0                            Sat Aug  1 21:10 - 21:10  (00:00)
reboot   system boot  2.6.18-128.el5   Sat Aug  1 21:09          (18:31)
ge0   :0                            Sat Aug  1 19:39 - crash  (01:30)
ge0   :0                            Sat Aug  1 19:39 - 19:39  (00:00)
reboot   system boot  2.6.18-128.el5   Sat Aug  1 19:37          (20:03)
ge0   :0                            Sat Aug  1 19:18 - 19:22  (00:03)
ge0   :0                            Sat Aug  1 19:18 - 19:18  (00:00)
reboot   system boot  2.6.18-128.el5   Sat Aug  1 19:18          (00:03)
ge0   pts/1        192.168.1.139    Sat Aug  1 18:21 - down   (00:04)
ge0   :0                            Sat Aug  1 18:20 - down   (00:05)
ge0   :0                            Sat Aug  1 18:20 - 18:20  (00:00)
reboot   system boot  2.6.18-128.el5   Sat Aug  1 18:19          (00:05)
ge0   pts/1        :0.0             Sat Aug  1 18:11 - 18:18  (00:06)
ge0   :0                            Sat Aug  1 18:06 - 18:18  (00:11)
ge0   :0                            Sat Aug  1 18:06 - 18:06  (00:00)
reboot   system boot  2.6.18-128.el5   Sat Aug  1 18:04          (00:13)
ge0   pts/1        :0.0             Sat Aug  1 17:51 - 17:54  (00:02)
ge0   :0                            Sat Aug  1 17:51 - 17:54  (00:03)
ge0   :0                            Sat Aug  1 17:51 - 17:51  (00:00)
reboot   system boot  2.6.18-128.el5   Sat Aug  1 17:50          (00:03)
ge0   pts/4        :2.0             Sat Aug  1 17:07 - 17:09  (00:01)
ge0   pts/3        :1.0             Sat Aug  1 17:04 - 17:09  (00:04)
ge0   pts/2        192.168.1.139    Sat Aug  1 16:58 - down   (00:11)
ge0   pts/1        :0.0             Sat Aug  1 16:48 - 17:09  (00:21)
ge0   pts/0        :0.0             Sat Aug  1 16:33 - 17:09  (00:35)
ge0   :0                            Sat Aug  1 16:09 - 17:09  (00:59)
ge0   :0                            Sat Aug  1 16:09 - 16:09  (00:00)
reboot   system boot  2.6.18-128.el5   Sat Aug  1 16:07          (01:01)
ge0   :0                            Sat Aug  1 16:04 - 16:05  (00:01)
ge0   :0                            Sat Aug  1 16:04 - 16:04  (00:00)
ge0   pts/1        :0.0             Sat Aug  1 16:01 - 16:03  (00:02)
ge0   :0                            Sat Aug  1 15:53 - 16:03  (00:10)
ge0   :0                            Sat Aug  1 15:53 - 15:53  (00:00)
reboot   system boot  2.6.18-128.el5   Sat Aug  1 15:52          (00:12)
reboot   system boot  2.6.18-128.el5   Sat Aug  1 15:41          (00:09)

I didn’t find anything suspicious in the dmesg output.

Edit: read that wrong. Is this a server/desktop you own (colo), or is it dedicated?

A Dell PowerEdge I own, sitting in my basement.

If you can’t read "last" output, run ‘uptime’ and it’ll tell you exactly how long your system has been running WITHOUT A REBOOT OR CRASH!

[supergeek@jupiter ~]$ uptime
11:26:39 up 50 days, 19:33, 1 user, load average: 0.00, 0.00, 0.00

[root@GEODEV samba]# uptime
 07:45:22 up 6 days, 10:42,  2 users,  load average: 0.00, 0.00, 0.00
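
The uptime figure can be turned back into a boot time and checked against the "reboot" lines from "last". Here, 07:45 minus 10 hours 42 minutes is about 21:03 the previous evening, and six days before that gives a boot at roughly 21:03, which lines up with the 21:04 time of day on the Sep 10 reboot entry. A minimal sketch of the same arithmetic, assuming a Linux /proc/uptime; boot_time_epoch is an illustrative name:

```shell
# boot_time_epoch NOW_EPOCH UPTIME_SECONDS
# Print the epoch second at which the machine booted.
# UPTIME_SECONDS may carry a fractional part (as in /proc/uptime); it is dropped.
boot_time_epoch() {
    echo $(( $1 - ${2%%.*} ))
}

# On a live Linux system (assumes /proc/uptime is available):
#   boot_time_epoch "$(date +%s)" "$(cut -d' ' -f1 /proc/uptime)"
```

With GNU date, the result can be made readable with "date -d @EPOCH".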

george   pts/1        geomanlaptop     Sat Sep 19 07:13   still logged in
george   :0                            Fri Sep 18 22:21   still logged in
george   :0                            Fri Sep 18 22:21 - 22:21  (00:00)
reboot   system boot  2.6.18-128.el5   Fri Sep 18 22:18          (10:15)
george   pts/1        geomanlaptop     Wed Sep 16 08:39 - crash (2+13:38)

Why would it restart itself after a week, though?

Shitty power? CentOS doesn’t just restart on its own.

Here’s what’s weird: my other PowerEdge has been going strong for 2 months without a restart, plugged into the same outlet.

Weird. I will look into that power problem some more.

You can probably change some syslog settings and make the logs more verbose…

This, forgot about that. Your system is restarting on its own for some other reason: probably hardware, or someone else has access to your system and is exploiting it. Unlike Windows, Linux won’t restart on its own without a manual restart or a power trip.

I would set up your logging to get more detailed output, i.e. a higher debugging level. If it helps, move your old logs off somewhere and create new ones so you get a fresh start. Rule out hardware failure using standard diagnostic testing (memory and drive testing/benchmarks). Power supplies can be tested too, but you need a testing tool to see whether one is putting out the right voltage.

Process of elimination at this point. Nothing comes to mind that would cause it except an electrical issue.

Electrical it was. Had them on an older power surge thing. Took them off that and it’s been fine for a month.

Thanks, guys.

some dude at your hosting company is fucking with you

capture the output of dmesg as well

# dmesg > sometextfileillreadlater.txt

or

# dmesg | less

and page through it

it frequently will have messages if you’re having hardware problems
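
A quick filter can save some paging. This is a sketch only: the keyword list below is an illustrative guess at strings that often accompany hardware trouble, not a complete or authoritative set, and scan_kernel_log is a made-up name. Keep in mind that dmesg only covers the current boot, so anything logged before a crash would have to come from /var/log/messages, if it was written at all.

```shell
# scan_kernel_log: read kernel log text on stdin and print lines that often
# accompany hardware trouble.
# Assumption: the keyword list is a hand-picked example, tune it as needed.
scan_kernel_log() {
    # "|| true" keeps the exit status at 0 when nothing matches.
    grep -iE 'machine check|hardware error|i/o error|out of memory|kernel panic|oops:' || true
}
```

Usage would be "dmesg | scan_kernel_log" for the current boot, or "scan_kernel_log < /var/log/messages" for older entries. No output means none of the keywords matched, not that the hardware is healthy.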

I am the hosting company. I’ve got a few PowerEdges running in my basement.

So you’re fucking with yourself? Or your cat/dog/spouse?

pretty much