Thursday, April 24th 2014, 5:40pm UTC+2

stiftmaster

Beginner

Posts: 40

Location: Kassel

Occupation: Sysadmin

Number of monitoring servers: 1

Nagios Version: none

Icinga Version: Icinga 1.9.dev

Distributed monitoring: No

Redundant monitoring: No

Number of hosts: 120

Number of services: 2204

OS: Debian (wheezy) 7

Plugin Version: 1.4.16-1

Other Addons: SNMPTT, PNP4Nagios 0.6.19, sendsms + GSM module

1

Tuesday, July 17th 2012, 12:54pm

[SOLVED] illegal attempt to update using time

Hello fellow monitorers,
I have searched the entire Internet for a solution but have not really found anything.

PNP worked perfectly until I updated to Icinga 1.7.1 last week.

perfdata.log

Source code

RRDs::update ERROR /usr/local/pnp4nagios/var/perfdata/azs-dt100-02/Network_Ping.rrd: illegal attempt to update using time 1342521488 when last update time is 1342521488 (minimum one second step)


As far as I understand, this means two checks at the same time.

There are always about 1-2 of these candidates per minute, sometimes Ping, sometimes CPU Load.

But I have not changed anything apart from the Icinga update. Out of desperation I then also updated RRDtool and PNP.

Here is the output of the verify script:

Source code

[INFO]  ========== Starting Environment Checks ============
[INFO]  My version is: pnp4nagios-head
[INFO]  Reading /usr/local/icinga/etc/icinga.cfg
[OK  ]  Running product is 'icinga'
[OK  ]  object_cache_file is defined
[OK  ]  object_cache_file=/usr/local/icinga/var/objects.cache
[INFO]  Reading /usr/local/icinga/var/objects.cache
[OK  ]  resource_file is defined
[OK  ]  resource_file=/usr/local/icinga/etc/resource.cfg
[INFO]  Reading /usr/local/icinga/etc/resource.cfg
[INFO]  Reading /usr/local/pnp4nagios/etc//process_perfdata.cfg
[INFO]  Reading /usr/local/pnp4nagios/etc//pnp4nagios_release
[OK  ]  Found PNP4Nagios version "0.6.18"
[OK  ]  ./configure Options '--with-nagios-user=icinga' '--with-nagios-group=icinga' '--with-rrdtool=/opt/rrdtool-1.4.7/bin/rrdtool'
[OK  ]  Effective User is 'icinga'
[OK  ]  User icinga exists with ID '1001'
[OK  ]  Effective group is 'icinga'
[OK  ]  Group icinga exists with ID '1002'
[INFO]  ========== Checking Bulk Mode + NPCD Config  ============
[OK  ]  process_performance_data is 1 compared with '/1/'
[OK  ]  service_perfdata_file is defined
[OK  ]  service_perfdata_file=/usr/local/pnp4nagios/var/service-perfdata
[OK  ]  service_perfdata_file_template is defined
[OK  ]  service_perfdata_file_template=DATATYPE::SERVICEPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tSERVICEDESC::$SERVICEDESC$\tSERVICEPERFDATA::$SERVICEPERFDATA$\tSERVICECHECKCOMMAND::$SERVICECHECKCOMMAND$\tHOSTSTATE::$HOSTSTATE$\tHOSTSTATETYPE::$HOSTSTATETYPE$\tSERVICESTATE::$SERVICESTATE$\tSERVICESTATETYPE::$SERVICESTATETYPE$
[OK  ]  PERFDATA template looks good
[OK  ]  service_perfdata_file_mode is defined
[OK  ]  service_perfdata_file_mode=a
[OK  ]  service_perfdata_file_processing_interval is defined
[OK  ]  service_perfdata_file_processing_interval=15
[OK  ]  service_perfdata_file_processing_command is defined
[OK  ]  service_perfdata_file_processing_command=process-service-perfdata-file
[OK  ]  host_perfdata_file is defined
[OK  ]  host_perfdata_file=/usr/local/pnp4nagios/var/host-perfdata
[OK  ]  host_perfdata_file_template is defined
[OK  ]  host_perfdata_file_template=DATATYPE::HOSTPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tHOSTPERFDATA::$HOSTPERFDATA$\tHOSTCHECKCOMMAND::$HOSTCHECKCOMMAND$\tHOSTSTATE::$HOSTSTATE$\tHOSTSTATETYPE::$HOSTSTATETYPE$
[OK  ]  PERFDATA template looks good
[OK  ]  host_perfdata_file_mode is defined
[OK  ]  host_perfdata_file_mode=a
[OK  ]  host_perfdata_file_processing_interval is defined
[OK  ]  host_perfdata_file_processing_interval=15
[OK  ]  host_perfdata_file_processing_command is defined
[OK  ]  host_perfdata_file_processing_command=process-host-perfdata-file
[INFO]  Icinga config looks good so far
[INFO]  ========== Checking config values ============
[OK  ]  npcd daemon is running
[OK  ]  /usr/local/pnp4nagios/etc/npcd.cfg is used by npcd and readable
[INFO]  Reading /usr/local/pnp4nagios/etc/npcd.cfg
[OK  ]  perfdata_spool_dir is defined
[OK  ]  perfdata_spool_dir=/usr/local/pnp4nagios/var/spool/
[OK  ]  Command process-service-perfdata-file is defined
[OK  ]  '/bin/mv /usr/local/pnp4nagios/var/service-perfdata /usr/local/pnp4nagios/var/spool/service-perfdata.$TIMET$'
[OK  ]  Command looks good
[OK  ]  Command process-host-perfdata-file is defined
[OK  ]  '/bin/mv /usr/local/pnp4nagios/var/host-perfdata /usr/local/pnp4nagios/var/spool/host-perfdata.$TIMET$'
[OK  ]  Command looks good
[OK  ]  Script /usr/local/pnp4nagios/libexec/process_perfdata.pl is executable
[INFO]  ========== Starting global checks ============
[OK  ]  status_file is defined
[OK  ]  status_file=/usr/local/icinga/var/status.dat
[INFO]  Reading /usr/local/icinga/var/status.dat
[INFO]  ==== Starting rrdtool checks ====
[OK  ]  RRDTOOL is defined
[OK  ]  RRDTOOL=/opt/rrdtool-1.4.7/bin/rrdtool
[OK  ]  /opt/rrdtool-1.4.7/bin/rrdtool is executable
[OK  ]  RRDtool 1.4.7  Copyright 1997-2012 by Tobias Oetiker <tobi@oetiker.ch>
[OK  ]  USE_RRDs is defined
[OK  ]  USE_RRDs=1
[OK  ]  Perl RRDs modules are loadable
[INFO]  ==== Starting directory checks ====
[OK  ]  RRDPATH is defined
[OK  ]  RRDPATH=/usr/local/pnp4nagios/var/perfdata
[OK  ]  Perfdata directory '/usr/local/pnp4nagios/var/perfdata' exists
[WARN]  1298 hosts/services are not providing performance data
[WARN]  'process_perf_data 1' is set for 1299 hosts/services which are not providing performance data!
[OK  ]  'process_perf_data 1' is set for 2380 of your hosts/services
[INFO]  ==== System sizing ====
[OK  ]  2379 hosts/service objects defined
[INFO]  ==== Check statistics ====
[WARN]  Warning: 2, Critical: 0
[WARN]  Checks finished...


Are there any suggestions for a solution?
./stiftmaster

This post has been edited 1 times, last edit by "stiftmaster" (Aug 29th 2012, 1:37pm)


dnsmichi

Super Moderator

Posts: 7,046

Birthday: May 30th 1983 (30)

Gender: male

Location: Nürnberg

Occupation: Application Developer at the best employer in the world @netways

Number of monitoring servers: Icinga: 2x dev, 10++ prod, Icinga2: 4x dev

Nagios Version: s/nagios/icinga/

Icinga Version: 1.11.x / 2 0.0.x / GIT next

Distributed monitoring: Yes

Redundant monitoring: Yes

Number of hosts: 1000+

Number of services: 15000+

OS: RHEL, Debian, SUSE

Plugin Version: 1.5

IDO-Version: 1.11.x / GIT MySQL/Postgresql

Other Addons: Icinga Web, PNP, check_multi, inGraph, EventDB, LConf

2

Tuesday, July 17th 2012, 1:36pm

What do these checks look like in the (debug) log?
+++ Icinga / LConf Developer +++ Application Developer at []NETWAYS> +++ Blog +++
+++ Icinga 1.11 || Icinga 2 +++ Icinga Support +++

stiftmaster

3

Tuesday, July 17th 2012, 3:12pm

icinga.debug

Source code

[1342529970.063387] [016.0] [pid=30139] ** Handling check result for service 'Processes -Total Processes' on host 'srv-vc01'...
[1342529970.063390] [016.1] [pid=30139] HOST: srv-vc01, SERVICE: Processes -Total Processes, CHECK TYPE: Active, OPTIONS: 0, SCHEDULED: Yes, RESCHEDULE: Yes, EXITED OK: Yes, RETURN CODE: 0, OUTPUT: Processes = 62,00 Processes | Processes=62,000000Processes;0,000000;0,000000;\n
[1342529970.063400] [016.1] [pid=30139] Service is OK.
[1342529970.063404] [016.1] [pid=30139] Service did not change state.
[1342529970.063410] [016.1] [pid=30139] Rescheduling next check of service at Tue Jul 17 15:00:27 2012
[1342529970.063417] [016.0] [pid=30139] Scheduling a non-forced, active check of service 'Processes -Total Processes' on host 'srv-vc01' @ Tue Jul 17 15:00:27 2012
[1342529970.063466] [016.1] [pid=30139] Checking service 'Processes -Total Processes' on host 'srv-vc01' for flapping...
[1342529970.063472] [016.1] [pid=30139] Service is not flapping (0.00% state change).
[1342529970.063476] [016.1] [pid=30139] Checking host 'srv-vc01' for flapping...
[1342529970.063479] [016.1] [pid=30139] Host is not flapping (0.00% state change).
[1342529970.063500] [016.1] [pid=30139] Deleted check result file '/usr/local/icinga/var/spool/checkresults/c6Eg2qz'


perfdata.log

Source code

2012-07-17 14:59:42 [30179] [0] RRDs::update /usr/local/pnp4nagios/var/perfdata/srv-vc01/Processes_-Total_Processes.rrd 1342529970:62.000000
2012-07-17 14:59:42 [30179] [0] RRDs::update ERROR /usr/local/pnp4nagios/var/perfdata/srv-vc01/Processes_-Total_Processes.rrd: illegal attempt to update using time 1342529970 when last update time is 1342529970 (minimum one second step)


Or which log do you mean?

dnsmichi

4

Tuesday, July 17th 2012, 4:09pm

I meant icinga.debug, but viewed over time. If you say that checks are happening within the same second, then I would like to see that over a rough interval, say 5 to 10 minutes, to spot a pattern if there is one.

stiftmaster

5

Tuesday, July 17th 2012, 4:47pm

These are the candidates in this time span:


[1342534530.076669] srv-icinga01/Network_Netstat_-_SSH_Connections
[1342534530.085126] pabx-101/Network_Ping
[1342534530.181864] srv-vc01/Processes_-Total_Processes
[1342534561.268499] srv-ns01/Memory_Free


Edited:
stiftmaster has attached the following file:
  • icinga.zip (99.03 kB - 75 times downloaded - Last download: Apr 17th 2014, 11:28pm)

This post has been edited 2 times, last edit by "stiftmaster" (Jul 17th 2012, 6:47pm)


dnsmichi

6

Tuesday, July 17th 2012, 5:34pm

I have turned it into a .zip and attached it. I don't like external URLs for logs, since they usually don't last forever. I would recommend you do the same in the future.
Now to read it in peace.
dnsmichi has attached the following file:
  • icinga.zip (99.03 kB - 91 times downloaded - Last download: Apr 17th 2014, 9:40am)

pitchfork

Administrator

Posts: 19,899

Location: Kassel

Occupation: Sysadmin SAP / Linux / AIX

Number of monitoring servers: 2

Hobbies: Riding motorcycles, when time permits :-)

Nagios Version: 3.2.3 ( OMD )

Distributed monitoring: No

Redundant monitoring: No

Number of hosts: 360

Number of services: 6700

OS: Debian 6.0

Plugin Version: 1.4.x

Other Addons: SNMPTT, NagTrap, check_mk, PNP-0.6.x. Thruk

7

Tuesday, July 17th 2012, 5:56pm


Source code

2012-07-17 14:59:42 [30179] [0] RRDs::update /usr/local/pnp4nagios/var/perfdata/srv-vc01/Processes_-Total_Processes.rrd 1342529970:62.000000
2012-07-17 14:59:42 [30179] [0] RRDs::update ERROR /usr/local/pnp4nagios/var/perfdata/srv-vc01/Processes_-Total_Processes.rrd: illegal attempt to update using time 1342529970 when last update time is 1342529970 (minimum one second step)



When something like this happens, I would first search the log for the timestamp and analyze whether the data really came in twice.

Source code

grep 1342529970 perfdata.log
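
The grep can be narrowed to the question that matters here: was the same RRD file written more than once with the same timestamp? What follows is a minimal Python sketch, assuming the perfdata.log line layout shown in this thread (`RRDs::update <file> <time>:<value>` for attempts, `RRDs::update ERROR ...` for failures). The helper name `count_updates` is made up for illustration and is not part of PNP4Nagios.

```python
import re
from collections import Counter

# Matches update attempts such as
#   "... RRDs::update /path/to/file.rrd 1342529970:62.000000"
# but skips the "RRDs::update ERROR ..." lines.
UPDATE_RE = re.compile(r"RRDs::update (?!ERROR)(\S+\.rrd) (\d+):")


def count_updates(lines, timestamp):
    """Count update attempts per RRD file for one epoch timestamp."""
    counts = Counter()
    for line in lines:
        match = UPDATE_RE.search(line)
        if match and int(match.group(2)) == timestamp:
            counts[match.group(1)] += 1
    return counts


# Sample lines copied from the perfdata.log excerpt in this thread.
sample = [
    "2012-07-17 14:59:42 [30179] [0] RRDs::update /usr/local/pnp4nagios/var/perfdata/srv-vc01/Processes_-Total_Processes.rrd 1342529970:62.000000",
    "2012-07-17 14:59:42 [30179] [0] RRDs::update ERROR /usr/local/pnp4nagios/var/perfdata/srv-vc01/Processes_-Total_Processes.rrd: illegal attempt to update using time 1342529970 when last update time is 1342529970 (minimum one second step)",
]

for rrd, attempts in count_updates(sample, 1342529970).items():
    print(rrd, attempts)
```

If a file shows only one attempt for a timestamp and that attempt still failed, the conflicting write is not visible in the lines that were scanned.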
+++ PNP Developer +++ PNP 0.6.21 is online! +++
+++ Threema ID NBDA3UU8 +++
OMD - Open Monitoring Distribution

stiftmaster

8

Tuesday, July 17th 2012, 6:43pm

grep 1342529970 /usr/local/pnp4nagios/var/perfdata.log

Source code

2012-07-17 14:59:42 [30180] [0] RRDs::update ERROR /usr/local/pnp4nagios/var/perfdata/srv-ns01/Memory_Free.rrd: illegal attempt to update using time 1342529950 when last update time is 1342529970 (minimum one second step)
2012-07-17 14:59:42 [30180] [0] RRDs::update ERROR /usr/local/pnp4nagios/var/perfdata/srv-prn01/CPU_Load.rrd: illegal attempt to update using time 1342529950 when last update time is 1342529970 (minimum one second step)
2012-07-17 14:59:42 [30179] [0] RRDs::update /usr/local/pnp4nagios/var/perfdata/srv-vc01/Processes_-Total_Processes.rrd 1342529970:62.000000
2012-07-17 14:59:42 [30179] [0] RRDs::update ERROR /usr/local/pnp4nagios/var/perfdata/srv-vc01/Processes_-Total_Processes.rrd: illegal attempt to update using time 1342529970 when last update time is 1342529970 (minimum one second step)

dnsmichi

9

Tuesday, July 17th 2012, 6:50pm

May I see the service definitions for the checks you listed? Are any addons installed that hook into this check mechanism?

pitchfork

10

Tuesday, July 17th 2012, 6:51pm

That doesn't really make sense, does it?

By the way, the number in square brackets is the PID of process_perfdata.pl.

stiftmaster

11

Tuesday, July 17th 2012, 7:25pm

For srv-icinga01, for example.
(No further addons are configured.)

Source code

define service{
	use                             	24x7-service-check-one-times-per-minute,pnp-service
	host_name                            	srv-icinga01
	service_description             	Network Netstat - SSH Connections
	check_command				check_local_netstat!22!4!6!22
}


Source code

define service{
	name       				pnp-service
   	register  				0
   	action_url 				/pnp4nagios/index.php/graph?host=$HOSTNAME$&srv=$SERVICEDESC$' class='tips' rel='/pnp4nagios/index.php/popup?host=$HOSTNAME$&srv=$SERVICEDESC$
}


Source code

define service{
	name					24x7-service-check-one-times-per-minute
	use					generic-service
	max_check_attempts			3
	normal_check_interva			1
	retry_check_interval			2
	register				0
}


Source code

define service{
	name					generic-service
	active_checks_enabled			1 
	passive_checks_enabled			1
	parallelize_check 			1
	obsess_over_service			1
	check_freshness				0
	notifications_enabled			1
	event_handler_enabled			1
	flap_detection_enabled			1
	failure_prediction_enabled		1
	retain_status_information		1
	retain_nonstatus_information		1
	is_volatile				0
	check_period				24x7
	max_check_attempts			3
	normal_check_interval 			5
	retry_check_interval			2
	contact_groups				admins
	notification_options			w,c,r,s
	notification_interval			1440
	notification_period			24x7
	process_perf_data			1
	register				0
}

dnsmichi

12

Wednesday, July 18th 2012, 12:23am


define service{
name 24x7-service-check-one-times-per-minute
use generic-service
max_check_attempts 3
normal_check_interva 1
retry_check_interval 2
register 0
}


Is that a copy-paste error, or is it really like that?

Another question: why check a service every minute, only to recheck it every 2 minutes when it fails?
Apart from that, my suspicion that we are dealing with passive-only checks here has apparently vanished into thin air.

As for the problem itself: around 1342534550 I see masses of check results being read in. For that I would be interested in the icinga.cfg, in particular the reaping interval.

Source code

# egrep -v "^#|^$" icinga.cfg


Judging by the execution and the relatively low latency, the check is scheduled and is back in the queue just 5 seconds later. I would assume that the reaping interval is around 5 seconds.

Source code

[1342534555.318764] [024.1] [pid=30139] Run a few checks before executing a service check for 'Network Netstat - SSH Connections'.
[1342534555.318797] [016.0] [pid=30139] Attempting to run scheduled check of service 'Network Netstat - SSH Connections' on host 'srv-san12': check options=0, latency=0.318000
[1342534555.318819] [016.0] [pid=30139] Checking service 'Network Netstat - SSH Connections' on host 'srv-san12'...
[1342534555.318883] [016.1] [pid=30139] Check result output will be written to '/tmp/checkNmVFHY' (fd=13)

[1342534560.105679] [016.1] [pid=30139] Processing check result file: '/usr/local/icinga/var/spool/checkresults/cHCx47l'

[1342534560.119571] [016.0] [pid=30139] ** Handling check result for service 'Network Netstat - SSH Connections' on host 'srv-san12'...
[1342534560.119577] [016.1] [pid=30139] HOST: srv-san12, SERVICE: Network Netstat - SSH Connections, CHECK TYPE: Active, OPTIONS: 0, SCHEDULED: Yes, RESCHEDULE: Yes, EXITED OK: Yes, RETURN CODE: 0, OUTPUT: OK - tcp22_in is 0 | tcp22_in=0\n
[1342534560.119590] [016.1] [pid=30139] Service is OK.
[1342534560.119594] [016.1] [pid=30139] Service did not change state.
[1342534560.119602] [016.1] [pid=30139] Rescheduling next check of service at Tue Jul 17 16:16:55 2012
[1342534560.119610] [016.0] [pid=30139] Scheduling a non-forced, active check of service 'Network Netstat - SSH Connections' on host 'srv-san12' @ Tue Jul 17 16:16:55 2012
[1342534560.119675] [016.1] [pid=30139] Checking service 'Network Netstat - SSH Connections' on host 'srv-san12' for flapping...
[1342534560.119681] [016.1] [pid=30139] Service is not flapping (0.00% state change).
[1342534560.119685] [016.1] [pid=30139] Checking host 'srv-san12' for flapping...
[1342534560.119689] [016.1] [pid=30139] Host is not flapping (0.00% state change).
[1342534560.119718] [016.1] [pid=30139] Deleted check result file '/usr/local/icinga/var/spool/checkresults/cHCx47l'
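
The estimate of roughly 5 seconds can be read off two timestamps in the excerpt above: the line announcing where the check result will be written, and the line where the result is handled. Here is a small Python sketch of that arithmetic, using `Decimal` so the microsecond digits stay exact. The variable names are illustrative only.

```python
from decimal import Decimal

# Timestamps copied from the icinga.debug excerpt above.
check_started = Decimal("1342534555.318883")   # "Check result output will be written to ..."
result_handled = Decimal("1342534560.119571")  # "** Handling check result for service ..."

delay = result_handled - check_started
print(f"result handled {delay} s after the check was started")
```

The delay also contains the plugin's own run time, so it only bounds the reaping interval loosely.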


So this looks very normal to me. The service of course exists many times over because it is assigned to many different hosts, and each host/service relation is unique in itself.

pitchfork

13

Wednesday, July 18th 2012, 7:17am

@Michael

Let's analyze the PNP side first.

@stiftmaster

I find it odd that rrdtool complains that an entry already exists for a given timestamp, yet that timestamp does not show up twice in the log.
It is also odd that the log contains entries at the same time from two process_perfdata.pl processes.

With that in mind, you should examine your PNP logs more closely.

striep

Professional

Posts: 686

Birthday: Jul 20th 1962 (51)

Gender: male

Location: Buxtehude, yes, it really exists ;-)

Number of monitoring servers: 4

Hobbies: Paracord, RaspberryPi

Nagios Version: --

Icinga Version: 1.8.4, OMD 0.56 & 1.x

Distributed monitoring: Yes

Redundant monitoring: Yes

Number of hosts: 561

Number of services: 1921

OS: ubuntu, CentOS 6.x, Solaris, Windows

Plugin Version: 1.4.x

IDO-Version: 1.8.4 but not live

Other Addons: pnp 0.6.15/18, NSClient++ 0.3.8, SEC, NSCA

14

Wednesday, July 18th 2012, 9:24am

Could this be the same issue I had?
NPCD is in use here as well.

Maybe the NPCD log file will help.
~~ Never touch a running system ~~
~~ Never run a touchy system ~~

bern

Sage

Posts: 3,290

Number of monitoring servers: 2-5

Nagios Version: 3.x

Distributed monitoring: No

Redundant monitoring: No

Number of hosts: 80-200

Number of services: 1400-2000

OS: Linux

Plugin Version: Whatever I can download, patch, or cobble together myself :-)

Other Addons: n2rrd, PNP, livestatus

15

Wednesday, July 18th 2012, 9:41am

Note: the timestamps involved here are not all exactly the same:
14:59:42 [30180] using time 1342529950 when last update time is 1342529970
14:59:42 [30180] using time 1342529950 when last update time is 1342529970
14:59:42 [30179] [0] RRDs::update 1342529970:62.000000
14:59:42 [30179] using time 1342529970 when last update time is 1342529970
(14:59:42 = 1342529982)

So these are events spread over more than 30 seconds. If intervals of 5 s or similar really are configured, something is stuck on the disk/filesystem side.
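
The conversion in parentheses can be reproduced as follows. This is a short Python sketch, assuming the log's wall-clock times are in UTC+2 (Central European Summer Time). That offset is inferred from the equation above and is not stated in the log itself.

```python
from datetime import datetime, timedelta, timezone

# Assumption: perfdata.log uses local time at UTC+2 (CEST).
CEST = timezone(timedelta(hours=2))

logged_at = datetime(2012, 7, 17, 14, 59, 42, tzinfo=CEST)
logged_epoch = int(logged_at.timestamp())
print("log line written at epoch", logged_epoch)

# Epoch values taken from the error messages quoted above.
for data_time in (1342529950, 1342529970):
    lag = logged_epoch - data_time
    print(f"data point {data_time} was processed {lag} s after it was measured")
```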

stiftmaster

16

Wednesday, July 18th 2012, 10:08am

Morning, monitorers,

yes, I have adjusted the retry interval :wacko:

Attached are the cfg and the NPCD log.
stiftmaster has attached the following files:
  • icinga.cfg.zip (1.69 kB - 77 times downloaded - Last download: Apr 8th 2014, 7:16pm)
  • npcd.log.zip (208.93 kB - 68 times downloaded - Last download: Mar 20th 2014, 2:22am)

pitchfork

17

Wednesday, July 18th 2012, 10:10am

I don't want to do your work for you and analyze your logs.

We have already given you a great deal of information.

stiftmaster

18

Wednesday, July 18th 2012, 10:15am

I did look at "the" log as well; the cfg was what dnsmichi wanted to see.

striep

19

Wednesday, July 18th 2012, 10:19am

Source code

[07-17-2012 14:59:42] NPCD: A thread was started on thread_counter = 3
[07-17-2012 14:59:42] NPCD: Have to wait: Filecounter = 4 - thread_counter = 4
[07-17-2012 14:59:42] NPCD: Processing file host-perfdata.1342529980 with ID 140098671748864 - going to exec /usr/local/pnp4nagios/libexec/process_perfdata.pl -n -b /usr/local/pnp4nagios/var/spool//host-perfdata.1342529980
[07-17-2012 14:59:42] NPCD: Processing file 'host-perfdata.1342529980'
[07-17-2012 14:59:42] NPCD: Processing file service-perfdata.1342529980 with ID 140098652776192 - going to exec /usr/local/pnp4nagios/libexec/process_perfdata.pl -n -b /usr/local/pnp4nagios/var/spool//service-perfdata.1342529980
[07-17-2012 14:59:42] NPCD: Processing file 'service-perfdata.1342529980'
[07-17-2012 14:59:42] NPCD: Processing file service-perfdata.1342529965 with ID 140098661168896 - going to exec /usr/local/pnp4nagios/libexec/process_perfdata.pl -n -b /usr/local/pnp4nagios/var/spool//service-perfdata.1342529965
[07-17-2012 14:59:42] NPCD: Processing file 'service-perfdata.1342529965'


[07-17-2012 14:59:42] NPCD: Have to wait: Filecounter = 4 - thread_counter = 4

IMHO you need to turn up the thread count.
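
The NPCD excerpt also shows spool files being picked up out of chronological order (`service-perfdata.1342529980` before `service-perfdata.1342529965`). Below is a minimal Python sketch that flags such inversions, assuming the `NPCD: Processing file '<name>.<epoch>'` line format seen above. `find_inversions` is a hypothetical helper, not part of PNP4Nagios, and the thread does not establish that this ordering is the cause.

```python
import re

# Matches lines like: NPCD: Processing file 'service-perfdata.1342529965'
SPOOL_RE = re.compile(r"Processing file '((?:host|service)-perfdata)\.(\d+)'")


def find_inversions(lines):
    """Return (kind, newest_seen, older_started) for out-of-order spool files."""
    newest = {}      # kind -> highest epoch suffix seen so far
    inversions = []
    for line in lines:
        match = SPOOL_RE.search(line)
        if not match:
            continue
        kind, epoch = match.group(1), int(match.group(2))
        if kind in newest and epoch < newest[kind]:
            inversions.append((kind, newest[kind], epoch))
        newest[kind] = max(epoch, newest.get(kind, epoch))
    return inversions


# Sample lines copied from the NPCD log excerpt above.
sample = [
    "[07-17-2012 14:59:42] NPCD: Processing file 'host-perfdata.1342529980'",
    "[07-17-2012 14:59:42] NPCD: Processing file 'service-perfdata.1342529980'",
    "[07-17-2012 14:59:42] NPCD: Processing file 'service-perfdata.1342529965'",
]
print(find_inversions(sample))
```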

pitchfork

20

Wednesday, July 18th 2012, 10:22am

But we have been talking all along about the perfdata.log that is generated by process_perfdata.pl.

Some background:

rrdtool can only append new values to an RRD file.

Quoted

using time 1342529950 when last update time is 1342529970


So here values for 1342529950 are supposed to be written, but values for 1342529970 have already been written.

Hence Jochen's and Stefan's suspicion that process_perfdata.pl is getting somewhat out of step on your system.
Possibly too much I/O wait, or simply bad timing.
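
The append-only rule described above can be modeled in a few lines. This is a toy Python model of the constraint only, not RRDtool's implementation. The class `ToyRrd` and the sample values are invented for illustration. It mirrors the wording of the error message: an update is rejected unless its timestamp is at least one second newer than the last one.

```python
class ToyRrd:
    """Toy model: accepts only timestamps strictly newer than the last update."""

    def __init__(self):
        self.last_update = None
        self.value = None

    def update(self, timestamp, value):
        if self.last_update is not None and timestamp <= self.last_update:
            kind = "duplicate" if timestamp == self.last_update else "stale"
            return (f"rejected ({kind}): using time {timestamp} "
                    f"when last update time is {self.last_update}")
        self.last_update = timestamp
        self.value = value
        return "ok"


rrd = ToyRrd()
results = [
    rrd.update(1342529970, 0.5),  # newer data written first (value is made up)
    rrd.update(1342529950, 0.4),  # older data arrives late
    rrd.update(1342529970, 0.5),  # same second again
]
for result in results:
    print(result)
```

Both failure modes seen in this thread fall out of the same check: a late spool file produces the "stale" case, a repeated timestamp the "duplicate" case.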