Some people, when confronted with a problem, think “I know, I'll send an email w...

marcosdumay · on Feb 10, 2014

I really don't get where you are going with that.

Are you arguing that alerts are useless, and we must fix the issue once and for all? Because if so, I'd point out that some things cannot be fixed (because the Earth is finite, we don't know all things, etc.) and you are better off alerted sooner rather than later.

Now, if you are arguing that email is not the right medium for an alert, well, what medium is better? Really, I can't think of a single candidate. Yeah, email may go down; that's why you complement it with some system external to your network (a VPS is cheap, a couple of them at different providers is almost flawless, and way cheaper than any proprietary dashboard). Yes, there is some delay involved, but it should be a few minutes at most, because you create some addresses specifically for the alerts, and make all hell break loose when a message gets there. Some standard IM protocol that federated between all your net (and external point of control), could be reached from anywhere, and had plenty of support on all kinds of computers would be better, but it does not exist.

alister · on Feb 11, 2014

I got the GP's point immediately: He means that system administrators already get an enormous volume of email. Send them another email and it'll get ignored, deleted, or put at the bottom of a gigantic to-do list.

For airline pilots, an excessive number of warnings (bells, alarms, audible warnings) is itself known to distract the pilots and cause errors.

aryastark · on Feb 11, 2014

I think you're being obtuse.

Once you start sending emails for things, you start sending emails for everything. It's easy to fall into the trap of not accurately categorizing what is critical (like real, real, critical, I mean it this time guys!) and what are merely statuses. So what happens is everything starts being ignored, and your systems become obscure black boxes again.

hueving · on Feb 11, 2014

I think you were the one being obtuse. There is no assumption that you will start receiving useless email status updates. In fact, most reasonable monitoring tools only email when a status changes to a problem state.
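As a rough illustration of "only email when a status changes to a problem state", here is a minimal Python sketch. The class name `TransitionNotifier`, the check name, and the two-state model are all made up for the example and are not taken from any particular monitoring tool; a list stands in for the mailer.

```python
from typing import Callable, Dict

OK, PROBLEM = "ok", "problem"


class TransitionNotifier:
    """Notify only when a check moves from OK to a problem state."""

    def __init__(self, notify: Callable[[str], None]) -> None:
        self._notify = notify             # hypothetical hook; a real one would send mail
        self._last: Dict[str, str] = {}   # last seen state per check name

    def observe(self, check: str, state: str) -> bool:
        """Record a state; return True if a notification was sent."""
        previous = self._last.get(check, OK)  # unseen checks are assumed OK
        self._last[check] = state
        if previous == OK and state == PROBLEM:
            self._notify(f"{check} changed to {state}")
            return True
        return False


sent = []  # collects messages instead of emailing them
notifier = TransitionNotifier(sent.append)
for state in [OK, PROBLEM, PROBLEM, PROBLEM, OK, PROBLEM]:
    notifier.observe("disk /var", state)
```

Six observations produce two messages here, one per OK-to-problem transition; the repeated problem states in between stay quiet.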

dredmorbius · on Feb 11, 2014

>most reasonable monitoring tools

20+ years of experience tells me most monitoring tools aren't reasonable.

hueving · on Feb 12, 2014

Then don't use them? My point is that there is nothing wrong with email alerts, so the statement about them being a problem sounds like a misconfiguration or a failure to understand how to set up email filters.

dredmorbius · on Feb 13, 2014

>there is nothing wrong with email alerts

You're wrong.

As a sysadmin, I typically receive something on the order of 1,000 to 10,000 emails daily (the specifics vary by the system(s) I'm administering). Staying on top of my email stream is a significant part of my job, both in not ignoring critical messages which have been lost, misfiled, or spamfiltered, and in not getting bogged down in verbose messages which convey no real information.

Alerts which tell me nothing have a negative value: they obscure real information, they don't convey useful information, and each person who comes onto the team has to learn "oh, those emails you ignore", write rules to filter or dump them, etc.

Worse: if the alerts might contain useful information, that fact has to be teased out of them.

The problem with emails such as that is that they're logging or reporting data. They should be logged, not emailed, and with appropriate severity (info, warning, error, critical). Log analysis tools can be used to search for and report on issues from there.
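A minimal sketch of "log it with a severity, then report from the log", using Python's standard `logging` module. The logger name `backup`, the messages, and the helper `lines_at_or_above` are invented for illustration; the in-memory buffer stands in for a real log file or syslog.

```python
import io
import logging
from typing import List

buffer = io.StringIO()  # stand-in for a log file; an assumption for the example
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))

log = logging.getLogger("backup")  # hypothetical service name
log.setLevel(logging.INFO)
log.addHandler(handler)
log.propagate = False  # keep the example's output out of the root logger

# Everything is logged with a severity; nothing is emailed at this point.
log.info("nightly backup finished in 412s")
log.warning("backup volume 85% full")
log.error("could not reach replica db2")
log.critical("backup volume full, job aborted")

LEVELS = {
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
    "CRITICAL": logging.CRITICAL,
}


def lines_at_or_above(text: str, minimum: int) -> List[str]:
    """Toy log analysis: keep lines whose leading level name meets the minimum."""
    return [
        line for line in text.splitlines()
        if LEVELS[line.split(" ", 1)[0]] >= minimum
    ]


urgent = lines_at_or_above(buffer.getvalue(), logging.ERROR)
```

Only the two entries at error severity or above end up in `urgent`; the info and warning entries remain in the log for anyone who goes looking, without landing in an inbox.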

As I said: in a mature environment, much of my work goes into removing alerts, alert emails, etc., which are well-intentioned but ultimately useless.

hueving · on Feb 13, 2014

>As a sysadmin, I typically receive something on the order of 1,000 to 10,000 emails daily

Sorry, but you're not a very good sysadmin then. You have chosen poor tools or do not understand how to distill the information. Knowing that, I can see why you think email alerts don't work. They are effectively broken FOR YOU.

ersii · on Feb 15, 2014

And you don't think vendors have a responsibility to reflect upon the way they do alerts and/or service monitoring?

It's usually not the system administrators that get to decide what the Corporate Overlords purchase or who they do business with. So I think it's pretty unfair to blame the admins for "choosing poor tools".

InclinedPlane · on Feb 11, 2014

The point being: delegating prioritization and categorization to a human in real-time is lazy and dangerous. As much as possible, humans should only receive notifications when something requires action or is too complex to determine programmatically.

rhizome · on Feb 11, 2014

>Some standard IM protocol that federated between all your net (and external point of control), could be reached from anywhere, and had plenty of support on all kinds of computers would be better, but it does not exist

I would recommend an SMS sent via GSM modem for out-of-band emergency notifications.

tobych · on Feb 12, 2014

Or a service like Twilio, with an HTTP API for this.

dredmorbius · on Feb 11, 2014

Hospitals have a similar problem -- too many devices with too many alarms. As many as 10,000/day on a busy nursing floor.

NPR covered this a few days back, and I've written on it at more length:

http://www.npr.org/blogs/health/2014/01/24/265702152/silenci...

http://www.reddit.com/r/dredmorbius/comments/1x0p1b/npr_sile...

rmc · on Feb 10, 2014

"What if the email goes down? I know I'll send an email"

mafro · on Feb 11, 2014

Keyboard missing, press F1 to continue

dredmorbius · on Feb 11, 2014

That's actually a case where sending a regular ping mail to several sentinel systems which report on the LACK of an email can be useful.
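The sentinel side of that could look roughly like the Python sketch below. The names (`HeartbeatWatch`, `mail-gw-1`, `mail-gw-2`) and the 300-second threshold are assumptions for illustration; in practice `ping` would be called whenever a ping mail arrives, and timestamps are passed in explicitly here so the behaviour is easy to follow.

```python
from typing import Dict, List


class HeartbeatWatch:
    """Track the last ping time per source and report sources gone quiet."""

    def __init__(self, max_silence: float) -> None:
        self.max_silence = max_silence           # seconds of silence tolerated
        self._last_seen: Dict[str, float] = {}   # source name -> last ping time

    def ping(self, source: str, now: float) -> None:
        """Record that a ping (e.g. a ping mail) arrived from this source."""
        self._last_seen[source] = now

    def silent(self, now: float) -> List[str]:
        """Return the sources whose last ping is older than max_silence."""
        return sorted(
            source for source, seen in self._last_seen.items()
            if now - seen > self.max_silence
        )


watch = HeartbeatWatch(max_silence=300.0)
watch.ping("mail-gw-1", now=0.0)
watch.ping("mail-gw-2", now=0.0)
watch.ping("mail-gw-1", now=280.0)   # gw-1 keeps pinging, gw-2 goes quiet
overdue = watch.silent(now=400.0)
```

At `now=400.0` only `mail-gw-2` is overdue. The sentinel raises the alarm over some channel other than the mail path being watched, which is the whole point of running it outside your own network.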

Reminds me of a few times the email queues got backed up to hell and beyond. Fuck you, Yahoo.