The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.
Your payment systems have slowed to a crawl, customers are getting impatient and abandoning their shopping carts both online and in stores, and you’re losing money every minute this problem goes on. Behind the scenes, technical responders are scrambling to resolve the issue before it impacts more customers—and before even more money is lost.
More and more companies are moving business-critical communication tools into cloud-based environments. These can be hosted in a partner datacenter or in a public cloud. The main deciding factors between the two options are trust in the provider and the cost of the solution.
How many servers can one system administrator manage? The question is hard to answer because it depends heavily on the tasks that need to be performed. It is clear, however, that the number of servers one engineer can manage has increased tremendously over time and is still growing. Public and private clouds, in combination with automation tools, enable us to automate many daily tasks. In a modern IT infrastructure almost everything can, and should, be automated.
Email alerting is an inefficient way to receive and address critical alerts. Email inboxes tend to get flooded with “clutter,” as irrelevant messages bury urgent incident notifications. Incident management procedures require incident management systems, ensuring that urgent issues are immediately addressed. Yet, some services are reluctant to say goodbye to email alerting and its inefficiencies. This is the case with Google Voice, which recently solidified its commitment to email alerting.
In our recent “IT Ops Demystified – Event Chaos or Enrichment?” webinar, our field CTOs discuss how enrichment can help reduce operational costs by an order of magnitude. Here is a quick overview of what you’ll see.
Site Reliability Engineering (SRE) is a practice for managing the reliability of systems that began at Google in the early 2000s. Ben Treynor Sloss from Google started the first SRE team and coined the name.
I can tell you the day I knew I would be a Systems Administrator (the term SRE hadn’t been invented yet). My Linux professor, a brilliant engineer at NASA, said: "The best system administrators are the laziest." He went on to qualify that statement, but I had stopped listening. My fate was sealed.