Operations | Monitoring | ITSM | DevOps | Cloud

Expand Kubernetes Monitoring with Telegraf Operator

Monitoring is a critical aspect of cloud computing. At any time, you need to know what’s working, what isn’t, and have the ability to respond to changes occurring in a given environment. Effective monitoring begins with the ability to collect performance data from across an ecosystem and present it in a useful way. So the easier it is to manage monitoring data across an ecosystem, the more effective those monitoring solutions are and the more efficient that ecosystem is.

Best Practices to implement in Incident Management

There are roughly five stages of an incident: (1) assess impact, (2) inform customers (status page), (3) identify the issue, (4) mitigate the issue, and (5) resolve the incident. Then there’s follow-up and further work. It is also important to note that (2) should be ongoing as you progress: update the status page at reasonable intervals, e.g. every 15-20 minutes, unless you specify otherwise.

Top 10 Data Center Management Trends of 2021

What a difference a year makes. Each year, data center, lab, and edge sites become more complex, more distributed, and more difficult to manage. Managers must stay up to date on the latest trends and advancements in data center management best practices and technologies to maintain uptime, increase the efficiency of capacity utilization, and improve the productivity of people.

What can SREs do to make holiday season's peak traffic less chaotic?

Holiday season's peak traffic is the most challenging period for SREs and on-call engineers. In this blog, we have highlighted the things that SREs can do to make the holiday season less chaotic. The recently concluded Black Friday weekend could well have been the most challenging shift for on-call engineers working in the retail or e-commerce sector. Since such peak-traffic events push systems to their limits, engineering teams are under a lot of pressure preparing for them.

New Full Page Check upgrades support end user demand for richer features

Software upgrades are typically about delivering improvements that enhance the end-user experience, offer greater efficiency, and provide a more feature-rich product. These are also some of the reasons why Uptrends has released a new version of the Full Page Check monitor, which offers many benefits over the previous version. The demand for more metrics has grown over time, not only for how the elements load but also for how the page is presented to end users.

How to Use Rapid Experimentation To Improve Big Data Adaptability?

A serious shift toward rapid experimentation is the need of the day for most firms interested in reaping the potential benefits of Big Data and charting a wise and clear path to the changeover. Big Data has been a topic of discussion for a few years; it is a way for businesses to acquire large amounts of data on their customers and work with that data while respecting customer privacy and adhering to ethical guidelines.

State of IT Management Survey Report 2020-21

As we continue to adapt following the pandemic, which has impacted us all both personally and professionally, we take this moment to commemorate the IT veterans we've lost to the pandemic. With the pandemic drastically changing the way we do business, we have conducted a study to understand the state of IT management at the height of these radical changes and analyzed how to offer a holistic approach to changing IT management needs to prepare for the post-pandemic IT world.