Operations | Monitoring | ITSM | DevOps | Cloud


You Can Solve the Application Waste Problem

If you’re like most companies running large-scale, data-intensive workloads in the cloud, you’ve realized that your environment contains a significant amount of waste. Smart organizations implement a host of FinOps activities to address this waste and the cost it incurs, such as: … and the list goes on. These are infrastructure-level optimizations.

How to Display Grafana Alerts to Your Dashboards | Grafana

💡 Did you know you can display Grafana alerts on your dashboards? Join Senior Developer Advocate Marie Cruz in this quick tutorial to learn how to configure a Grafana alert and link it to your dashboard and panel. ☁️ Grafana Cloud is the easiest way to get started with Grafana dashboards, metrics, logs, and traces.

Handling IT Procurement And Inventory of New Hardware And Software

Software and hardware acquisition is the first stage of the IT asset lifecycle and should be carried out in a cost-effective, organized way. To achieve that, you must put a well-defined procurement process in place. With it, you'll have a unified and normalized inventory that shows which IT assets you have in stock and where they are located, giving you reliable insight into what you need to purchase and what is unused or can be relocated.

Observe, Automate and Optimize | SolarWinds Day Virtual Event

You can’t manage what you can’t monitor and observe. IT ecosystem complexity is a part of operating in a hybrid multi-cloud, containerized microservices, digital transformation world, and the complexity is not magically going away. This virtual event shows how SolarWinds is solving what others can’t – abstracting the complexity, increasing visibility, and automating remediation across on-premises, hybrid, and cloud-native estates.

Navigating IT Incidents - The Role Of The Status Page

At any moment, a small failure at any point in your complex web of IT systems can trigger an outage. Proactively establishing a method of clear and timely end-user communication is therefore the crux of effective incident response. For large organizations, these moments of downtime not only carry a massive opportunity cost, but also test the resilience of their operations.

Easy Guide to Monitor Jenkins Jobs Using Telegraf and MetricFire

Monitoring Jenkins jobs and nodes is foundational to maintaining a robust, efficient, and secure CI/CD pipeline. It enables DevOps teams to stay proactive about system health, optimize performance, manage resources effectively, and adhere to security and compliance standards. In this article, we'll detail how to use the Telegraf agent to collect performance metrics from your Jenkins environment and forward them to a data source.

Introducing Process Exhaustion: How to scale your services without overwhelming your systems

We rarely think about how many processes are running on our systems. Modern CPUs are powerful enough to run thousands of processes concurrently, but at what point do our systems become oversaturated? When you’re running large-scale distributed applications, you might reach this limit sooner than you'd expect. How can you determine what that limit is, and how does that affect the number and complexity of the workloads you deploy?