
The latest news and information on AIOps, alerting in complex systems, and related technologies.

Topology for Incident Causation and Machine Learning within AIOps

Our thinking about, and use of, topology within Broadcom's AIOps and Observability solutions have advanced significantly in recent years, while building solidly on our innovative domain tools. We're publishing a blog post series to communicate these innovations, advancements, and benefits for IT operations. This post continues where the previous one left off.

Networking Field Day 35: Selector AI Demo Part 2

In this demo, a user leverages Selector's conversational AI, Selector Copilot, to investigate performance within their network infrastructure. The user first probes the health of tenants located in a specific geographic region. Selector Copilot provides a visualization of the current state and a summary of the overall condition and affected tenants, along with the probable root cause. The user then interacts with Selector Copilot to explore resource allocation, historical usage, and projected bandwidth. Each visualization provided by Selector Copilot can be copied and pasted onto a dedicated dashboard.

Networking Field Day 35: Democratization of Data Access Using Network LLMs with Selector AI

In this brief demo of the Selector platform, a user interacts with Selector Copilot to explore behavior within their network infrastructure. They first look into the latency of their transit routers, revealing a regional issue. The user drills down into network topology information to further investigate the latency, where they access details about devices, interfaces, sites, and circuits. Selector Copilot is then leveraged to surface circuit errors. Notably, each visualization provided by Selector Copilot can be copied and pasted onto a dedicated dashboard.

Networking Field Day 35: Selector AI Alerting Discussion with Nitin Kumar

Selector delivers consolidated, actionable alerts through your preferred collaboration platform, such as Slack or Teams. Alerts depend on Selector's powerful event correlation fueled by advanced AI/ML techniques. Automations can be leveraged to generate service tickets that include detailed summaries, root cause analysis, and even suggested remediations.

Why Observability is Critical to Cyber Resilience

Whether an enterprise operates in technology, healthcare, financial services, or another business vertical, cybersecurity must remain top of mind. In addition to complying with the numerous cybersecurity regulations and frameworks, such as the NIST Cybersecurity Framework, GDPR, and other mandates, enterprises must prioritize cybersecurity to mitigate downtime, protect sensitive data, and uphold customer trust and brand reputation.

Networking Field Day 35: Selector AI Introduction with Debashis Mohanty

Selector's customer base includes 50 deployments across service providers as well as large enterprises in retail, media distribution, colocation services, and multi-cloud networking services. These customers aim to correlate events across their network, applications, and infrastructure; eliminate the need for human intervention in root cause analysis (RCA) and remediation; and democratize access to insights using conversational natural language interfaces. Selector delivers on these outcomes, while accelerating incident remediation through smart, actionable alerting and a GenAI-based conversational interface.

Networking Field Day 35: Solving the Query Problem with Selector AI

Selector translates English phrases to SQL queries through the use of an LLM. Each SQL query includes the table, or data set to be searched, along with filters, or conditions which prune the search results. We walk through a number of SQL queries and sample search results, before considering the LLM-based translation of a sample English phrase processed by Selector.
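To make the table-plus-filters structure concrete, here is a minimal Python sketch using the standard library's sqlite3 module. The `interface_errors` table, its columns, the sample rows, and the `build_query` helper are all hypothetical, chosen only for illustration; they are not Selector's schema or API, and the sketch leaves out the LLM translation step entirely.

```python
import sqlite3

def build_query(table, filters):
    """Assemble a parameterized SELECT from a table name and equality filters."""
    # Table and column names cannot be bound as SQL parameters, so accept
    # only plain identifiers to avoid injecting arbitrary SQL.
    for name in [table, *filters]:
        if not name.isidentifier():
            raise ValueError(f"unsafe identifier: {name!r}")
    # Each filter becomes one "column = ?" condition; values are bound later.
    where = " AND ".join(f"{col} = ?" for col in filters)
    sql = f"SELECT * FROM {table}" + (f" WHERE {where}" if where else "")
    return sql, list(filters.values())

# Hypothetical data set standing in for a table of network telemetry.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE interface_errors (device TEXT, site TEXT, errors INTEGER)")
conn.executemany(
    "INSERT INTO interface_errors VALUES (?, ?, ?)",
    [("rtr-1", "sjc", 12), ("rtr-2", "sjc", 0), ("rtr-3", "nyc", 7)],
)

# The table names the data set to search; the filter prunes the results.
sql, params = build_query("interface_errors", {"site": "sjc"})
rows = conn.execute(sql, params).fetchall()
```

Here the table selects which data set is searched, and the single filter on `site` prunes the result to the rows for that site. Binding filter values as parameters, rather than formatting them into the string, keeps user-supplied text out of the SQL itself.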

Networking Field Day 35: Selector AI and the Workings of an LLM

An LLM differs from a function in that it takes output and imputes, or infers, a function and its arguments. We first consider how this process works within Selector for an English phrase converted to a query. We then step through the design of Selector's LLM, which relies on a base LLM trained with English phrases and SQL translation, then fine-tuned, on-premises, with customer-specific entities. In this way, each of Selector's deployments relies on an LLM tailored to the customer at hand.
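The shape of that inference, a phrase going in and a function with its arguments coming out, can be sketched in Python. This is only an illustration: a plain keyword lookup stands in for the LLM, and the `InferredCall` type, the `METRICS` vocabulary, and the `CUSTOMER_ENTITIES` table are hypothetical names invented here. The separate entity table loosely mirrors the idea of a shared base model combined with customer-specific entities.

```python
from dataclasses import dataclass, field

@dataclass
class InferredCall:
    """A function name plus the arguments inferred from a phrase."""
    function: str
    arguments: dict = field(default_factory=dict)

# Hypothetical shared vocabulary: words that identify the function to run.
METRICS = {"latency": "latency", "errors": "errors"}

# Hypothetical customer-specific entities: token -> (argument name, value).
CUSTOMER_ENTITIES = {
    "sjc": ("site", "sjc"),
    "rtr-1": ("device", "rtr-1"),
}

def infer_call(phrase):
    """Toy stand-in for the LLM: map an English phrase to function + arguments."""
    tokens = phrase.lower().replace("?", "").split()
    # The first recognized metric word selects the function.
    function = next((METRICS[t] for t in tokens if t in METRICS), None)
    if function is None:
        raise ValueError("no known metric in phrase")
    # Every recognized entity contributes one named argument.
    arguments = dict(CUSTOMER_ENTITIES[t] for t in tokens if t in CUSTOMER_ENTITIES)
    return InferredCall(function, arguments)

call = infer_call("Show latency for rtr-1 in sjc")
```

For this phrase, `call.function` is `"latency"` and `call.arguments` is `{"device": "rtr-1", "site": "sjc"}`. A real model generalizes far beyond exact token matches; the sketch shows only the structured result that a query could then be built from.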

AI-powered incident management copilots: A guide

All eyes are on generative AI. Enterprise IT teams are looking to GenAI to translate the high volume of data from their services architecture into actionable insights. The goal: improve operational efficiency and quality of work. But it's challenging to sort through the hype (and confusion) to identify which vendors have GenAI capabilities that can provide true impact and value to their IT and service operations. One capability in particular is AI-powered copilots.

Improving documentation with content reuse

Anyone who’s worked in a customer-facing role knows the pressure to find the correct answers quickly. Emotions are high when something is broken, or there’s an outage. The customer is angry. You’re stressed. And your boss is watching and wondering why the problem hasn’t been fixed. You need to troubleshoot quickly and provide the right information ASAP. As a support professional, you want to give customers and stakeholders the best possible experience.