Enterprise Strategy Group | Getting to the bigger truth.™

Distributed Cloud Series:

Observability from Code to Cloud

Scott Sinclair, Practice Director

Rob Strechay, Senior Analyst

February 2022

Table of Contents

Research Introduction and Objectives

There Is Near-universal Pressure on Organizations to Accelerate Operations

As Observability Practices Expand, Complexity and Tool Sprawl Increase

AIOps and Cloud Cost Optimization Tools Burgeon to Accelerate IT and DevOps

Research Objectives

Organizations continue to try to strike the balance between cloud-native and legacy infrastructure. Whether organizations take a “cloud-first” or a “cloud-when-it-makes-sense” approach to their digital transformation initiatives, the number and variety of infrastructure options and locations continue to expand. Specifically, IT operations teams continue to strive to improve collaboration with developers on building modern application architectures and establishing the related processes. As companies accelerate or embark on their digital transformation journeys, what is the expected role of ITSM in enabling businesses to realize the benefits of automation, observability, intelligence, and optimization?

To gain insight into these trends, ESG surveyed 357 IT, DevOps, and application development professionals responsible for evaluating, purchasing, managing, and building application infrastructure at organizations in North America (US and Canada).

This study sought to:

Understand the current state of observability and AIOps-driven automation.

Determine the benefits of automation coupled to observability and intelligence.

Investigate IT operations and developer team interaction and burden of processes.

Understand the future plans and outlook for observability solutions.

Note: Totals in figures and tables throughout this eBook may not add up to 100% due to rounding or organizations choosing more than one answer to select questions.

There Is Near-universal Pressure on Organizations to Accelerate Operations

Almost All Have Had to Accelerate Operations over the Last Three Years, with Increased Variety of Applications Being the Biggest Hurdle to These Efforts

Given the speed with which companies must move to stay ahead of, or at least on pace with, their competition in an overwhelmingly digital business landscape, it is not surprising that 99% of respondent organizations have had to accelerate IT operations over the last three years. Indeed, on average, organizations have had to deploy applications and infrastructure 56% faster than they did three years ago. This has led to challenges for organizations trying to gain visibility into their applications. Specifically, the five most common challenges in accommodating accelerated operations highlight the cost of more fractured and siloed development techniques built on disaggregated IT infrastructure. This is the “DevOps” effect, in which development is pushed as close to the business, and even customers, as possible, while simultaneously creating a divide between applications and their underlying infrastructure.

Rate at which applications, infrastructure, and services need to be deployed compared with three years ago.

Biggest challenges for IT to overcome in order to accelerate operations.

Increased variety of application types

39%

Time necessary to ensure coordination among IT and application development or other line-of-business teams

27%

Lack of visibility into how the application environment maps to the underlying infrastructure environment

27%

The disaggregated nature of IT infrastructure

27%

Security

26%

Automation and Observability Lead Efforts to Accelerate Operations

What are IT organizations doing to overcome the challenges they face in accelerating the operations underlying their companies’ business processes and objectives? Nearly half (47%) are responding to this increased pressure to accelerate both IT operations and DevOps activities by investing in IT automation and AIOps solutions. Additionally, 44% cited the increased adoption of observability and monitoring tools. Other common measures included more localized activities such as investing more in public cloud services and cloud-native architectures and focusing on data center modernization activities.

“Nearly half (47%) are responding to this increased pressure to accelerate both IT operations and DevOps activities by investing in IT automation and AIOps solutions.”

Measures IT is taking to accelerate operations.

Increasing adoption/prioritization of IT automation or AIOps

47%

Increasing adoption/prioritization of observability and monitoring tools

44%

Increasing investment in public cloud services

42%

Increasing investment in cloud-native architectures

41%

Modernizing data center infrastructure to provide improved consolidation/simplified operations

38%

Increasing hiring of IT personnel

37%

Internal executive-level pushback for requirements to allow for sufficient time for proper due diligence

32%

Using third-party managed service providers

31%

As Observability Practices Expand, Complexity and Tool Sprawl Increase

Organizations Are Focusing on Results, Not Green Dashboards

Most organizations understand that having a “green dashboard” in observability applications is no longer enough. While 17% are more focused on “green” metrics, the remainder of organizations map their observability outcomes to business results. Specifically, nearly half are more customer focused, relying on KPIs related to either customer satisfaction (37%) or net promoter scores (12%). The remaining 34% focus on revenue impacts.

Dominant measure of application/user experience optimization as a result of better observability.

“Nearly half are more customer-focused, relying on KPIs related to either customer satisfaction or net promoter scores.”

Even with More Tools, Troubleshooting Is Burdensome

The continued complexity of tool sprawl is real and has no end in sight, with two-thirds of organizations leveraging more than ten different observability tools, averaging out to 12 per organization. This is impacting the ability to maintain and troubleshoot internally developed, custom applications, with more than half of organizations identifying the correlation of metrics data from multiple tools as burdensome (28%) or extremely burdensome (26%). The disparate tool situation also leads to a corresponding excess of alerts and logs, making the identification of valuable information a burdensome (36%) or extremely burdensome (22%) task.
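The "more than half" claims above follow from adding the two burden levels for each task. As a quick check, the short Python sketch below totals the shares quoted in the paragraph; the dictionary keys are shorthand labels introduced here for illustration, and because the published figures are rounded, the sums should be read as approximate.

```python
# Shares quoted in the paragraph above (percent of respondents).
# The task labels are illustrative shorthand, not ESG's survey wording.
burden = {
    "correlating metrics data from multiple tools": {
        "burdensome": 28,
        "extremely burdensome": 26,
    },
    "identifying valuable information in alerts and logs": {
        "burdensome": 36,
        "extremely burdensome": 22,
    },
}


def combined_share(levels):
    """Add both burden levels to get the share reporting any burden."""
    return sum(levels.values())


combined = {task: combined_share(levels) for task, levels in burden.items()}

for task, share in combined.items():
    print(f"{task}: {share}% report some level of burden")
```

On these figures, roughly 54% find metric correlation burdensome to some degree and roughly 58% say the same about sifting alerts and logs.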

Total observability tools used to collect data from application environment.

Level of burden created by multiple observability tools when maintaining/troubleshooting internally developed, custom applications.

Extremely burdensome (i.e., major headache that consumes multiple people)
Burdensome (i.e., complex and time consuming)

Top Observability Priorities Include Seeing the Entire Application Stack for Performance, Root Cause, and Security

When organizations develop their observability strategy, the focus is typically on creating visibility across the entire application supply and support chains. Indeed, nearly half (48%) of respondents cite real-time insights to ensure SLA and performance commitments are met as one of their most important observability priorities, and the same share (48%) cite insights that accelerate fault isolation, root cause analysis, and resolution.

To achieve these goals, organizations are using a wide variety of monitoring and observability tools. Notably, nearly two-thirds (65%) of decision makers are turning to third-party tools and services to help them achieve these objectives.

Most important IT monitoring and observability priorities.

Providing real-time insights into application and/or infrastructure environments to ensure that service level agreement and performance commitments are met

48%

Providing insights into application and/or infrastructure environments to assist with tracing, accelerated fault isolation, root cause analysis, and resolution

48%

Providing insights to improve security posture/help with vulnerability detection and impact analysis

41%

Providing insights into application and/or infrastructure environments to automate operations

35%

Providing insights into resource cost attribution and cost optimization

34%

Providing digital experience or end-user monitoring

33%

Ensuring adherence to regulatory compliance requirements

32%

65% of respondents use third-party tools or cloud services for monitoring or observability.

Observability at Scale Is Still a Struggle

One of the attributes of modern application development is the ability to build and run anywhere at scale, especially in a containerized, microservices-based architecture. But scalability and reliability are hard to automate effectively, and organizations are struggling to instrument their applications. Operating at scale is difficult even for the most sophisticated organizations. This is the case not only for the applications themselves, but also during initial deployments and ongoing support of observability solutions.

Most common challenges deploying observability solutions.

Most common challenges supporting observability solutions.

Scalability and reliability of solutions

27%

Automation too complex to implement effectively

25%

Cultural resistance from organizationally separate teams

25%

Instrumenting applications and infrastructure for observability too time consuming

24%

Lack of visibility into edge environments and remote locations

24%

Scalability and reliability of solutions

23%

Inability to correlate data from multiple sources in a timely fashion

22%

Lack of visibility into cloud-native, container-based application environments

22%

Pace of technology change too difficult to effectively manage

20%

Lack of visibility across public and/or hybrid cloud environments

20%

Prioritizing severity of issues/alerts when presented

20%

Observability Benefits Beyond IT: More Secure with Higher Quality Software

Organizations are looking to observability to help with collaboration between different functions. In terms of security, organizations are looking to move toward building a tighter alignment between security teams and application developers, and more than half (53%) report that observability tools have improved collaboration between these groups. When it comes to improving application development environments, observability tools are linked to quality control. Specifically, more than one-quarter (29%) have realized improved software quality, while 28% have seen better problem management and root cause analysis from their observability solutions.

Top cybersecurity benefits realized from observability solutions.

Top app-dev benefits realized from observability solutions.

Improved collaboration between security team and application developers

53%

Improved ability to detect security-related signals in observability data

52%

Improved operational efficiency

47%

Ability to better prioritize vulnerabilities

46%

Elimination of separate observability and security controls

43%

Improved software quality

29%

Better problem management and root cause analysis

28%

Improved ability to make ongoing application updates

26%

Reduced mean time to resolution

26%

Improved on-call experience for engineers

25%

Accelerated application rollout/deployment times

25%

AIOps and Cloud Cost Optimization Tools Burgeon to Accelerate IT and DevOps

Organizations Are Using Many Tools Today and Will Deepen Those Investments Over the Next Year

With the average organization utilizing 12 different observability tools, many seem to be taking a “belt and suspenders” approach to monitoring, with IT operations and DevOps groups often using their own tool sets. In terms of the types of monitoring and observability tools and services in use today, more than half (52%) cite cloud monitoring. The other most commonly leveraged solutions include network performance (45%), security (44%), log analytics (42%), and application performance (40%) monitoring. While most of these areas are expected to remain flat in terms of net-new or additional investments over the next 24 months, more than one-quarter expect to spend on AIOps technology.

Monitoring and observability tools and services.

First figure: third-party monitoring/observability tool functional capabilities used today. Second figure: monitoring/observability tools or services organizations will add or further invest in over the next 24 months.

Cloud monitoring: 52%, 42%
Network performance monitoring: 45%, 34%
Security monitoring: 44%, 39%
Log analytics/monitoring: 42%, 26%
Application performance monitoring: 40%, 34%
End-user monitoring: 38%, 29%
API monitoring: 37%, 37%
Infrastructure monitoring: 35%, 30%
Cost optimization: 34%, 31%
ITSM: 31%, 28%
Distributed tracing: 28%, 22%
AIOps: 22%, 28%

AIOps Means Many Different Things to Organizations

This research made it clear that AIOps means a lot of different things to different organizations. Artificial intelligence for IT operations (AIOps) is an umbrella term for the use of analytics, machine learning (ML), and artificial intelligence (AI) technologies to automate the identification and resolution of common IT issues. AI is typically used to find linkages between systems that previously went unseen because the underlying data was not formatted for ingestion. While respondents report a variety of benefits, the most commonly cited include hardware infrastructure optimization (39%), alerting DevOps (38%), and application placement and issue resolution (38%). Organizations will have to invest the time to determine which AIOps solutions best align with their use cases.
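To make one of these use cases concrete, the sketch below illustrates what "automatically correlated alerts and events" can look like at its very simplest. It is an illustration only, not a description of any vendor's product: it uses a plain time-window rule rather than machine learning, and the `Alert` type, the tool and service names, and the 60-second window are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    source: str       # hypothetical name of the tool that raised the alert
    service: str      # hypothetical name of the affected service
    timestamp: float  # seconds since an arbitrary start point


def correlate(alerts, window_seconds=60.0):
    """Group alerts for the same service that arrive close together in time.

    An alert joins the most recent group for its service when it arrives
    within window_seconds of that group's latest alert; otherwise it
    starts a new group. This is a simple rule, not an ML model.
    """
    groups = []
    latest_group = {}  # service -> index of its most recent group
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        idx = latest_group.get(alert.service)
        if idx is not None and alert.timestamp - groups[idx][-1].timestamp <= window_seconds:
            groups[idx].append(alert)
        else:
            groups.append([alert])
            latest_group[alert.service] = len(groups) - 1
    return groups


alerts = [
    Alert("apm", "checkout", 0.0),
    Alert("logs", "checkout", 30.0),
    Alert("network", "checkout", 200.0),
    Alert("apm", "search", 10.0),
]
groups = correlate(alerts)
for group in groups:
    print(group[0].service, [a.source for a in group])
```

Here the two checkout alerts raised 30 seconds apart by different tools collapse into a single group, which is the kind of noise reduction respondents describe. Commercial AIOps tools typically go well beyond this, for example by learning relationships between services rather than relying on a fixed window.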

Benefits realized from AIOps.

We have deployed infrastructure systems with artificial intelligence integrated to improve optimization within the system
39%
We leverage AI to help support DevOps with automatically correlated alerts and events
38%
We have tools/systems in place that leverage AI to provide recommendations for application placement and to accelerate issue resolution
38%
We leverage AI to help support DevOps with root cause analysis guidance
37%
We leverage AI-enhanced observability tools that interpret metrics to anticipate the timing and severity of potential issues
36%
We leverage AI for IT system log file error analysis
35%
We leverage AI to help protect against cybersecurity threats
33%
We leverage AI to help support DevOps with accelerated anomaly detection
32%
We leverage AI to provide predictive or preventative maintenance
30%
We have tools/systems in place that leverage AI to automate provisioning of new infrastructure resources
28%
We leverage AI for text-based automated bots/chatbots
27%

Broad and Recent Adoption of Cloud Cost Optimization Tools

Cloud cost optimization is an activity that organizations are increasingly pursuing to ensure their investments in these services are living up to the economic benefits often promised as part of their value proposition. This is becoming an even more important consideration as more organizations pursue hybrid cloud strategies in which they are constantly weighing whether to run their applications on public cloud infrastructure or in their own on-premises data centers. Indeed, 86% of respondent organizations are currently leveraging a third-party tool to optimize cloud costs, and nearly half (46%) have been doing so for more than a year. Among these organizations using cloud optimization technology, 94% realized significant initial savings with these tools, and nearly half (45%) continued to see significant savings in subsequent months.
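A few figures implied by the paragraph above can be derived with simple arithmetic. The Python sketch below does so under two assumptions that are not stated in the report: that every current user falls into one of the two adoption groups (more than 12 months, or within the last 12 months), and that the 94% applies to the 86% of respondents using a tool. The derived values are estimates, not ESG-published results, and rounding in the source figures may shift them by a point.

```python
# Figures quoted in the paragraph above (percent).
using_tool = 86            # respondents currently using a third-party tool
using_over_12_months = 46  # respondents using one for more than a year
initial_savings = 94       # share of tool users with significant initial savings

# Assumption: current users split into exactly two adoption groups.
started_last_12_months = using_tool - using_over_12_months

# Respondents not currently using a third-party cost optimization tool.
not_using = 100 - using_tool

# Assumption: the 94% is a share of tool users, so rescale to all respondents.
initial_savings_all = round(using_tool * initial_savings / 100)

print(f"Started in the last 12 months: about {started_last_12_months}%")
print(f"Not using a tool: about {not_using}%")
print(f"All respondents with significant initial savings: about {initial_savings_all}%")
```

On these assumptions, about 40% of respondents adopted a cloud cost optimization tool within the past year, which is consistent with the section's point that adoption is both broad and recent.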

Cloud cost optimization tool use.

We have been using a cloud cost optimization tool for over 12 months
We have started using a cloud cost optimization tool in the last 12 months

Realization of cloud cost optimization tool benefits.

We realized significant initial savings and continue to realize significant savings each subsequent month
We realized significant initial savings and then modest savings over time

For almost 40 years, Cisco has inspired new possibilities by reimagining applications and transforming infrastructure. Built on this legacy, Calisti Service Mesh Manager automates lifecycle management and simplifies connectivity, security, and observability for microservices-based and cloud-native applications.

Research Methodology

To gather data for this report, ESG conducted a comprehensive online survey of IT, DevOps, and application development professionals from private- and public-sector organizations in North America (United States and Canada) between November 15, 2021 and November 20, 2021. To qualify for this survey, respondents were required to be personally responsible for evaluating, purchasing, building, and managing application infrastructure. Additionally, all qualifying organizations were required to employ, or plan to employ, an observability practice. All respondents were provided an incentive to complete the survey in the form of cash awards and/or cash equivalents.

After filtering out unqualified respondents, removing duplicate responses, and screening the remaining completed responses (on a number of criteria) for data integrity, we were left with a final total sample of 357 IT, DevOps, and application development professionals.

Respondents by Number of Employees

Respondents by Age of Company

Respondents by Industry