Cloud-based overhaul for on-demand telemetry data collection

By Chethana Vedanthi

Web-based microservices architectures play a major role for IT teams that support enterprise applications today, especially for NetApp IT’s business app developers.

Using their DevOps framework, the IT team set out to build cloud-native apps using microservices architectures running in containers. They started by rebuilding NetApp’s award-winning Support site, the main channel for NetApp customers who need help and information. The site logs more than 350,000 unique visitors per month. The team then tackled a new case management system for hundreds of technical support engineers who troubleshoot 15,000 customer cases each month.

The next challenge was akin to the story of the cobbler’s children who had no shoes. The team needed to address a very real problem for the NetApp IT organization itself: a transaction-heavy back-end app that enables the submission and tracking of on-demand requests for specific diagnostic NetApp® AutoSupport® messages. AutoSupport OnDemand had become unreliable, required frequent restarts, and was past due for an extensive upgrade—a major pain for IT operations and application support.

At the same time, the technical support engineers who relied on AutoSupport OnDemand data were frustrated with the application’s poor response time and unreliability. The application was hindering their ability to quickly troubleshoot issues, provide diagnostic advice, and perform routine health checks for customers.

The solution

The IT business applications team rewrote the AutoSupport OnDemand application using microservices running in containers on CloudOne, our internal DevOps platform. The application now processes 300,000 messages per hour from NetApp file servers installed at customer locations. If necessary, it can scale to accommodate a fourfold increase in traffic levels.
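To put those throughput figures in per-second terms, here is a quick back-of-the-envelope calculation. It uses only the two numbers quoted above (300,000 messages per hour and a fourfold increase); the variable and function names are illustrative and do not come from the application itself.

```python
# Figures quoted in the article; everything else here is illustrative.
MESSAGES_PER_HOUR = 300_000
SCALE_FACTOR = 4  # the "fourfold increase" the application can absorb

SECONDS_PER_HOUR = 3600


def per_second(messages_per_hour: int) -> float:
    """Convert an hourly message count to an average per-second rate."""
    return messages_per_hour / SECONDS_PER_HOUR


baseline_rate = per_second(MESSAGES_PER_HOUR)
peak_per_hour = MESSAGES_PER_HOUR * SCALE_FACTOR
peak_rate = per_second(peak_per_hour)

print(f"baseline: {MESSAGES_PER_HOUR:,} msg/h = {baseline_rate:.1f} msg/s")
print(f"peak:     {peak_per_hour:,} msg/h = {peak_rate:.1f} msg/s")
```

In other words, the current load averages roughly 83 messages per second, and the stated headroom corresponds to about 1.2 million messages per hour, or roughly 333 per second. These are averages only; the article does not describe how bursty the real traffic is.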

The application is composed of seven microservices that provide:

– Transfer from file server to AutoSupport OnDemand

– History processing

– History scheduler

– Email notification

– REST services (for consumption by other applications)

– API gateway

– Failure processing

The AutoSupport OnDemand app uses the autoscaling capability of CloudOne to dynamically increase and decrease the number of VMs that support the application. Scalability is important both to adapt to changing demand for support and to adjust the polling interval, which determines how frequently the application polls for new messages. The shorter the polling interval, the more often the application checks for new messages and the greater the potential impact on performance. Today the default weekend polling interval is 60 minutes, down from 360 minutes in the old system, and there has been no impact on performance. Message failures are extremely rare now that the new application is in production.
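The effect of the shorter weekend interval is easy to quantify. The sketch below uses only the two intervals mentioned above (360 and 60 minutes); the helper name is hypothetical, and it says nothing about how CloudOne or the application actually schedules polling.

```python
MINUTES_PER_DAY = 24 * 60

# Weekend polling intervals quoted in the article, in minutes.
OLD_INTERVAL_MIN = 360
NEW_INTERVAL_MIN = 60


def polls_per_day(interval_minutes: int) -> int:
    """Number of polls in one day at a fixed interval (hypothetical helper)."""
    return MINUTES_PER_DAY // interval_minutes


old_polls = polls_per_day(OLD_INTERVAL_MIN)
new_polls = polls_per_day(NEW_INTERVAL_MIN)

print(f"old system: {old_polls} polls per weekend day")
print(f"new system: {new_polls} polls per weekend day")
print(f"increase:   {new_polls // old_polls}x")
```

So the new application polls six times as often on weekends (24 times a day instead of 4), which means a new message waits at most one hour to be picked up instead of up to six.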

AutoSupport OnDemand has become the secret messaging agent between NetApp technical support engineers and the AutoSupport-enabled systems deployed worldwide. In that role, it is now robust enough to accommodate new or expanded capabilities driven by the needs of the business and NetApp product growth.
