SRE Monthly Guide For 2024 Cyber Five Readiness
In 2023, online sales for the Cyber 5 rose to $38B, a 7.8% increase over the previous year. Interest rates were down to 3.1%, around half of what they were in previous years. Trends are moving away from pandemic-era patterns: curbside pickup is decreasing, and mobile shopping is on the rise again after work-from-home (WFH) had shifted traffic toward desktop.
The holidays are a critical time for revenue gains. A 2023 Constant Contact study indicated that 50% of US SMBs report holiday sales make up at least 25% of their annual revenue. The season is even more significant for retail SMBs, which are 2x as likely to rely on the holidays for at least 50% of their yearly revenue. Overall, US retail companies make about 13% of their annual revenue over the holidays, per Statista 2022 numbers.
Sophisticated cyber threats rise
The other key issue is that traffic can be 10x typical levels on key sales days like Black Friday. Unfortunately, not all site visitors are interested prospects or paying customers. Some are bots, both benign and malicious. Akamai recorded 5 trillion bot requests over 15 months (Jan 1 2022 - March 31 2023) for commerce companies, and malicious bot activity rose in the holiday period.
Charting a year-long readiness plan
eCommerce infrastructure teams face tough challenges in providing reliable, smooth experiences during sales spikes. A continuous 12-month plan can help systems stay resilient despite rising complexity. It can also improve the customer journey, drive better conversion rates and revenue, and reduce stress on the infrastructure and the teams that keep everything up and running.
Key lessons learned
Some teams may test only shortly before the event due to resource constraints, but that can create a lot of pressure during the actual event. Back Market, which indicated it experiences about 5x the traffic on Black Friday, provided a candid review of its preparation approach and improvements from year to year.
Sometimes the holiday event itself becomes a “surprise” trial that exposes pitfalls in real time, and a new set of best practices or site changes results. Lowe’s experienced downtime during Black Friday in 2018 and revamped its site infrastructure to scale, which ended up being great preparation for the pandemic.
Moving forward
A year-round focus that combines ongoing performance testing, layered defense development, and modernization tailored to customer experience and traffic surges promises smoother sales seasons ahead. Let’s take a quick look at each month detailed in the guide.
January - Explore personalization
Reviewing holiday personalization performance identifies gaps where recommendations failed to align with purchase interests. Legacy algorithms often mismatch anonymous high-value visitors, wasting opportunities to build loyalty. Updated approaches dynamically serve relevant content without being invasive, while accounting for infrastructure load tradeoffs.
February - Scale visitor identification
Inspecting traffic reports reveals peaks where unknown visitors overwhelm defenses even though they offer sales potential. Smart fingerprinting securely recognizes site history and habits, enabling tailored experiences that boost conversion instead of relying on excessive bot filtering. Controlled testing prevents performance degradation.
Managing rising visitor volumes across unknown users requires securely distinguishing high-intent traffic for conversion experiences while filtering excessive bots straining infrastructure. Success requires precision tracking that builds loyalty without latency tradeoffs. Prioritizing user journeys guides decisions maximizing targeting relevancy under peak loads.
March - Speed up product pages
Slow page loads directly curb revenue as conversion falters at final steps. Diagnosing checkout and key category landing speeds discovers common bottlenecks like images. Refining caching, tags, and prerendering targets the highest visibility pages first while monitoring platform overhead.
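One way to decide which pages to tackle first is to rank them by tail latency rather than by average load time. The sketch below assumes you already collect per-page load times in milliseconds; the page paths and numbers are made-up sample data, and `rank_slowest_pages` is a hypothetical helper name, not part of any tool mentioned in this guide.

```python
from statistics import quantiles


def p95(samples_ms):
    """Return the 95th percentile of a list of load times (milliseconds)."""
    # quantiles(n=20) yields 19 cut points; the last one is the 95th percentile.
    return quantiles(samples_ms, n=20, method="inclusive")[-1]


def rank_slowest_pages(timings_by_page):
    """Sort pages from slowest to fastest by their p95 load time."""
    scored = {page: p95(samples) for page, samples in timings_by_page.items()}
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sample data: page load times in milliseconds.
timings = {
    "/checkout": [900, 1200, 1500, 4100, 1300],
    "/category/shoes": [700, 800, 750, 900, 820],
    "/product/123": [1100, 1000, 2600, 1250, 1150],
}

ranking = rank_slowest_pages(timings)
for page, latency in ranking:
    print(f"{page}: p95 = {latency:.0f} ms")
```

Ranking by p95 surfaces pages where a minority of visitors hit very slow loads, which an average would hide.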
April - Evaluate search
Reviewing peak holiday traffic reveals search weaknesses impacting revenue. Slow queries inflate infrastructure load while outdated indexing algorithms miss commercial intent. By assessing abandonments tied to poor discovery and diagnosing backend bottlenecks, SREs can strategically improve relevancy, speed, and scaling efficiency before summer.
May - Tighten technical SEO fundamentals
With product teams chasing content gains, SREs must validate that their site infrastructure can serve fresh pages rapidly for indexing. Reviewing technical limitations around load speeds, proper crawling allowances, and index coverage sets the foundation for driving organic readership and the associated revenue lifts through better-built architecture.
June - Optimize site-wide performance by platform
After improving product pages and SEO, holistically assess performance across devices using real user data. Prioritize mobile page speed first, working through CDN, caching, APIs, and other layers in order of expected impact for maximum gains.
July - Pressure test and enhance bot defenses
Simulate expected peak holiday traffic conditions to reveal infrastructure delivery capacity and journey stability risks before visitor volumes overwhelm systems. Review bot defense gap analysis, comparing projected infiltration against mitigation capacities to determine where additional protections must be deployed to avoid “fake” traffic issues. Locking down infrastructure through bot defense improvements secures reliability. Getting ahead of scaling demands requires models guiding sufficient provisioning margins.
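The gap analysis and provisioning margin described above come down to simple arithmetic, sketched below. Everything here is an illustrative assumption: the `PeakPlan` fields, the 30% default headroom, and the sample figures are hypothetical, and the model treats any bot traffic beyond mitigation capacity as load that reaches the origin.

```python
from dataclasses import dataclass


@dataclass
class PeakPlan:
    baseline_rps: float        # typical requests per second
    peak_multiplier: float     # e.g. 10 for a 10x Black Friday spike
    bot_share: float           # fraction of peak traffic expected to be bots
    bot_mitigation_rps: float  # bot traffic the defenses can absorb
    provisioned_rps: float     # capacity currently provisioned
    headroom: float = 0.3      # assumed safety margin on top of projected load


def bot_gap_rps(plan):
    """Projected bot traffic that exceeds mitigation capacity."""
    projected_bots = plan.baseline_rps * plan.peak_multiplier * plan.bot_share
    return max(0.0, projected_bots - plan.bot_mitigation_rps)


def required_rps(plan):
    """Capacity needed for human traffic plus unmitigated bots, with headroom."""
    peak = plan.baseline_rps * plan.peak_multiplier
    human = peak * (1 - plan.bot_share)
    return (human + bot_gap_rps(plan)) * (1 + plan.headroom)


plan = PeakPlan(
    baseline_rps=500,
    peak_multiplier=10,
    bot_share=0.4,
    bot_mitigation_rps=1500,
    provisioned_rps=4000,
)
shortfall = max(0.0, required_rps(plan) - plan.provisioned_rps)
print(f"bot gap: {bot_gap_rps(plan):.0f} rps, shortfall: {shortfall:.0f} rps")
```

A non-zero bot gap suggests adding protections; a non-zero shortfall suggests adding capacity or widening autoscaling limits.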
August - Plan out load and stress testing
It is time to gear up for load and stress testing with proactive traffic modeling, isolated test environments, and gradual load scaling to ensure peak demand readiness. This includes setting up user loads, real-world scenarios, and monitoring performance for production readiness during high-demand times.
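Gradual load scaling can be planned as a stepped ramp before any test tool is configured. The following is a minimal sketch, assuming equal-sized steps of fixed duration; `ramp_schedule` and its parameters are hypothetical names chosen for illustration.

```python
def ramp_schedule(target_users, steps, step_minutes):
    """Return (start_minute, concurrent_users) pairs for a stepped ramp-up."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    schedule = []
    for i in range(1, steps + 1):
        # Each step adds an equal share of the target load.
        users = round(target_users * i / steps)
        schedule.append(((i - 1) * step_minutes, users))
    return schedule


schedule = ramp_schedule(target_users=1000, steps=4, step_minutes=10)
for start, users in schedule:
    print(f"minute {start:>3}: {users} users")
```

Holding each step long enough for autoscaling and caches to settle makes it easier to attribute a regression to a specific load level.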
September - Build out stress testing
Prepare for high-demand scenarios by building replicated test environments that mirror production databases, caching, third-party services, and user flows. Ensure scalability through configured rules and alerts for robust load testing while isolating the test environments. Use canary deployments as a lower cost testing option.
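A canary deployment sends a small, stable slice of traffic to the new version while the rest stays on the current one. The sketch below shows one common way to do this with deterministic hashing; it is an illustration only, and `bucket_for`, `route`, and the visitor keys are hypothetical rather than part of any specific load balancer or product.

```python
import hashlib


def bucket_for(request_key):
    """Map a key (for example a session ID) to a stable bucket from 0 to 99."""
    digest = hashlib.sha256(request_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def route(request_key, canary_percent):
    """Return "canary" for roughly canary_percent of keys, else "stable"."""
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    return "canary" if bucket_for(request_key) < canary_percent else "stable"


# Hypothetical visitor keys; the same key always lands on the same version.
keys = [f"visitor-{i}" for i in range(10_000)]
canary_count = sum(route(key, 10) == "canary" for key in keys)
print(f"{canary_count} of {len(keys)} visitors routed to the canary")
```

Hashing on a visitor or session key, rather than choosing randomly per request, keeps each visitor on one version for the whole journey.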
October - Blitz scale testing and break points
Increasing load test durations with real-world traffic models surfaces infrastructure stability limits and automatic scaling policy defects sooner, leaving more time for remediation. Pushing past projections stress-tests fallbacks before visitors ever experience disruptions, even under extremes.
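Once a stepped test has run, the break point can be read off the results by finding the first load level that violates your targets. This is a minimal sketch under assumed thresholds; the result fields (`rps`, `p95_ms`, `error_rate`) and the sample numbers are hypothetical.

```python
def find_break_point(steps, max_p95_ms, max_error_rate):
    """Return (last_passing_rps, first_failing_rps) from stepped test results.

    Either value is None if no step passed or no step failed.
    """
    last_ok = None
    for step in sorted(steps, key=lambda s: s["rps"]):
        failed = step["p95_ms"] > max_p95_ms or step["error_rate"] > max_error_rate
        if failed:
            return last_ok, step["rps"]
        last_ok = step["rps"]
    return last_ok, None


# Hypothetical results from a stepped load test.
results = [
    {"rps": 1000, "p95_ms": 180, "error_rate": 0.001},
    {"rps": 2000, "p95_ms": 240, "error_rate": 0.002},
    {"rps": 4000, "p95_ms": 410, "error_rate": 0.004},
    {"rps": 8000, "p95_ms": 1900, "error_rate": 0.08},
]

last_ok, first_fail = find_break_point(results, max_p95_ms=500, max_error_rate=0.01)
print(f"stable up to {last_ok} rps, breaks at {first_fail} rps")
```

If the first failing level sits below projected peak traffic, that gap is the remediation target for the remaining weeks.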
November - Execute final verification tests
Focus on final verification tests early in the month to guarantee reliability under projected loads and optimize monitoring and alerting systems for peak season readiness, ensuring a seamless observability experience without disrupting live production.
December - Construct tactical reliability roadmaps
Grounded in traffic projections, load test learnings, and next-generation infrastructure appetite, tactical reliability roadmaps set data-driven priorities balancing visitor experience targets with operational scaling imperatives. Formalizing 12-month plans unifies teams against common delivery milestones.
Get ready for 2024 with the Cyber 5 guide!
Download "A 12-Month SRE Guide to Cyber Five 2024" for monthly instructions on planning for holiday season traffic and sales opportunities. Starting early lets you focus on key areas, from personalization to traffic management and load testing your environment.
AI-powered services to improve performance and uptime even at Black Friday scale
Overhauling existing systems can take months or even years and may require significant resources that are not readily available. At the same time, customer expectations continue to increase, and some changes may be necessary to stay ahead of the competition. With Macrometa, you can implement PhotonIQ services in 60 days or less without changing code or your existing systems.
- PhotonIQ Performance Proxy (P3): This intelligent caching proxy dramatically accelerates website speed - boosting HTML, CSS, and JS delivery by up to 300% - without any loss in quality.
- PhotonIQ Edge Side Tagging (EST): This innovative solution optimizes JavaScript by consolidating all tags into simplified edge-side code, significantly speeding up script loading and execution.
- PhotonIQ Virtual Waiting Rooms: This capability effectively manages sudden traffic spikes by queuing incoming requests, ensuring fast, reliable performance even when sites see dramatic surges in visitors.
- PhotonIQ Dynamic Prerendering: This service prerenders and serves full web pages at the edge to users and crawlers alike, slashing latency and boosting SEO.
- PhotonIQ Digital Fingerprinting: This privacy-first service enables more contextual site experiences by recognizing visitors across devices without requiring invasive tracking across sites or sign-ins.
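To make the waiting room idea in the list above concrete, here is a generic first-in, first-out sketch of queuing visitors once a concurrency cap is reached. It is not PhotonIQ's implementation or API; the `WaitingRoom` class, its methods, and the visitor IDs are hypothetical and exist only to illustrate the queuing concept.

```python
from collections import deque


class WaitingRoom:
    """FIFO waiting room that caps the number of concurrently active visitors."""

    def __init__(self, max_active):
        if max_active < 1:
            raise ValueError("max_active must be at least 1")
        self.max_active = max_active
        self.active = set()
        self.queue = deque()

    def arrive(self, visitor_id):
        """Admit the visitor (returns 0) or enqueue them (returns 1-based position)."""
        if len(self.active) < self.max_active:
            self.active.add(visitor_id)
            return 0
        self.queue.append(visitor_id)
        return len(self.queue)

    def leave(self, visitor_id):
        """Free a slot and admit the longest-waiting visitor, if any."""
        self.active.discard(visitor_id)
        if self.queue and len(self.active) < self.max_active:
            admitted = self.queue.popleft()
            self.active.add(admitted)
            return admitted
        return None


room = WaitingRoom(max_active=2)
positions = [room.arrive(v) for v in ["a", "b", "c", "d"]]
next_in = room.leave("a")
print(positions, next_in)
```

A production waiting room would also need session expiry, fairness across regions, and persistence, all of which this sketch leaves out.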
To learn more about PhotonIQ services and how they can meet SREs’ unique goals, be sure to chat with an Enterprise Solutions Architect.
First photo by Unsplash+ in collaboration with Allison Saeng.