Travel & Financial Services

From Crisis to Confidence: Transforming Travel Experience Delivery for 40M+ Users

We transformed a failing platform serving 40+ million active users, taking it from a 33% change failure rate and regular critical outages to 99.4% availability and P1 incidents occurring only quarterly. A comprehensive Route to Live transformation enabled Collinson to deliver reliable travel experiences.

Published on: July 24, 1999 | Last Updated: July 24, 1999 | 9 min read

PISR: Problem, Impact, Solution, Result

  • Problem: Collinson Group, an FCA regulated travel experience organisation serving 40+ million active users worldwide, faced critical challenges with the reliability, quality and speed of their core products. Their waterfall-structured delivery approach, combined with significant system stability problems, was creating regular critical outages and severely impacting service quality and reputation.

  • Business Impact: This resulted in a 33% change failure rate, 97.05% uptime (well below industry standards), P1 incidents occurring every 2 days, and multiple critical outages affecting millions of users during both releases and normal operations. The existing delivery methodology was creating substantial reputational risk for a regulated organisation serving global travel customers.

  • Our Solution: Over 6 months, ClearRoute's team of 6 Quality Cloud Engineers embedded directly within Collinson's waterfall project teams with full transparency from incumbent suppliers. We implemented comprehensive Route to Live transformation including shift-left practices, containerisation, automated testing, Infrastructure as Code, and agile methodology adoption whilst maintaining PCI DSS compliance requirements.

  • Tangible Result: The transformation delivered a 94% reduction in change failure rate (33% to 2%), increased platform availability to 99.4%, reduced P1 incidents from every 2 days to a quarterly occurrence, and achieved a 100x database performance improvement (3 seconds to 30 milliseconds per transaction, a 99% reduction in latency). This directly enabled Collinson to provide reliable travel experiences whilst more than doubling release frequency.


The Challenge

Business & Client Context

  • Primary Business Goal: Restore platform reliability and service quality for 40+ million active travel experience users whilst maintaining FCA regulatory compliance and PCI DSS requirements.
  • Pressures: Reputational damage from regular outages, regulatory scrutiny due to system instability, competitive pressure in the travel industry, and urgent need to improve customer experience whilst serving millions of global users.
  • Technology Maturity: Legacy waterfall delivery methodology with multiple incumbent suppliers, system architecture suffering from regular critical outages, and delivery practices creating significant stability risks during both releases and normal operations.

Current State Assessment: Key Pain Points

  • Critical Reliability Issues: Platform experiencing regular critical outages during both release cycles and normal operational periods, with 97.05% uptime falling well below industry standards for a service supporting 40+ million active users globally.
  • Dangerous Change Management: 33% change failure rate meant one in three changes resulted in system problems, creating massive operational risk and customer impact for a travel services platform where reliability is essential.
  • Incident Management Crisis: P1 incidents occurring every 2 days indicated systemic quality problems, with change-related incidents creating regular service disruption affecting millions of users and damaging Collinson's reputation in the competitive travel market.
  • Delivery Methodology Constraints: Waterfall-structured project approach with multiple incumbent suppliers created coordination challenges, slow response to issues, and insufficient quality gates before production deployment.
  • Regulatory Compliance Challenges: As an FCA regulated organisation requiring PCI DSS compliance, existing delivery practices struggled to maintain security and compliance standards whilst addressing urgent stability issues.

Baseline Metrics (Where Available)

| Metric Category | Baseline | Notes |
| --- | --- | --- |
| Change Failure Rate | 33% | One in three changes caused system problems |
| Platform Uptime | 97.05% | Below industry standards for travel services |
| P1 Incident Frequency | Every 2 days | Critical incidents affecting millions of users |
| Database Query Performance | 3 seconds per transaction | Severe performance bottlenecks |
| Release Methodology | Waterfall | Multiple supplier coordination challenges |
| Regulatory Status | At Risk | PCI DSS compliance under pressure |
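
To put the baseline figures in concrete terms, the short sketch below converts an uptime percentage into expected hours of downtime per year and an incident interval into incidents per year. It is an illustrative calculation only: the function names are ours, and it assumes a 365-day year and evenly spread incidents, which the source does not state.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours; assumes a non-leap year


def downtime_hours_per_year(uptime_percent: float) -> float:
    """Convert an uptime percentage into expected hours of downtime per year."""
    return (1 - uptime_percent / 100) * HOURS_PER_YEAR


def incidents_per_year(days_between_incidents: float) -> float:
    """Convert an average interval between incidents into incidents per year."""
    return 365 / days_between_incidents


# Baseline uptime from the table above, and the 99.4% figure reported after the engagement.
print(f"97.05% uptime: {downtime_hours_per_year(97.05):.1f} hours of downtime per year")
print(f"99.4% uptime:  {downtime_hours_per_year(99.4):.1f} hours of downtime per year")
print(f"A P1 every 2 days: {incidents_per_year(2):.1f} incidents per year")
```

Under these assumptions, 97.05% uptime corresponds to roughly 258 hours of downtime a year, against roughly 53 hours at 99.4%.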

Solution Overview

Engagement Strategy & Phases

  • Phase 1: Embedded Review & Assessment: ClearRoute team embedded directly within existing waterfall project teams with full transparency from all incumbent suppliers. Conducted comprehensive Route to Live mapping during ongoing critical outages to understand systemic delivery and stability issues.
  • Time to First Value: Delivered initial stability improvements and shift-left implementation within first 8 weeks, demonstrating measurable reduction in change-related incidents.
  • Phase 2: Agile Transformation & Technical Foundation: Implemented organisational change from waterfall to agile methodology with appropriate team restructuring. Introduced containerisation, local development capabilities, and automated security/compliance checking to support shift-left practices.
  • Phase 3: Platform Engineering & Automation: Developed comprehensive test automation, Infrastructure as Code implementation, and automated deployment pipelines from development to release whilst maintaining PCI DSS compliance requirements.
  • Phase 4: Operational Excellence & Monitoring: Redesigned incident management, on-call systems, monitoring optimisation, and database performance improvements to ensure sustained reliability for 40+ million users.

Architectural Overview

Key Technical Transformations

Shift-Left Implementation with Regulatory Compliance

We created a comprehensive shift-left approach enabling engineering teams to deploy their own changes whilst maintaining FCA regulatory requirements and PCI DSS compliance:

  • Automated Security Gates: Integrated security, compliance and vulnerability checking directly into deployment pipelines
  • PCI DSS Compliance Automation: Developed delivery mechanisms working within QSA controls and regulatory requirements
  • Quality Assurance Embedding: Moved quality checks earlier in the development cycle to prevent production issues
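
The gating idea above can be sketched in a few lines. This is a minimal illustration, not the tooling used in the engagement: the check names, the shape of the `change` dictionary, and the `may_deploy` rule are all hypothetical, and a real pipeline would call actual scanners and test runners rather than read precomputed counts.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class GateResult:
    name: str
    passed: bool
    detail: str = ""


# A check inspects a (hypothetical) summary of a change and returns a result.
Check = Callable[[Dict], GateResult]


def vulnerability_scan(change: Dict) -> GateResult:
    count = change.get("critical_vulnerabilities", 0)
    return GateResult("vulnerability-scan", count == 0, f"{count} critical findings")


def automated_tests(change: Dict) -> GateResult:
    failed = change.get("failed_tests", 0)
    return GateResult("automated-tests", failed == 0, f"{failed} failed tests")


def secret_scan(change: Dict) -> GateResult:
    found = change.get("secrets_detected", False)
    return GateResult("secret-scan", not found, "secrets found" if found else "clean")


CHECKS: List[Check] = [vulnerability_scan, automated_tests, secret_scan]


def run_gates(change: Dict, checks: List[Check]) -> List[GateResult]:
    """Run every check so the developer sees all failures at once."""
    return [check(change) for check in checks]


def may_deploy(results: List[GateResult]) -> bool:
    """A change may proceed only if every gate passed."""
    return all(result.passed for result in results)


# Example: a change with one critical vulnerability is blocked before release.
results = run_gates({"critical_vulnerabilities": 1, "failed_tests": 0}, CHECKS)
for result in results:
    print(f"{result.name}: {'PASS' if result.passed else 'FAIL'} ({result.detail})")
print("Deploy allowed:", may_deploy(results))
```

Running all checks rather than stopping at the first failure is a deliberate choice in this sketch: it gives the engineer a complete picture in a single pipeline run.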

Containerisation and Local Development Revolution

Eliminated environment dependencies that were creating deployment bottlenecks and testing challenges:

  • Docker and Docker Compose Integration: Enabled complete local development and testing environments
  • Environment Independence: Removed previous dependencies on shared development and test environments
  • Rapid Feedback Loops: Developers could test changes locally before committing to shared pipelines

Infrastructure as Code and Team Independence

Removed silos and dependencies that were slowing delivery and creating coordination challenges:

  • IaC Implementation: Teams could build, maintain and deploy application-specific infrastructure changes
  • Dependency Elimination: Removed key team dependencies to enable total ownership of changes
  • Automated Pipeline Creation: Full automation from development through to release deployment

Database Performance Transformation

Achieved a 100x performance improvement through systematic optimisation:

  • Query Optimisation: Reduced transaction processing from 3 seconds to 30 milliseconds
  • Database Architecture Review: Systematic performance analysis and improvement implementation
  • Capacity Planning: Ensured performance improvements could scale with 40+ million user base
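
The arithmetic behind the move from 3 seconds to 30 milliseconds per transaction can be checked with a small sketch. The function names are ours; the only inputs are the two latency figures quoted above.

```python
def speedup_factor(before_ms: float, after_ms: float) -> float:
    """How many times faster the operation is after optimisation."""
    return before_ms / after_ms


def latency_reduction_percent(before_ms: float, after_ms: float) -> float:
    """Percentage of the original latency that was removed."""
    return (before_ms - after_ms) * 100 / before_ms


BEFORE_MS = 3000  # 3 seconds per transaction
AFTER_MS = 30     # 30 milliseconds per transaction

print(f"Speedup: {speedup_factor(BEFORE_MS, AFTER_MS):.0f}x")
print(f"Latency reduction: {latency_reduction_percent(BEFORE_MS, AFTER_MS):.0f}%")
```

Expressed either way, each transaction takes one hundredth of the time it took before: a 100x speedup, or a 99% reduction in latency.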

QCE Disciplines Applied

  • Platform Engineering: Delivered comprehensive platform transformation enabling self-service deployment capabilities, Infrastructure as Code implementation, and automated pipeline creation from development to production. The containerisation approach with Docker Compose eliminated environment dependencies whilst maintaining PCI DSS compliance, enabling teams to own complete change lifecycle for improved reliability.

  • Quality Engineering: Implemented shift-left quality practices with automated security, compliance and vulnerability checking embedded directly into deployment pipelines. The comprehensive test automation framework and redesigned test environment infrastructure enabled proactive quality assurance, achieving 94% reduction in change failure rates through early detection and prevention.

  • Developer Experience: Transformed development workflow from environment-dependent waterfall processes to self-service local development with immediate feedback. The combination of containerisation, automated testing, and Infrastructure as Code eliminated manual dependencies and coordination bottlenecks, enabling engineering teams to deploy changes confidently whilst maintaining regulatory compliance.


The Results: Measurable & Stakeholder-Centric Impact

Headline Success Metrics

| Metric | Before Engagement | After Engagement | Improvement |
| --- | --- | --- | --- |
| Change Failure Rate | 33% | 2% | 94% Reduction |
| Platform Uptime | 97.05% | 99.4% | 2.35 Percentage Point Improvement |
| P1 Incident Frequency | Every 2 days | Quarterly | 95%+ Reduction |
| Change-Related Incidents | Baseline | 75% Reduction | Major Risk Mitigation |
| Release Frequency | Baseline | More than doubled | 100%+ Increase |
| Mean Time to Resolution | Baseline | 50% Reduction | Faster Recovery |
| Database Query Performance | 3 seconds | 30 milliseconds | 100x Faster (99% Lower Latency) |
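
The headline reductions can be reproduced from the before and after values. The sketch below is illustrative: the helper name is ours, and it assumes that "quarterly" means one P1 incident per quarter and that a quarter is 365 / 4 days, neither of which the source states precisely.

```python
def relative_reduction_percent(before: float, after: float) -> float:
    """Percentage by which a value fell relative to its starting point."""
    return (before - after) * 100 / before


# Change failure rate: 33% of changes failing, down to 2%.
cfr_reduction = relative_reduction_percent(33, 2)

# P1 incidents: one every 2 days versus (assumed) one per quarter.
DAYS_PER_QUARTER = 365 / 4
p1_per_quarter_before = DAYS_PER_QUARTER / 2
p1_per_quarter_after = 1
p1_reduction = relative_reduction_percent(p1_per_quarter_before, p1_per_quarter_after)

# Uptime: the difference between two percentages is measured in percentage points.
uptime_gain_points = 99.4 - 97.05

print(f"Change failure rate reduction: {cfr_reduction:.1f}%")
print(f"P1 incident reduction: {p1_reduction:.1f}%")
print(f"Uptime gain: {uptime_gain_points:.2f} percentage points")
```

Under these assumptions the change failure rate falls by about 94% and P1 incidents by about 98%, consistent with the "95%+" figure quoted.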

Value Delivered by Stakeholder

  • For the CTO / IT Leadership:

    • Eliminated reputational risk from regular critical outages affecting 40+ million users through 99.4% availability achievement (risk_mitigation: "Platform stability for global user base")
    • Achieved regulatory compliance whilst accelerating delivery through automated PCI DSS checking and FCA-compliant deployment practices (compliance_acceleration: "Automated regulatory compliance")
    • Enabled strategic business growth through reliable platform foundation supporting travel experience innovation (business_enablement: "Platform ready for growth")
  • For Engineering Teams & Incumbent Suppliers:

    • Transformed working practices from coordination-heavy waterfall to autonomous agile delivery with embedded quality and compliance checking (methodology_transformation: "Waterfall to agile with compliance")
    • Eliminated environment dependencies through containerisation and local development, enabling rapid feature development and testing (development_acceleration: "Self-service development environment")
    • Provided comprehensive training and enablement ensuring incumbent teams could sustain improvements without ongoing consultancy dependency (capability_transfer: "Embedded skills and practices")
  • For Operations & Support Teams:

    • Reduced operational burden through 95%+ reduction in P1 incidents and 75% decrease in change-related incidents (operational_efficiency: "Dramatic incident reduction")
    • Improved incident resolution through redesigned on-call system routing to appropriate teams and enhanced monitoring capabilities (response_improvement: "50% faster resolution")
    • Established proactive monitoring and remediation capabilities preventing issues before user impact (preventive_operations: "Proactive issue prevention")
  • For 40+ Million Travel Experience Users:

    • Delivered a reliable platform experience with 99.4% availability and a 100x improvement in database transaction performance (user_experience: "Consistent, fast travel services")
    • Eliminated regular service disruptions that were previously occurring every 2 days through P1 incidents (service_reliability: "Uninterrupted travel experience")
    • Enabled faster feature delivery through doubled release frequency whilst maintaining stability (innovation_velocity: "Faster travel experience improvements")

Organisational Transformation Evidence

  • Agile Methodology Adoption: Successfully transitioned entire delivery organisation from waterfall to lightweight scrum process with quarterly planning and OKRs
  • Cultural Change Management: Embedded QCE practices within incumbent supplier teams through collaborative workshops and hands-on training
  • Regulatory Integration: Demonstrated that agile practices and regulatory compliance (FCA, PCI DSS) are complementary rather than conflicting
  • Sustained Improvement: Established a foundation for ongoing platform evolution with an expanded innovation platform programme

Lessons, Patterns & Future State

  • What Worked Well: The embedded approach with full transparency from incumbent suppliers proved essential for understanding and transforming entrenched delivery practices. Our enablement methodology successfully transferred capabilities to existing teams rather than creating dependency, ensuring sustainable improvement. The combination of organisational change (waterfall to agile) with technical transformation (containerisation, IaC, automation) addressed both process and technology barriers simultaneously.

  • Challenges Overcome: Working within strict regulatory constraints (FCA, PCI DSS) whilst implementing rapid transformation required careful balance of compliance and agility. The on-site engagement model was crucial for addressing cultural and process challenges alongside technical activities. Managing multiple incumbent suppliers during transformation required diplomatic coordination and clear value demonstration to maintain cooperation.

  • Key Takeaway for Similar Engagements: Platform reliability transformation in regulated environments requires simultaneous organisational and technical change - neither alone is sufficient. The embedded approach with existing supplier collaboration is more effective than replacement, particularly when regulatory continuity is essential. Database performance can often be the hidden bottleneck affecting user experience even when application layers appear functional.

  • Replicable Assets Created:

    • Shift-Left Framework for Regulated Environments: Automated security and compliance checking compatible with PCI DSS and FCA requirements
    • Containerised Local Development Pattern: Docker Compose setup eliminating environment dependencies for complex platforms
    • IaC Templates for Team Independence: Infrastructure as Code patterns enabling application teams to own complete change lifecycle
    • Agile Transformation Playbook: Methodology for transitioning waterfall organisations to agile delivery with regulatory compliance
    • Real-Time Route to Live Dashboard: IP developed for measuring SDLC gates and identifying blockers before impact
  • Client's Future State / Next Steps: With platform stability achieved and reliable delivery practices embedded, Collinson has expanded ClearRoute's involvement into an additional innovation platform programme. This next phase focuses on evolving their cloud platform to provide self-service capabilities and further developer experience improvements, building on the solid foundation of reliability and compliance established during the initial transformation.

Regulatory Environment Learnings

  • Compliance as Enabler: Automated compliance checking actually accelerated delivery by removing manual verification bottlenecks
  • Security Integration: Embedding security practices into pipelines improved both compliance and developer experience
  • Audit Trail Automation: Automated documentation and audit trails reduced regulatory overhead whilst improving visibility

Multi-Supplier Coordination Insights

  • Collaborative Transformation: Working with incumbent suppliers rather than replacing them enabled faster knowledge transfer and reduced resistance
  • Transparency Requirements: Full access and transparency from all suppliers was essential for understanding systemic issues
  • Enablement Over Replacement: Our approach of embedding capabilities within existing teams proved more sustainable than creating new dependencies

This engagement demonstrated that even crisis-level platform reliability issues can be transformed rapidly when organisational change management and technical excellence are applied together with appropriate regulatory consideration and stakeholder collaboration.