Case Study
Document Identification
File Name | AWS Systems Manager Solution for Cost Optimization and Enhanced Security |
Client Name | NEXGEN TECHTRONICS |
Version | Version 1 |
Sensitivity Classification | Company Confidential |
Document Owner | Nitin Arora |
Preparation
Action | Name | Role / Function | Date |
Prepared by: | Nitin | Cloud Engineer | 1st August 2024 |
Reviewed/Approved by: | Varun | Technical Lead | 2nd August 2024 |
Release
Version | Date Released | Change Notice | Remarks |
0.01 | | | 1st Draft |
Contribution (C) and Distribution (D) list
Name | C/D | Organization | Title |
| C & D | | |
Client name – NEXGEN TECHTRONICS
Summary:
An enterprise-level company required a managed, automated way to start and stop its Windows and Linux EC2 instances using AWS Systems Manager (SSM). The objectives were to reduce operating costs, improve operational efficiency, and strengthen security. We combined AWS Systems Manager with Amazon CloudWatch, AWS CloudTrail, and Amazon GuardDuty to schedule instance starts and stops, monitor state changes, audit access, and detect threats. The solution reduced costs by stopping instances outside business hours, enhanced security, and gave the client clearer insight into resource use. The key lesson was the value of native AWS services for automation and threat detection.
Challenges:
High Operational Costs: The client's EC2 instances were running continuously, leading to unnecessary expenses outside business hours.
Security Concerns: The client needed a secure method for accessing their instances without compromising their security posture.
Workload Pattern: The environment was a combination of Windows and Linux servers.
Approach
- 1. Understanding the Workload Pattern:
- Conduct a detailed analysis of the client's EC2 instance usage patterns.
- Identify business hours and non-business hours to determine optimal start and stop times.
- Classify instances based on their operating systems (Windows and Linux) to ensure compatibility with Systems Manager features.
- 2. Designing the Automation Workflow:
- Use AWS Systems Manager Maintenance Windows to schedule tasks that align with business hours.
- Develop AWS Systems Manager Automation documents to start and stop instances based on the predefined schedule.
- Implement Systems Manager Session Manager for secure, role-based access to instances.
- 3. Monitoring and Logging:
- Integrate Amazon CloudWatch for real-time monitoring of instance state changes and performance metrics.
- Use AWS CloudTrail to log and audit the API calls that stop and start instances and open access sessions.
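
As an illustration of how business hours can be turned into Maintenance Window schedules, the following is a minimal sketch, not the configuration used for the client. The business hours (08:00 to 18:00, Monday to Friday), the 30-minute lead time, the `UTC` time zone, the window names, and the helper functions are all assumptions made for the example. The dictionary keys follow the parameters of the Systems Manager `CreateMaintenanceWindow` API.

```python
from datetime import datetime, timedelta


def ssm_cron(hour, minute, days="MON-FRI"):
    """Return a Systems Manager cron expression.

    Field order: minutes, hours, day-of-month, month, day-of-week, year.
    """
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute 0-59")
    return f"cron({minute} {hour} ? * {days} *)"


def window_params(name, hour, minute, timezone):
    """Build keyword arguments shaped for CreateMaintenanceWindow."""
    return {
        "Name": name,
        "Schedule": ssm_cron(hour, minute),
        "ScheduleTimezone": timezone,  # IANA time zone name
        "Duration": 1,                 # window length in hours
        "Cutoff": 0,                   # hours before the end when no new tasks are scheduled
        "AllowUnassociatedTargets": False,
    }


def build_schedule(opens="08:00", closes="18:00", lead_minutes=30, timezone="UTC"):
    """Derive start and stop windows from hypothetical business hours."""
    open_at = datetime.strptime(opens, "%H:%M")
    start_at = open_at - timedelta(minutes=lead_minutes)
    stop_at = datetime.strptime(closes, "%H:%M")
    if start_at.date() != open_at.date():
        # Crossing midnight would also shift the weekday range; not handled here.
        raise ValueError("lead time must not cross midnight")
    return {
        "start": window_params("start-instances", start_at.hour, start_at.minute, timezone),
        "stop": window_params("stop-instances", stop_at.hour, stop_at.minute, timezone),
    }


if __name__ == "__main__":
    for action, params in build_schedule().items():
        print(action, params["Schedule"])
```

Each dictionary could then be passed to an SDK call such as boto3's `create_maintenance_window(**params)`. Keeping the schedule derivation separate from the API call makes it easy to review the planned times with the client before anything is created.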
Implementation
- 1. Configuring AWS Systems Manager Maintenance Windows:
- Create Maintenance Windows to define the time slots for stopping and starting EC2 instances.
- Assign specific tasks to these windows, ensuring they run during non-business hours for stopping instances and just before business hours for starting instances.
- 2. Creating Automation Documents:
- Develop Automation documents to encapsulate the logic for starting and stopping EC2 instances.
- Test these documents to ensure they work seamlessly with both Windows and Linux servers.
- 3. Setting Up Session Manager:
- Enable Systems Manager Session Manager to allow secure, role-based access to EC2 instances.
- Configure IAM roles and policies to restrict access based on user roles, ensuring only authorized personnel can access the instances.
- 4. Monitoring and Logging Setup:
- Configure Amazon CloudWatch alarms to notify the client of any anomalies or issues during the start/stop processes.
- Ensure AWS CloudTrail is capturing all relevant API calls for compliance and auditing purposes.
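
The Automation document and the role-based Session Manager access described above could look roughly like the following sketch. It is an assumption-laden illustration, not the documents deployed for the client: the function names, the `Team`/`ops` tag, and the step name are hypothetical. The runbook uses the `aws:changeInstanceState` automation action, which behaves the same for Windows and Linux instances, and the policy restricts `ssm:StartSession` to instances carrying a given tag.

```python
import json


def state_change_runbook(desired_state):
    """Build Automation document content that moves instances to desired_state."""
    if desired_state not in ("running", "stopped"):
        raise ValueError("desired_state must be 'running' or 'stopped'")
    return {
        "schemaVersion": "0.3",
        "description": f"Set EC2 instances to '{desired_state}'",
        "assumeRole": "{{ AutomationAssumeRole }}",
        "parameters": {
            "InstanceId": {"type": "StringList", "description": "Instances to change"},
            "AutomationAssumeRole": {
                "type": "String",
                "default": "",
                "description": "Role the automation assumes (optional)",
            },
        },
        "mainSteps": [
            {
                "name": "changeState",  # hypothetical step name
                "action": "aws:changeInstanceState",
                "inputs": {
                    "InstanceIds": "{{ InstanceId }}",
                    "DesiredState": desired_state,
                },
            }
        ],
    }


def session_policy(tag_key, tag_value):
    """Build an IAM policy allowing sessions only to instances with a given tag."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "ssm:StartSession",
                "Resource": "arn:aws:ec2:*:*:instance/*",
                "Condition": {"StringLike": {f"ssm:resourceTag/{tag_key}": [tag_value]}},
            },
            {
                # Lets an IAM user end or resume only their own sessions.
                "Effect": "Allow",
                "Action": ["ssm:TerminateSession", "ssm:ResumeSession"],
                "Resource": "arn:aws:ssm:*:*:session/${aws:username}-*",
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(state_change_runbook("stopped"), indent=2))
    print(json.dumps(session_policy("Team", "ops"), indent=2))
```

AWS also publishes managed runbooks (`AWS-StartEC2Instance` and `AWS-StopEC2Instance`) that cover the same start and stop operations, so a custom document is only needed when extra steps are required.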
Customer Acceptance Testing
- 1. Initial Testing Phase:
- Conduct initial testing during non-critical hours to validate the start/stop automation.
- Verify that instances start and stop as per the scheduled Maintenance Windows without manual intervention.
- 2. Security Validation:
- Test Session Manager to ensure secure access to instances, validating role-based access controls.
- Conduct security audits to verify that all access is logged in CloudTrail and that there are no unauthorized access attempts.
- 3. Performance Monitoring:
- Monitor the performance of instances during the testing phase using CloudWatch.
- Adjust the automation scripts if any performance issues or delays in instance state changes are observed.
- 4. User Acceptance Testing (UAT):
- Conduct a UAT session with the client to demonstrate the functionality and effectiveness of the solution.
- Gather feedback and make necessary adjustments based on client input.
- Ensure the client is comfortable with the maintenance window schedules, automation scripts, and security measures.
- 5. Final Deployment:
- Once UAT is successful, deploy the solution into the production environment.
- Provide the client with documentation on how to manage and adjust Maintenance Windows, Automation documents, and Session Manager settings.
- Schedule regular review meetings with the client to ensure ongoing satisfaction and to address any new requirements or issues.
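
The security validation step, checking that session access is logged and that no unauthorized principal opened a session, can be sketched as a small filter over CloudTrail event records. This is an illustrative sketch only: the sample events, ARNs, and instance IDs are made up, the function name is hypothetical, and the `requestParameters.target` field is assumed to hold the instance ID for `StartSession` events.

```python
def unauthorized_sessions(events, allowed_arns):
    """Return (principal ARN, target) pairs for sessions started by
    principals that are not in allowed_arns."""
    allowed = set(allowed_arns)
    findings = []
    for event in events:
        if event.get("eventSource") != "ssm.amazonaws.com":
            continue
        if event.get("eventName") != "StartSession":
            continue
        arn = event.get("userIdentity", {}).get("arn", "unknown")
        if arn not in allowed:
            target = (event.get("requestParameters") or {}).get("target", "unknown")
            findings.append((arn, target))
    return findings


# Hypothetical records shaped like CloudTrail events (only the fields used here).
SAMPLE_EVENTS = [
    {
        "eventSource": "ssm.amazonaws.com",
        "eventName": "StartSession",
        "userIdentity": {"arn": "arn:aws:iam::111122223333:user/alice"},
        "requestParameters": {"target": "i-0aaaaaaaaaaaaaaaa"},
    },
    {
        "eventSource": "ssm.amazonaws.com",
        "eventName": "StartSession",
        "userIdentity": {"arn": "arn:aws:iam::111122223333:user/mallory"},
        "requestParameters": {"target": "i-0bbbbbbbbbbbbbbbb"},
    },
    {
        "eventSource": "ec2.amazonaws.com",
        "eventName": "StopInstances",
        "userIdentity": {"arn": "arn:aws:iam::111122223333:user/mallory"},
        "requestParameters": {},
    },
]

if __name__ == "__main__":
    allowed = ["arn:aws:iam::111122223333:user/alice"]
    for arn, target in unauthorized_sessions(SAMPLE_EVENTS, allowed):
        print(f"unexpected session: {arn} -> {target}")
```

In practice the events would come from the CloudTrail trail or event history rather than an in-memory list, and the allow list would be derived from the IAM roles approved for Session Manager access.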
Outcome
The solution delivered cost savings by automating the starting and stopping of EC2 instances, which also reduced manual effort and administrative overhead. Security improved in two ways: passwords were managed securely in AWS Systems Manager Parameter Store, and continuous monitoring with Amazon CloudWatch and AWS CloudTrail improved threat detection and strengthened the security posture. The client also gained better visibility and control over their AWS environment, with auditing capabilities that support more effective management and decision-making.
Conclusion:
This approach and implementation plan optimized the client's operational costs by reducing unnecessary EC2 usage outside business hours. It also enhanced the security posture through secure, role-based access, supporting a robust and compliant environment.