DevOps Journey in AWS at Turkey’s leading retail company

Elman Badalov
6 min read · Jul 27, 2023


Executive Summary

The company is currently one of the leading retail companies in Turkey. It directly employs more than ten thousand people and indirectly supports as many again. It operates more than one hundred stores in Turkey and abroad and is expanding its services into more than ten countries. A multi-channel business model built on e-commerce platforms, a wholesale network, and retail operations allows it to reach millions of customers in Turkey and abroad, and its multi-channel, multi-brand structure lets it offer comprehensive services.

Customer Challenge

The company’s core technical team required a scalable and highly available infrastructure to ensure consistent quality of service, even during periods of peak usage. At the same time, the team sought a cost-effective environment that could scale while staying on current stable versions and following best practices. They also asked for a new environment with stronger security measures and fewer vulnerabilities, and for the ability to monitor application and system metrics in order to identify opportunities for service improvement. In pursuit of a more efficient solution, they decided to build a new network environment on AWS and migrate the existing AWS environment into it.

Partner Solution

Our primary goal was to transform the customer’s working environment into a highly scalable, reliable, and secure system. We achieved this by implementing a comprehensive AWS solution that incorporates the technologies and services described below.

  1. Introduction: To begin, we analyzed the architecture of the existing environment in order to design a new environment that follows best practices and is cost-effective. From this architecture we identified the application dependencies. The next stage involved installation and deployment.
  2. Amazon Elastic Kubernetes Service: In the initial phase, we replaced the existing, slower-scaling EKS setup with a new Kubernetes cluster managed by Amazon EKS, which allows us to run containerized applications more efficiently using the power and flexibility of Kubernetes. This step improved autoscaling, enabling seamless adaptation to usage spikes and dips. By tuning the Auto Scaling Group (ASG) and Horizontal Pod Autoscaler (HPA) configuration on EKS, we created a cluster that scales more dynamically and delivers the savings we were targeting. To keep costs down, we designed the cluster to shut its nodes down during unused hours and bring them back at a target hour (a minimal sketch of this scheduled scaling follows the list). Following best practices, we also isolated the environments at the cluster level, so the Test and Prod environments run independently of each other. As a result, we established a faster-scaling EKS cluster that supports a reliable high-availability structure.
  3. Amazon Virtual Private Cloud: We used Amazon VPC to create a secure and isolated virtual network, as in the previous environment, and separated the environments (Test and Prod) from each other at the VPC level. This allowed us to control inbound and outbound traffic according to predefined security policies, adding an extra layer of security. Additionally, we updated the WAF configuration to run security checks on incoming traffic, ensuring its reliability. For efficient traffic distribution and effective management of incoming application traffic, we employed AWS Application Load Balancers (ALB). By keeping the security policies on the ALBs up to date, we prevented potential security vulnerabilities on the public-facing side.
  4. AWS Service Dependencies and Details: Following the dependency list of our applications, we built the new structure around services such as Amazon OpenSearch Service, Amazon ElastiCache for Redis, Amazon RDS, and Amazon MQ to achieve a more efficient, highly available, and cost-effective solution. We also reviewed the resource usage of these services and, based on that, purchased Reserved Instances for them, making the environment even more cost-effective. In addition, we prepared a rollback scenario by taking backups/snapshots of these services; if needed, we can roll back by restoring the relevant backups.
  5. Amazon Relational Database Service: For database management, we implemented Amazon RDS, which allows the company to seamlessly operate and scale a relational database in the cloud. This simplified tasks such as hardware provisioning, database setup, patching, and backups. By keeping the snapshot mechanism up to date on RDS, we can restore data to an earlier point in time whenever necessary. We also enabled Performance Insights for the DB clusters in use, giving full visibility into database behavior; it is a tool actively used by both the customer and us (a sketch covering the snapshot and Performance Insights steps follows the list).
  6. Amazon ElastiCache: To improve data caching and retrieval efficiency, we incorporated Amazon ElastiCache for Redis into the new environment. This increased application speed and responsiveness while reducing the load on its relational databases (a cache-aside sketch follows the list).
  7. Amazon MQ: We also make active use of Amazon MQ. It gives us a dynamic, fast, and cost-effective way to absorb bursts of work through queues. We applied best practices for this service and created a highly available queue environment (a producer sketch follows the list).
  8. Amazon Elastic Compute Cloud: We employed EC2 instances for common tooling and leveraged the EKS ASG structure to easily scale EC2 machines up and down.
  9. Continuous Integration & Continuous Delivery: To automate the application deployment process, we utilized Helm, a package manager for Kubernetes, along with Jenkins and Argo CD for continuous integration and continuous delivery (CI/CD). This combination lets a single event trigger the entire CI/CD process. The Argo CD architecture simplified our work during the CD stage and streamlined application updates, minimizing downtime and manual errors (a deployment-step sketch follows the list).
  10. Cost Optimization Best-practice Details: The improvements resulted in significant differences in cost and security. With an up-to-date EKS cluster on an AWS-supported version, we achieved faster upscale/downscale cycles and optimized resource usage, allowing us to handle more traffic with fewer EC2 instances. Additionally, we carried out optimization work on the services the applications depend on, such as RDS, OpenSearch Service, and ElastiCache for Redis. As a result, we established a cost-effective structure running the most recent AWS-supported versions.
  11. Security Best-practice Details: Through our optimization efforts, we migrated to a more up-to-date version and implemented the best-practice structure recommended by AWS. This transition resulted in a more modern infrastructure while minimizing security vulnerabilities. Using AWS security services, we continually detect and address the vulnerabilities that commonly arise in such an environment. We also made full use of services such as IAM, Amazon Inspector, AWS CloudTrail, AWS Web Application Firewall, and AWS Secrets Manager, significantly strengthening the security posture of the services within the account (a Secrets Manager sketch follows the list).
  12. Additional Details: Finally, we integrated AWS’s native security tools and features to ensure that the infrastructure meets the highest cloud security standards. This includes network firewalls, access control lists, IAM roles, and encryption services, providing a multi-layered security strategy.
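
The scheduled shutdown described in step 2 can be implemented as scheduled actions on the Auto Scaling Group behind the EKS node group. The snippet below is a minimal sketch using boto3; the ASG name, region, capacities, and cron expressions are hypothetical placeholders rather than the values used in the actual environment.

```python
import boto3

# Hypothetical ASG backing the EKS node group; replace with the real name.
ASG_NAME = "eks-test-nodegroup-asg"

autoscaling = boto3.client("autoscaling", region_name="eu-west-1")

# Scale the node group down to zero outside working hours (times are UTC).
autoscaling.put_scheduled_update_group_action(
    AutoScalingGroupName=ASG_NAME,
    ScheduledActionName="scale-down-evening",
    Recurrence="0 19 * * MON-FRI",   # 19:00 every weekday
    MinSize=0,
    MaxSize=0,
    DesiredCapacity=0,
)

# Bring the nodes back before the working day starts.
autoscaling.put_scheduled_update_group_action(
    AutoScalingGroupName=ASG_NAME,
    ScheduledActionName="scale-up-morning",
    Recurrence="0 6 * * MON-FRI",    # 06:00 every weekday
    MinSize=2,
    MaxSize=6,
    DesiredCapacity=2,
)
```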
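
Steps 4 and 5 rely on snapshots for the rollback scenario and on Performance Insights for database visibility. A minimal boto3 sketch of both operations is shown below; the region, instance identifier, and retention period are hypothetical.

```python
import boto3

rds = boto3.client("rds", region_name="eu-west-1")

# Hypothetical RDS instance identifier used for illustration.
DB_INSTANCE_ID = "prod-app-db"

# Take a manual snapshot before a risky change, so a rollback is possible.
rds.create_db_snapshot(
    DBSnapshotIdentifier=f"{DB_INSTANCE_ID}-pre-release",
    DBInstanceIdentifier=DB_INSTANCE_ID,
)

# Enable Performance Insights with the free 7-day retention tier.
rds.modify_db_instance(
    DBInstanceIdentifier=DB_INSTANCE_ID,
    EnablePerformanceInsights=True,
    PerformanceInsightsRetentionPeriod=7,
    ApplyImmediately=True,
)
```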
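
Step 6 offloads reads from the relational databases to ElastiCache for Redis. The cache-aside pattern below is a generic sketch of that idea; the endpoint, key format, TTL, and the load_product_from_db callable are hypothetical.

```python
import json
import redis

# Hypothetical ElastiCache for Redis endpoint; in-transit TLS assumed enabled.
cache = redis.Redis(
    host="my-redis.xxxxxx.euw1.cache.amazonaws.com", port=6379, ssl=True
)

CACHE_TTL_SECONDS = 300  # keep cached entries for 5 minutes


def get_product(product_id: str, load_product_from_db) -> dict:
    """Cache-aside read: try Redis first, fall back to the database on a miss."""
    key = f"product:{product_id}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)

    product = load_product_from_db(product_id)  # hypothetical DB lookup
    cache.set(key, json.dumps(product), ex=CACHE_TTL_SECONDS)
    return product
```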
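
For the queueing described in step 7, a producer might look like the sketch below, assuming a RabbitMQ-engine Amazon MQ broker reached over AMQPS with the pika library; the broker host, credentials, and queue name are hypothetical.

```python
import json
import ssl
import pika

# Hypothetical Amazon MQ (RabbitMQ engine) endpoint and credentials.
BROKER_HOST = "b-1234abcd.mq.eu-west-1.amazonaws.com"
credentials = pika.PlainCredentials("app-user", "app-password")

# Amazon MQ only accepts TLS connections, so AMQPS on port 5671 is used.
params = pika.ConnectionParameters(
    host=BROKER_HOST,
    port=5671,
    credentials=credentials,
    ssl_options=pika.SSLOptions(ssl.create_default_context()),
)

connection = pika.BlockingConnection(params)
channel = connection.channel()

# Durable queue so messages survive a broker restart.
channel.queue_declare(queue="order-events", durable=True)

channel.basic_publish(
    exchange="",
    routing_key="order-events",
    body=json.dumps({"order_id": 42, "status": "created"}),
    properties=pika.BasicProperties(delivery_mode=2),  # persist the message
)
connection.close()
```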
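
Step 9’s deployment stage can be illustrated with the hedged sketch below, in which a pipeline step points an Argo CD application at a new image tag and triggers a sync via the argocd CLI; the application name and tag are hypothetical, and the actual Jenkins/Helm/Argo CD wiring may differ.

```python
import subprocess

# Hypothetical Argo CD application name and target image tag.
APP_NAME = "storefront-prod"
NEW_TAG = "1.4.2"

# Assumes the argocd CLI is installed and already logged in to the Argo CD server.
# Set the Helm image tag on the application, then sync and wait until healthy.
subprocess.run(
    ["argocd", "app", "set", APP_NAME, "--helm-set", f"image.tag={NEW_TAG}"],
    check=True,
)
subprocess.run(["argocd", "app", "sync", APP_NAME], check=True)
subprocess.run(["argocd", "app", "wait", APP_NAME, "--health"], check=True)
```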
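
Finally, step 11 mentions AWS Secrets Manager. The sketch below shows the usual way an application fetches a credential at runtime instead of hard-coding it; the secret name and its keys are hypothetical.

```python
import json
import boto3

secrets = boto3.client("secretsmanager", region_name="eu-west-1")

# Hypothetical secret holding database credentials as a JSON document.
response = secrets.get_secret_value(SecretId="prod/app/db-credentials")
credentials = json.loads(response["SecretString"])

db_user = credentials["username"]
db_password = credentials["password"]
```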

Results and Benefits

In conclusion, by integrating AWS’s native security tools and features, we have ensured that this infrastructure adheres to the highest cloud security standards. Network firewalls, access control lists, IAM roles, and encryption services together form a robust, multi-layered security strategy.

The migration to the new network infrastructure on AWS has brought significant improvements in performance, scalability, and reliability. The company can now manage large and complex workloads with greater speed and effectiveness. In addition, AWS’s management and automation tools allow it to streamline system administration, leading to increased operational efficiency and a reduced total cost of ownership (TCO).

Consequently, the transition to the new network structure on AWS has empowered the company to handle workloads efficiently and achieve its growth objectives. AWS has provided the scalability to expand business processes and services, ensure uninterrupted operations, and enhance customer satisfaction.
