Build Resilient, High-Performance Infrastructure at Scale
Location: Remote (Pakistan, Egypt, Uzbekistan)
Job Type: Full-time
Company: SOUM
Work Week: Sunday – Thursday
Working Hours: 9:00 AM – 6:00 PM (Saudi Arabian Time Zone)
Experience Required: 2+ Years
About the Role
SOUM is seeking a Mid-Level DevOps Engineer passionate about building performant, stable, and resilient infrastructure at scale. This is a high-impact remote role where you’ll take ownership of architecting and maintaining mission-critical infrastructure that powers our applications across multiple environments.
You’ll work in a collaborative, fast-paced environment, coordinating with cross-functional teams to ensure our systems remain highly available, secure, and optimized. If you thrive on solving complex infrastructure challenges and have a commitment to excellence, this is your opportunity to make a significant impact.
What You’ll Do
Infrastructure Architecture & Scalability
- Take full responsibility for the scalability, stability, and availability of low-latency, mission-critical systems
- Design and implement high-availability infrastructure using AWS EKS (Elastic Kubernetes Service)
- Architect solutions for disaster recovery and business continuity
- Ensure infrastructure can handle growth and traffic spikes efficiently
- Implement secure and stable infrastructure following industry best practices
CI/CD & Deployment Automation
- Enhance and maintain CI/CD pipelines using GitHub Actions
- Implement and optimize GitOps workflows using ArgoCD
- Create and maintain Helm charts for Kubernetes deployments
- Automate deployment processes to reduce manual intervention and errors
- Ensure zero-downtime deployments and rollback capabilities
Infrastructure as Code (IaC)
- Develop and maintain Infrastructure as Code using Terraform
- Create reusable, modular infrastructure components
- Manage infrastructure across multiple environments (dev, staging, production)
- Version control all infrastructure changes with proper documentation
- Ensure infrastructure is reproducible and consistent across environments
Monitoring, Alerting & Incident Management
- Set up and maintain comprehensive monitoring using New Relic, Prometheus, and Grafana
- Create meaningful alerts and dashboards for proactive issue detection
- Debug infrastructure issues and identify performance bottlenecks
- Take responsibility for 24×7 monitoring and for resolving infrastructure-tier incidents
- Investigate incidents, document root causes, and implement preventive measures
- Escalate complex issues to appropriate teams when necessary
Network & Security Management
- Manage VPNs, load balancers, and firewall configurations
- Implement security best practices and compliance requirements
- Secure production infrastructure against threats and vulnerabilities
- Conduct regular security audits and implement hardening measures
- Manage access controls and identity management
Cost Optimization & Continuous Improvement
- Drive continuous improvement initiatives in infrastructure and processes
- Optimize AWS costs through right-sizing, Reserved Instances, and efficient resource usage
- Identify opportunities for automation and efficiency gains
- Stay current with new AWS services and DevOps tools
- Share knowledge and best practices with the team
Required Qualifications
Technical Expertise
- Excellent understanding of AWS cloud services, including:
  - EC2, EKS, ECS, Lambda
  - VPC, Route 53, CloudFront
  - RDS, S3, ElastiCache
  - IAM, CloudWatch, CloudTrail
- Deep knowledge of containerization and orchestration:
  - Kubernetes architecture and management
  - Docker containerization
  - Helm package management
- Strong hands-on experience building high-availability infrastructure using EKS on AWS
- Expert-level Terraform skills for Infrastructure as Code
- Strong understanding of GitOps principles and ArgoCD implementation
- Deep knowledge of Linux internals, networking, routing protocols, and system administration
- Expertise in securing production infrastructure
Required Experience
- Minimum 2 years of production DevOps experience
- At least 1 year working extensively with Terraform
- At least 1 year using monitoring tools (New Relic, Prometheus, Grafana, or similar)
- Proven experience designing and architecting high-performing applications
- Background in Agile development practices and methodologies
- Experience managing mission-critical, 24×7 production environments
Core Competencies
- Strong problem-solving and debugging skills
- Ability to work independently and take ownership of infrastructure
- Excellent documentation and communication skills
- Proactive mindset with a focus on automation and efficiency
- Strong team player with a collaborative approach
- Commitment to quality and perfection in infrastructure design
- Ability to work effectively in remote, distributed teams
Preferred Qualifications (Nice to Have)
Certifications
- AWS Certified Solutions Architect (Associate or Professional)
- AWS Certified DevOps Engineer
- AWS Certified SysOps Administrator
- Certified Kubernetes Administrator (CKA)
- Certified Kubernetes Application Developer (CKAD)
Additional Experience
- Experience working in fast-growing startups or high-growth environments
- Background in fintech, e-commerce, or marketplace platforms
- Experience with additional tools and technologies:
  - Service mesh (Istio, Linkerd)
  - Configuration management (Ansible, Chef, Puppet)
  - Log aggregation (ELK Stack, Splunk)
  - Database management and optimization
  - Serverless architectures
- Knowledge of compliance frameworks (SOC 2, ISO 27001, PCI DSS)
- Experience with multi-cloud or hybrid cloud environments
Our Tech Stack
Cloud & Infrastructure
Cloud Platform: AWS (primary)
Orchestration: Kubernetes (EKS), Docker
IaC: Terraform
GitOps: ArgoCD
CI/CD: GitHub Actions
Package Management: Helm
Monitoring & Observability
APM: New Relic
Metrics: Prometheus, Grafana
Logging: CloudWatch, ELK Stack
Alerting: PagerDuty, Opsgenie
Security & Networking
VPN: OpenVPN, AWS VPN
Load Balancers: AWS ALB/NLB, NGINX
Firewall: AWS Security Groups, NACLs
Secrets Management: AWS Secrets Manager, HashiCorp Vault
Development Tools
Version Control: Git, GitHub
Collaboration: Slack, Jira
Documentation: Confluence, Notion
What We Offer
Work Flexibility
- 100% remote position – work from anywhere in Pakistan, Egypt, or Uzbekistan
- Flexible work environment with a focus on deliverables
- Modern tech stack and cutting-edge tools
- Collaborative, distributed team culture
Professional Growth
- Work on challenging, large-scale infrastructure problems
- Exposure to modern DevOps practices and technologies
- Opportunities to shape technical direction and architecture
- Continuous learning and skill development
- Career growth path to Senior DevOps/Infrastructure Lead roles
Compensation & Benefits
- Competitive salary based on experience and expertise
- Performance-based bonuses and incentives
- Annual salary reviews
- Work with international teams and clients
Work Environment
- Fast-growing startup culture with room for innovation
- Supportive team environment with knowledge sharing
- Impact-driven work where your contributions matter
- Direct collaboration with product and engineering teams
Working Hours Details
Time Zone: Saudi Arabian Time (GMT+3)
Schedule: Sunday to Thursday (9:00 AM – 6:00 PM KSA)
On-Call: Rotational on-call schedule for critical incidents
Flexibility: Some flexibility for personal appointments with advance notice
Note: This role requires availability during Saudi Arabian business hours. Please ensure you’re comfortable with this schedule before applying.
Ideal Candidate Profile
You’re the perfect fit if you:
- Have a passion for building robust, scalable infrastructure
- Take pride in automation, monitoring, and operational excellence
- Enjoy solving complex technical challenges
- Can work independently with minimal supervision
- Are comfortable with on-call responsibilities and incident management
- Have a strong ownership mentality and sense of accountability
- Stay current with DevOps trends and emerging technologies
- Value documentation, knowledge sharing, and team collaboration
- Thrive in fast-paced, dynamic startup environments
- Are detail-oriented, with a commitment to security and best practices
A Day in the Life
Morning (Saudi Time):
- Review overnight alerts and system health dashboards
- Attend daily standup with engineering and product teams
- Prioritize infrastructure tasks and incidents
Mid-Day:
- Implement infrastructure changes using Terraform
- Review and merge infrastructure PRs
- Enhance CI/CD pipelines and deployment automation
- Debug and resolve infrastructure issues
Afternoon:
- Set up monitoring and alerting for new services
- Optimize AWS costs and resource utilization
- Collaborate with developers on infrastructure requirements
- Document changes and update runbooks
Ongoing:
- Respond to alerts and incidents as they occur
- Participate in incident post-mortems and implement preventive measures
- Continuously improve infrastructure and processes
Key Performance Indicators (KPIs)
Your success will be measured by:
- System Uptime: Maintaining 99.9%+ availability for production systems
- Incident Response: Keeping mean time to detection (MTTD) and mean time to resolution (MTTR) low
- Deployment Frequency: Enabling rapid, safe deployments
- Infrastructure Costs: Optimizing AWS spend while maintaining performance
- Automation: Reducing manual tasks through automation
- Documentation: Maintaining comprehensive, up-to-date documentation
- Security: Zero critical security incidents
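To make the 99.9%+ availability target and the MTTR measure above concrete, here is a minimal Python sketch of the arithmetic. The function names and the sample incident durations are hypothetical and purely illustrative; they do not describe SOUM's actual tooling or incident history.

```python
from datetime import timedelta
from typing import Sequence


def downtime_budget(availability_pct: float, period: timedelta) -> timedelta:
    """Return the maximum downtime allowed in `period` at the given target."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    # The unavailable fraction of the period is the downtime budget.
    return period * ((100 - availability_pct) / 100)


def mean_minutes(durations: Sequence[timedelta]) -> float:
    """Return the mean of the given durations in minutes (e.g. for MTTR)."""
    if not durations:
        raise ValueError("at least one duration is required")
    total_seconds = sum(d.total_seconds() for d in durations)
    return total_seconds / len(durations) / 60


if __name__ == "__main__":
    # Downtime budget for a 99.9% target over a 30-day window.
    budget = downtime_budget(99.9, timedelta(days=30))
    print(f"Monthly budget: {budget.total_seconds() / 60:.1f} minutes")

    # Hypothetical resolution times for three incidents (made-up values).
    incidents = [timedelta(minutes=12), timedelta(minutes=30), timedelta(minutes=18)]
    print(f"MTTR: {mean_minutes(incidents):.1f} minutes")
```

At a 99.9% target, a 30-day window allows about 43.2 minutes of total downtime, so a single long incident can use up most of a month's budget.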
Important Information
AI-Assisted Hiring Process
We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.
Data Privacy
Your application data will be handled in accordance with applicable data protection regulations. We are committed to protecting your privacy throughout the recruitment process.
How to Apply
Ready to build world-class infrastructure for a growing platform?
Please submit:
- Updated resume/CV highlighting relevant DevOps and AWS experience
- Brief cover letter addressing:
  - Your most complex infrastructure challenge and how you solved it
  - Your experience with the key technologies in our stack (AWS, Kubernetes, Terraform, ArgoCD)
  - Why you’re interested in working with SOUM
  - Confirmation of availability during Saudi Arabian working hours
- Links to:
  - GitHub profile (if you have public infrastructure code)
  - Any relevant certifications (AWS, Kubernetes, etc.)
  - Technical blog posts or presentations (if available)
Bonus: Include examples of infrastructure you’ve designed, Terraform modules you’ve created, or monitoring dashboards you’ve built.
SOUM is an equal opportunity employer committed to building a diverse and inclusive team.
Note: Only candidates based in Pakistan, Egypt, or Uzbekistan will be considered. Candidates must be available during Saudi Arabian working hours (Sunday – Thursday, 9:00 AM – 6:00 PM KSA).
About SOUM
SOUM is a fast-growing technology company building innovative platforms that serve customers across the Middle East region. We’re committed to excellence, innovation, and creating a culture where talented engineers can do their best work while solving meaningful problems at scale.