SRE/DevOps Engineer for Trading Platform | Remote Position

Remote
Full-time
Our fintech organization seeks an accomplished SRE/DevOps Engineer to optimize, maintain, and scale our sophisticated cloud-native trading platform. You'll harness cutting-edge technologies (AWS, Kubernetes, Terraform, and modern CI/CD pipelines) to ensure exceptional system reliability, security, and performance in a fast-paced trading environment.

Key Responsibilities:
- Monitor production trading systems comprehensively, diagnosing performance bottlenecks through log analysis, metric evaluation, and application behavior patterns.
- Conduct thorough incident resolution with detailed root cause analysis, comprehensive documentation, and cross-functional team collaboration.
- Design and implement robust monitoring solutions utilizing industry-leading observability tools (ELK, Grafana, Dynatrace).
- Orchestrate build, release, and configuration management processes using infrastructure-as-code methodologies.
- Architect, deploy, and maintain highly-available AWS cloud infrastructure with emphasis on scalability, security, and performance optimization.
- Configure and manage Kubernetes clusters for containerized microservices architecture, ensuring proper resource allocation and scaling.
- Administer development, QA, and production environments with consistent configuration management.
- Automate infrastructure provisioning through Terraform and Terragrunt, applying best practices for reproducible environments.
- Collaborate with development teams to create efficient CI/CD pipelines that enable rapid, reliable deployments.
- Implement security best practices across cloud infrastructure and application delivery pipeline.

Required Skills and Experience:
- 5+ years of progressive experience in SRE, DevOps, or Cloud Infrastructure roles.
- Advanced Linux/Unix system administration capabilities with performance tuning expertise.
- Demonstrated proficiency with containerization technologies, particularly Docker and Kubernetes (version 1.26+).
- Mastery of CI/CD tools including Jenkins, GitLab CI/CD, or GitHub Actions.
- Expert-level understanding of infrastructure automation using Terraform, Terragrunt, Ansible, or equivalent IaC tools.
- Comprehensive knowledge of AWS cloud services ecosystem (EC2, EKS, RDS, S3, CloudWatch, IAM).
- Strong background implementing monitoring solutions with the ELK stack, Prometheus, Grafana, and Dynatrace.
- Proficient Git version control usage, including branching strategies and workflow optimization.
- Solid SQL fundamentals and hands-on experience with PostgreSQL or similar relational databases.
- Working knowledge of message broker systems, particularly Apache Kafka.
- Understanding of HTTP networking stack and web server architecture (NGINX).

Nice to Have:
- Experience supporting financial trading or high-frequency systems.
- HashiCorp Vault implementation for secrets management and secure configuration.
- Background with high-availability, low-latency architectures in regulated environments.
- Expertise with distributed tracing solutions (Jaeger, Zipkin, OpenTelemetry).
- Experience with log aggregation and analysis at enterprise scale.
- Network security optimization and threat mitigation strategies.
- Familiarity with Python or Go for automation scripting.
- Previous work with real-time data processing systems and event-driven architectures.

Why Join Our Team:
Working with us means becoming part of an innovative fintech organization that values technical excellence and continuous improvement. You'll tackle complex infrastructure challenges in a high-stakes trading environment while leveraging modern cloud-native technologies. We offer competitive compensation, flexible remote work arrangements, and a collaborative culture focused on professional advancement.
Your contributions will directly affect the reliability and performance of mission-critical financial systems used by traders worldwide, giving you tangible results and significant opportunities for professional growth.