JOB OVERVIEW
We are looking for a DevOps Engineer to play a core role in building and operating a modern, scalable DevOps and Data Platform infrastructure that supports hundreds of microservices and data workloads across hybrid-cloud (AWS and on-premises) environments.
You will:
● Develop and maintain CI/CD pipelines (GitLab CI, ArgoCD) across environments.
● Manage and scale Kubernetes clusters (EKS, RKE2) with data-oriented workloads (Airflow, NiFi, Kafka, Spark, Druid, OLTP/OLAP).
● Operate and optimize ETL/ELT data pipelines, ensuring performance, reliability, and automation.
● Collaborate closely with Data Engineering and BI teams to deliver a high-availability, secure, and observable Data Platform.
● Automate infrastructure provisioning using Terraform, Helm, and Ansible, and manage secrets with Vault.
● Monitor infrastructure health using Prometheus, Grafana, Loki, ELK, and OpenTelemetry.
KEY RESPONSIBILITIES
🛠️ DevOps & Automation:
● Design and maintain standardized CI/CD pipelines (build, scan, test, multi-env deploy).
● Implement GitOps with ArgoCD for staging, UAT, and production environments.
● Write and maintain Helm charts, YAML templating, and deployment automation.
● Automate operations using scripting (Bash, Python), cronjobs, webhooks, and CLI tools.
☁️ Cloud & On-Prem Infrastructure:
● Manage Kubernetes clusters (EKS, RKE2) and scale infrastructure using autoscalers (Karpenter, Cluster Autoscaler).
● Handle hybrid storage systems: Longhorn, EBS, Ceph, NFS, MinIO.
● Implement HA/DR strategies, including backups and restores with tools like Velero, S3 versioning, and snapshots.
● Operate across AWS and on-premises environments (bare-metal, VMware).
🧠 Big Data / Data Platform:
● Operate and tune Airflow, Apache NiFi, Kafka, Spark, Druid in production.
● Build and monitor ETL/ELT pipelines, syncing from OLTP systems (PostgreSQL, MongoDB) to OLAP systems (Redshift, ClickHouse, etc.).
● Ensure data integrity, manage latency, and troubleshoot job failures.
🔐 Security & Observability:
● Implement IAM, RBAC, security groups, TLS, cert rotation, and secrets management (Vault).
● Monitor systems with Prometheus, Grafana, Alertmanager, and Loki, and integrate alerting with Slack/webhooks.
● Collaborate with the Security team to ensure CI/CD and data pipelines are secure and compliant.
REQUIREMENTS
● 3+ years of hands-on experience in DevOps, SRE, or Infrastructure Engineering.
● Proficient with Kubernetes (EKS, RKE2), GitLab CI/CD, Helm, Terraform.
● Proven experience with Airflow, Apache NiFi, Kafka, or equivalent in production.
● Strong understanding of ETL/ELT pipelines and OLTP/OLAP architecture.
● Solid knowledge of AWS services: EKS, EC2, S3, IAM, RDS.
● Strong scripting and Linux skills (Bash, Python), with deep understanding of systems and networking.
● Automation-first mindset, with a focus on scalability, DR, and efficiency.
BENEFITS
● 13th-month salary and an annual salary review.
● Lunch support and free parking.
● Gifts for New Year, birthdays, 8/3, 20/10, ...
● Team-building activities and company trips.
● Annual health check-up.
● Full insurance after 2 months of probation; PVI insurance for some levels.
● On-the-job training and a fast-tracked career path.
● A friendly and professional working environment.