What you will be doing:
-
Design, develop, and maintain Azure DevOps pipelines and build processes
-
Bring Azure-specific skills, with proficiency in Azure CLI
-
Manage cloud-native tech stacks with Kubernetes and container orchestration
-
Drive monitoring and observability standards
-
Lead automation testing frameworks, integrated into the CI/CD process
-
Operate and maintain cloud-native tech stacks such as Azure Kubernetes Service (AKS) and KEDA, with advanced proficiency in Helm for automating Kubernetes deployments.
-
Work with Azure Monitor, including setting up alerts, and use APM tooling such as Datadog and/or Dynatrace for observability and performance monitoring.
-
Apply expertise in Azure DevOps to manage pipelines and build processes.
-
Work with data pipelines and troubleshoot them effectively.
-
Use SonarQube and lint tools for code quality and security scanning (prior experience required).
-
Lead unit, integration, and load testing practices, with tools like SonarQube and linters integrated into the CI/CD process.
-
Proficiency in Azure CLI and networking services like Azure Virtual Networks (VNets), Private Endpoints, DNS, Private DNS, and Virtual Machine Scale Sets (VMSS) will be critical for managing our cloud infrastructure.
-
Maintain all server-side cloud and security documentation to ensure compliance with the government’s cybersecurity policy.
-
Develop strategies for support package releases and patch management.
-
Monitor cloud infrastructure and VM system maintenance, and provide technical recommendations for improvements.
What we are looking for:
-
5+ years of experience
-
Bachelor’s degree or comparable work experience
-
Experience collaborating closely with DevSecOps teams to integrate security best practices into DevOps processes.
-
Proficiency in Python and Bash scripting for automation tasks and managing infrastructure.
-
Experience in supporting centralized management and automation systems using Ansible playbooks and ad-hoc commands.
-
Familiarity with data pipelines and the ability to troubleshoot them effectively
-
In-depth knowledge of Linux-based operating systems and of managing services from the command line.
-
Familiarity with industry best practices for Linux server security; ability to implement organizational security standards on the Linux platform
-
Experience with automated software update mechanisms such as yum, up2date and apt-get
-
An SRE (Site Reliability Engineering) mindset, with a strong focus on reliability, scalability, and continuous improvement within the operation.
-
Thorough understanding of incident handling protocols, personally identifiable information (PII), server network procedures, and information handling
-
Thorough understanding of cloud service models: IaaS, PaaS, DaaS, and SaaS
-
Experience migrating existing applications to the cloud
-
Good consultative, communication, analytical and judgement skills
-
Experience implementing and maintaining highly available, mission-critical environments, conducting disaster recovery testing, and creating technical documentation
-
Demonstrated strong verbal and written communication and interpersonal skills; attention to detail and accuracy; and time management and organizational skills