Description
Urgently seeking a Site Reliability Engineer.
This is an initial 3-month contract, and the position requires weekly on-site visits in London.
Site Reliability Engineers (SREs) are responsible for keeping all user-facing services and other DSX production systems running smoothly. SREs are a blend of pragmatic operators and software craftspeople who apply sound engineering principles, operational discipline, and mature automation to our environments.
As an SRE you will:
- Be on a PagerDuty rotation to respond to availability incidents and provide support for service engineers with customer incidents.
- Use your on-call shift to prevent incidents from happening.
- Run our infrastructure with Terraform and Kubernetes.
- Use monitoring and alerting to alert on symptoms, not outages.
- Document every action so that your findings turn into repeatable actions (playbooks) and then into automation.
You may be a fit for this role if you:
- Think about systems, and particularly edge cases and failure modes.
- Know your way around Linux and the Unix shell.
- Have strong programming skills, preferably in Node.js, but it could be Python, Go, .NET or even Ruby.
- Have a drive to deliver quickly and iterate fast.
- Have experience with Nginx, Docker, Kubernetes, and Terraform.
- Have good experience with GitHub.
Michael Bailey International is acting as an Employment Business in relation to this vacancy.