We are looking for Site Reliability Engineers to join our new SRE team. As part of the team, you will take responsibility for the availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning of customer applications and infrastructure.
What your day-to-day will look like:
- Participate in your team's effort to continuously improve our customers' production environments
- Own your team's tech and tools stack and contribute to the relevant Open Source projects
- Design, analyse and troubleshoot large-scale distributed systems
- Be part of your team's on-call rotation
- Learn and share by being part of the Cloud Native community through blog posts and conference talks
- Automate almost all the things
Skills and requirements:
- Strong engineering OR operations background and the urge to master both disciplines
- An analytical mind and strong debugging and problem-solving skills
- Strong written and spoken technical communication
- Flexibility to learn about and work with different technical environments and teams
Bonus Points (we value curiosity and the ability to learn over previous experience):
- Strong understanding of the Kubernetes API, core principles and components
- Strong knowledge of Linux networking and security related to containers
- Production experience with at least one common CI/CD system
- Production experience with at least one major cloud provider
- Production experience with at least one modern infrastructure automation or configuration management system
- Ability to contribute to polyglot code bases
We are building a remote-first team across multiple time zones to allow a follow-the-sun on-call rotation.
We are not hiring job descriptions. We hire humans. :)
We welcome applications from everybody, regardless of ethnic or national origin, religion, gender identity, sexual orientation or age.