We are looking for someone whose personal qualities will help develop a strong technical culture within the team.
What you will do:
Solve complex tasks related to migrating existing applications and subsystems to GitOps.
Optimize and redesign the component monitoring system to reduce operational costs and improve monitoring quality (fewer false alarms and fewer signals that do not require engineers' attention).
Conduct postmortems after incidents, analyze the findings, and implement improvements.
Responsibilities:
Support Linux-based servers and ensure smooth operation of the service.
Maintain telephony services and investigate related problems.
Participate in shifts and resolve problems and incidents on a highly loaded system with an SLA of 99.999%.
Interact with other teams within your area of responsibility.
Analyze the operation of services and infrastructure and participate in projects to improve the quality of systems.
Requirements:
Experience working with UNIX-like systems and using the CLI to troubleshoot and diagnose server problems.
Experience with highly loaded systems, knowledge of fault tolerance, service monitoring, and optimization techniques.
Ability to use Python/Golang/Shell to automate work and develop internal tools.
Would be a plus:
Deep knowledge and experience in one or more areas where you consider yourself an expert. In your application, please describe your strengths in any format that is convenient for you.
Understanding of networking protocols and SIP functionality.
Experience working with Kamailio, Apache Kafka, Nginx, and ZeroMQ.
Experience with Kubernetes (k8s), AWS (Amazon Web Services) including EKS, Terraform, and Ansible.
Experience with various monitoring systems, such as Zabbix, TICK, Elastic Stack (ELK), and Grafana.
You will gain experience in:
Working in a high-performance team in a strong IT company.
Debugging Java and C++ programs that use industry-standard technologies such as Apache Kafka, ZooKeeper, Kamailio, and Nginx.
Participating in the software development process, where the Systems Reliability Engineer team plays a vital role.
Maintaining a worldwide distributed system.
Meeting five-nines (99.999%) Service Level Agreements and having fun.
We offer:
A well-coordinated, professional team
Cutting-edge technologies, interesting and challenging tasks, a dynamic project, and great opportunities for self-realization and professional and career growth
Additional Health and Life Insurance Package
Employee Assistance Program
25 vacation days
ReBenefit Platform Account