Senior Site Reliability Engineer (Telco)

We are looking for someone whose personal qualities will help develop a strong technical culture within the team.

What you will do: 

  • Solve complex tasks related to migrating existing applications and subsystems to GitOps.

  • Optimize and redesign component monitoring to reduce operational costs and improve monitoring quality (fewer false alarms and fewer signals that do not require engineers' attention).

  • Conduct postmortems after incidents, analyze the findings, and implement improvements.

Responsibilities:

  • Support Linux-based servers and ensure smooth operation of the service.

  • Maintain telephony services and investigate related problems.

  • Participate in shifts and resolve problems and incidents on a highly loaded system with an SLA of 99.999%.

  • Interact with other teams within your area of responsibility.

  • Analyze the operation of services and infrastructure and participate in projects to improve the quality of systems.
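For a sense of scale, the 99.999% SLA mentioned above can be turned into a downtime budget with simple arithmetic. The snippet below is a minimal sketch: the function name `downtime_budget_seconds` is hypothetical, and it assumes availability is measured as uptime over a fixed calendar window, which may differ from how the SLA is actually defined.

```python
# Illustrative only: converts an availability target into allowed downtime.
# Assumes availability = uptime / total time over a fixed window.

def downtime_budget_seconds(availability_percent: float, period_days: float) -> float:
    """Return the maximum downtime, in seconds, allowed over the period."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100 percent")
    period_seconds = period_days * 24 * 60 * 60
    return period_seconds * (100 - availability_percent) / 100


if __name__ == "__main__":
    for label, days in (("30-day month", 30), ("365-day year", 365)):
        budget = downtime_budget_seconds(99.999, days)
        print(f"{label}: {budget:.1f} s of allowed downtime")
```

Under these assumptions, five nines leaves roughly 26 seconds of downtime per 30-day month, or a little over 5 minutes per year.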

Requirements:

  • Experience working with UNIX-like systems and using the CLI to troubleshoot and diagnose server problems.

  • Experience with highly loaded systems, knowledge of fault tolerance, service monitoring, and optimization techniques.

  • Ability to use Python/Golang/Shell to automate work and develop internal tools.

Would be a plus:

  • Deep knowledge and experience in one or more areas where you consider yourself an expert. In your application, please specify your strengths in any format that is convenient for you.

  • Understanding of networking protocols and SIP functionality.

  • Experience working with Kamailio, Apache Kafka, Nginx, and ZeroMQ technical solutions. 

  • Experience with Kubernetes (k8s), AWS (Amazon Web Services) including EKS, Terraform, and Ansible.

  • Experience with various monitoring systems, such as Zabbix, TICK, Elastic Stack (ELK), and Grafana.

You will get experience in:

  • Working in a high-performance team in a strong IT company.

  • Debugging Java and C++ programs using industry-standard technologies like Apache Kafka, ZooKeeper, Kamailio, Nginx, etc.

  • Participating in the software development process, where the Site Reliability Engineering team plays a vital role.

  • Maintaining a worldwide distributed system.

  • Meeting five-nines (99.999%) Service Level Agreements and having fun.

We offer:

  • Well-coordinated professional team

  • Cutting-edge technologies, interesting and challenging tasks, a dynamic project, great opportunities for self-realization, professional and career growth

  • Additional Health and Life Insurance Package

  • Employee Assistance Program

  • 25 vacation days

  • ReBenefit Platform Account

Write to us at jobs@jettycloud.com or send a message to our recruiters.
