JOB: Systems Reliability Engineer at ElasticRun

Qualification

B.E / B.Tech / MCA / M.Sc. / M. Tech

Experience

2 to 14 years

Key Technologies / Stack

  • LoadRunner, JMeter, Unix Administration, Ansible
  • Docker, Kubernetes, DevOps (Jenkins, GitLab, Infra Config)
  • Azure/GCP Cloud
  • Python, JavaScript, Git
  • MySQL / MongoDB

Nice to have

  • Redis, Elasticsearch, Prometheus, Grafana, Kibana
  • Database tuning and administration
  • Security testing, VA/PT tools, OWASP

Roles and Responsibilities

As a Systems Reliability Engineer (SRE), you will be responsible for keeping in-house and SaaS applications up and running, as well as improving the automation, scalability, and performance of systems.

Responsibilities

  • Troubleshoot availability and performance issues, debug distributed applications by analyzing data such as logs, metrics, and APM (application performance management), and perform front-line remediation
  • Communicate with management and customers regarding aberrant system behavior
  • Monitor and audit all aspects of a production application stack and create alerts and dashboards based on data received through monitoring
  • Influence software and architecture design based on system and architecture observations related to performance and reliability
  • Design, develop and maintain automation software, scripts, and tools
  • Analyze and remove bottlenecks in the development workflow
  • Manage and lead a small team, prepare shift plans, and handle other team management activities

Behavioral skills

  • Excellent verbal and written communication skills
  • Strong problem-solving skills
  • Passion for technology as well as helping customers and team members
  • Comfortable interacting with external customers and internal stakeholders, with good interpersonal skills
  • Strong leadership traits and self-learning attitude
  • Proactive in taking up additional responsibilities

Technical Skills

  • Good programming skills in one of C/C++, Java, JavaScript, Shell scripting, Python, or Go, and an ability to pick up new ones
  • Strong knowledge of the Linux environment, its fundamentals, internals, and administration
  • Strong knowledge of at least one widely adopted database platform (knowledge of MySQL, PostgreSQL, or Elasticsearch is a plus)
  • Develop automation tools and frameworks to automate operational tasks and the deployment of code, applications, services, and machines
  • Represent SRE in design reviews and work cross-functionally with Engineering teams on operational readiness
  • Knowledge of Docker, Kubernetes, and containers is a plus
  • Knowledge of or experience with security standards such as ISMS or PCI DSS

Location: Pune

Company: ElasticRun
