In this role, you would be responsible for the day-to-day maintenance of engineering systems. You would also often act as the first line of support for internal applications while fixing bugs and developing and deploying small components of code.
The role of the Production Support Engineer is a technical one that ensures the stability, integrity, and efficient operation of the platform, system, and services.
High-impact production issues often require coordination between multiple Development, Operations, and IT Support groups, so you get to work with, and have an impact across, a broad range of teams.
- Logging and keeping records of issues to help the team prioritize fixes and automation and to measure product quality.
- Documenting troubleshooting and problem resolution steps.
- Monitor alerting channels, analyze and diagnose problems, and make occasional code fixes of low to medium complexity.
- Taking ownership of technical issues and working closely with developers to resolve more complicated problems.
- Work closely with product and developers to enhance the quality of existing products.
- Resolving escalated customer complaints without the need for team lead intervention.
- Address urgent issues quickly, working within customer SLAs and measuring performance against them.
- Write scripts (shell, Python, Ruby, PHP) and aggressively automate manual and repetitive tasks.
- Automate scripts and tasks for reporting and maintenance, and build anomaly detectors and alerting wherever applicable.
- Develop low-complexity features and enhancements in existing products.
- Perform in-depth research and identify sources of production issues surrounding the application.
- Work closely with the business to manage day-to-day issues and resolve user queries.
- Perform daily health checks of the application, job schedules and infrastructure supporting the application.
- Develop and facilitate monitoring systems to identify issues before they happen.
- Identify, design, and develop features that address recurring patterns of problems and stabilize production systems.
- Create accurate DB queries that identify affected data and rectify it.
- Build a deep understanding of the domain.
- Knowledge of Unix / Linux based systems.
- Experience working with MySQL and Redis and writing simple queries to get data for debugging issues.
- Ability to come up with creative solutions to various problems and implement them.
- An in-depth understanding of the different products and the ability to navigate through the code to debug issues and make small fixes.
- Hands-on experience with a scripting language such as Bash, Python, PHP, or Ruby.
- Excellent analytical and logical thinking.
- Quick troubleshooting and diagnosing skills.
- Problem solving and debugging skills.
- Ability to join the dots around multiple events occurring concurrently and spot patterns.
Good-to-have requirements:
- Prior production support experience.
- Prior programming experience.
- Familiarity with Apache, Sumo Logic, Grafana, Prometheus, and Elasticsearch.
- Experience in dealing with RESTful web services is a plus.
- Experience working with queues and an understanding of cron jobs.