
Senior Site Reliability Engineer
- On-site
- Colombo, Western Province, Sri Lanka
- Site Reliability Engineering
Job description
Take ownership of reported customer issues and see problems through to resolution.
Research, diagnose, troubleshoot, and identify solutions to resolve system issues.
Investigate application faults and identify causes of network slowness.
Troubleshoot all programs when required.
Follow standard procedures for proper escalation of unresolved issues to the appropriate internal teams.
Follow up with clients to ensure their IT systems are fully functional after troubleshooting.
Prioritize and manage several open issues at one time.
Document technical knowledge in the form of notes and manuals.
Job requirements
- Minimum of 4 years of experience in a similar role, preferably in a cloud-based environment.
- Strong proficiency in at least one programming language, such as Python, Java, or Go.
- Experience with containerization technologies like Docker and Kubernetes.
- In-depth knowledge of Linux operating systems and networking protocols.
- Proven track record in designing and implementing scalable and reliable infrastructure solutions.
- Familiarity with monitoring and alerting tools, such as Prometheus or Grafana.
- Ability to troubleshoot and resolve complex technical issues in a timely manner.
- Excellent communication and collaboration skills, with the ability to work effectively in cross-functional teams.
- Strong problem-solving and analytical thinking abilities.
- Experience with infrastructure-as-code tools like Terraform or CloudFormation is a plus.
Please note that only candidates who meet the above requirements will be considered for this position. We appreciate your understanding and encourage you to apply if you believe you are a suitable fit.
Thank you for your interest in Cloud Solutions International Pvt Ltd. We look forward to reviewing your application.