Devops Jobs
January 31, 2025 at 04:53 AM
*Oracle* is hiring for *DevOps/Site Reliability Engineer*
Experience Required: 7-10 years
Location: Noida
Key Responsibilities:
Understand and Manage Support Solutions: Gain a comprehensive understanding of the end-to-end configuration, technical dependencies, and behavior of Oracle's Enterprise support services.
Maximize Service Availability: Strive to maximize service availability by enhancing the service during non-crisis periods and minimizing impact during crises. Focus on hardening the service to extend the time between service-impacting events.
Identify Hardening Opportunities: Identify and address opportunities to improve service reliability, including enhancing monitoring coverage and recognizing actionable events that require intervention.
Enhance SOPs: Develop and refine Standard Operating Procedures (SOPs) by creating documented responses to alerts. Automate these responses and integrate them with actionable events for streamlined incident management.
Drive Major Incident Response: Actively participate in Major Incident bridges during critical service-impacting events to lead and coordinate effective service mitigation efforts.
Post-Mortems and Critical Repairs: Engage in post-mortems and Critical Repair Items following service-impacting events to prevent recurrence and ensure continuous improvement.
Monitor and Improve: Understand and communicate the scale, capacity, security, and performance attributes and requirements of the service stack. Continuously work on improving telemetry, automation, and overall service reliability.
Troubleshooting and Issue Resolution: Act as the ultimate escalation point for complex or critical issues, utilizing deep knowledge of service topology and dependencies to troubleshoot and define mitigations.
Automation and Orchestration: Demonstrate a strong understanding of automation and orchestration principles to improve service availability, reduce time to mitigate issues, and enhance development velocity.
Drive Continuous Improvement: Develop tools, drive down incident counts, reduce event severity, and minimize time to mitigate. Foster a “Site Up” culture and continuously review and enhance systems and methods to improve custo
Technological Analysis: Contribute to the analysis and enhancement of MOS applications and internal tools, identifying and implementing durable solutions to complex challenges.
Collaborate with Development Teams: Partner with development teams to define and implement improvements in the support service architecture, ensuring that enhancements are aligned with overall goals.
Articulate Technical Characteristics: Clearly communicate the technical characteristics of services and technology areas, guiding development teams in engineering and integrating advanced capabilities.
Communication and Problem Solving: Employ excellent communication, technical analysis, and problem-solving skills to methodically address and resolve issues. Communicate clearly and professionally with internal stakeholders during high-priority situations, in both written and spoken form.
Team Development: Support the training and development of junior team members, sharing knowledge and best practices to foster growth within the team.
Qualifications:
Educational Background: Bachelor’s degree in Computer Science, Information Technology, or a related field. Relevant work experience may be considered in place of a degree.
Experience: Proven experience as a System Engineer, Software Engineer, or in a similar role, preferably with a focus on complex enterprise software solutions. An understanding of enterprise cloud solutions and the ability to delve into complex services are also expected.
Communication Skills: Excellent communication skills, analytical thinking, problem-solving capabilities, and attention to detail.
Technical Skills: Proficiency in Linux-based systems, including administration, scripting, and troubleshooting.
Judgment and Independence: Ability to handle varied and complex tasks independently, demonstrating sound judgment in decision-making.
Monitoring and Performance: Knowledge of system monitoring tools, performance tuning, and capacity planning.
Problem-Solving: Strong problem-solving abilities with a proven track record of analyzing and resolving complex technical issues.
To apply, share your resume at [email protected]
Feel free to share with friends, family and colleagues #bezigsawed