Campus Pride Jobs

Job Information

Apple Software Engineer (Site Reliability) Operations Lead, Enterprise Systems in Austin, Texas

Software Engineer (Site Reliability) Operations Lead, Enterprise Systems

Austin, Texas, United States

Machine Learning and AI

Conversational Engineering develops next-generation communications, AI, and NLP solutions to support Apple customers. Our mission is to maintain a comprehensive and effective support, sales, and payment experience for customers around the globe. Our conversational engineering platform is growing rapidly to support new channels and regions.

We are looking for an experienced, hands-on Site Reliability Engineer (SRE) Operations Lead who is passionate about designing, developing, and deploying cutting-edge operations solutions that will impact millions of customers. In this role you will lead our efforts to maintain the reliability, availability, and performance of our systems. The ideal candidate will have a strong background in production monitoring, a deep understanding of development and operations, and a proven track record of managing large-scale production systems. The SRE Operations Lead will play a crucial role in leading incident management from detection to resolution, ensuring the seamless operation of our systems and infrastructure.

Key Qualifications

  • Proven experience as a Site Reliability Engineer or similar role, with a focus on operations management.

  • Demonstrated experience managing large-scale production outages and leading incident response.

  • Deep understanding of production monitoring systems, log analysis, and performance metrics.

  • Proficient in scripting languages (e.g., Python, Bash) and automation tools.

  • Strong leadership and communication skills with the ability to effectively collaborate with cross-functional teams.

  • Experience mentoring and coaching team members to enhance overall performance.

  • Strong analytical and problem-solving skills with a proactive approach to identifying and addressing potential issues.

  • Ability to thrive in a fast-paced, dynamic environment and adapt to evolving technologies and business needs.

Description

  • Incident Management: Lead and coordinate incident response activities, ensuring timely detection, escalation, and resolution of production issues. Collaborate with cross-functional teams to mitigate the impact of incidents and prevent recurrence.

  • Production Monitoring: Design, implement, and maintain robust production monitoring systems to proactively identify potential issues before they impact users. Analyze monitoring data to identify trends, patterns, and areas for improvement in system reliability.

  • Operations Leadership: Provide technical leadership to the SRE team, fostering a culture of continuous improvement and innovation. Collaborate with development teams to integrate reliability best practices into the software development lifecycle.

  • Capacity Planning: Work closely with infrastructure and capacity planning teams to ensure scalability and performance of systems. Proactively identify and address potential capacity issues before they impact system performance.

  • Documentation: Maintain comprehensive documentation of system architecture, configurations, and procedures to facilitate efficient incident response and knowledge sharing.

  • Collaboration: Collaborate with cross-functional teams, including development, QA, and product management, to drive improvements in system reliability and performance.

  • Post-Incident Analysis: Conduct thorough post-incident analyses to identify root causes and contributing factors, and implement preventive measures to avoid recurrence.

Education & Experience

Bachelor's degree in Computer Science, Information Technology, or a related field, or equivalent work experience.

Additional Requirements

  • Certifications (Optional):

  • Relevant certifications in SRE, DevOps, or related fields would be a plus.

  • Advanced degree preferred.

Apple is an equal opportunity employer that is committed to inclusion and diversity. We take affirmative action to ensure equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics. Learn more about your EEO rights as an applicant.

Apple will not discriminate or retaliate against applicants who inquire about, disclose, or discuss their compensation or that of other applicants. Learn more from the United States Department of Labor.

Apple will consider for employment all qualified applicants with criminal histories in a manner consistent with applicable law. If you’re applying for a position in San Francisco, review the San Francisco Fair Chance Ordinance guidelines applicable in your area.

Apple participates in the E-Verify program in certain locations as required by law. Learn more about the E-Verify program.

Apple is committed to working with and providing reasonable accommodation to applicants with physical and mental disabilities. Apple is a drug-free workplace. Learn more about the Reasonable Accommodation and Drug Free Workplace policy.
