[Remote] SRE Sr Engineer/Specialist

Note: This is a remote position open to candidates in the USA. Ford Motor Company is a global leader in the automotive industry and is seeking an SRE Sr Engineer/Specialist to develop and enhance its global monitoring and observability platform. The role blends AI with software engineering to ensure the uptime and scalability of critical cloud services, while also driving the adoption of monitoring capabilities.


Responsibilities

  • Write, configure, and deploy code that improves service reliability for existing or new systems; set the standard for others with respect to code quality
  • Provide helpful and actionable feedback and review for code or production changes
  • Drive repair/optimization of complex systems with consideration towards a wide range of contributing factors
  • Lead debugging, troubleshooting, and analysis of service architecture and design
  • Participate in on-call rotation
  • Write documentation: design documents, system analyses, runbooks, and playbooks. Provide design feedback and uplevel the design skills of others
  • Implement and manage SRE monitoring applications using AI, Python, and Observability data
  • Develop tooling using Terraform and other IaC tools to ensure visibility and proactive issue detection across our platforms
  • Work within GCP infrastructure, optimizing performance and cost and scaling resources to meet demand
  • Collaborate with development teams to enhance system reliability and performance, applying a platform engineering mindset to system administration tasks
  • Develop and maintain AI-enhanced automated solutions for operational aspects such as on-call monitoring, performance tuning, and disaster recovery
  • Troubleshoot and resolve issues in our dev, test, and production environments
  • Participate in postmortem analysis and create preventative measures for future incidents
  • Implement and maintain security best practices across our infrastructure, ensuring compliance with industry standards and internal policies. Participate in security audits and vulnerability assessments
  • Participate in capacity planning and forecasting efforts to ensure our systems can handle future growth and demand. Analyze trends and make recommendations for resource allocation
  • Identify and address performance bottlenecks through code profiling, system analysis, and configuration tuning. Implement and monitor performance metrics to proactively identify and resolve issues
  • Develop, maintain, and test disaster recovery plans and procedures to ensure business continuity in the event of a major outage or disaster. Participate in regular disaster recovery exercises
  • Contribute to internal knowledge bases and documentation

Skills

  • Bachelor's degree in Computer Science, Engineering, Mathematics or equivalent work experience
  • 3+ years of experience as an SRE, DevOps Engineer, Software Engineer or similar role
  • Strong experience with Python development; familiarity with Terraform Provider development is desired
  • Proficient with monitoring and observability tools
  • Proficient with cloud services, with a strong preference for Kubernetes and Google Cloud Platform (GCP) experience
  • Solid programming skills in Python, with a good understanding of software development best practices
  • Experience with relational and document databases
  • Ability to debug, optimize code, and automate routine tasks
  • Strong problem-solving skills and the ability to work under pressure in a fast-paced environment
  • Excellent verbal and written communication skills
  • Agentic AI and MCP development experience
  • Experience with Dynatrace SaaS

Benefits

  • Immediate medical, dental, and prescription drug coverage
  • Flexible family care, parental leave, new parent ramp-up programs, subsidized back-up child care and more
  • Vehicle discount program for employees and family members, and management leases
  • Tuition assistance
  • Established and active employee resource groups
  • Paid time off for individual and team community service
  • A generous schedule of paid holidays, including the week between Christmas and New Year’s Day
  • Paid time off and the option to purchase additional vacation time

Company Overview

  • We don't just make history -- we make the future. Ford Motor Company was founded in 1903 and is headquartered in Dearborn, Michigan, USA, with a workforce of more than 10,000 employees.
