Linux Site Reliability Engineer (SRE) Lead

Req #: 170111746
Location: Jersey City, NJ, US
Job Category: Technology
Job Description:
JPMorgan Chase & Co. (NYSE: JPM) is a leading global financial services firm with assets of approximately $2.5 trillion and operations in more than 60 countries. The firm is a leader in investment banking, financial services for consumers, small business and commercial banking, financial transaction processing, asset management, and private equity.
 
Global Technology Infrastructure (GTI) is a worldwide organization charged with delivering a wide range of products and services – end user, compute, data, transport, instrumentation and facilities – partnering with all lines of business to provide high-quality service delivery, exceptional project execution and financially disciplined approaches and processes in the most cost-effective manner. The objective of GTI is to balance business alignment with the centralized delivery of core products and services. GTI is designed to address the unique infrastructure needs of specific lines of business and the demand to leverage economies of scale across the firm.
 
JPMorgan Chase's Core Foundation Services (CFS) group, within Global Technology Infrastructure, designs and delivers critical and foundational platform solutions for all technology infrastructure systems across all lines of business. This includes the engineering design and delivery of platforms for Directory Services, Authentication and Privilege Management, Configuration Management, IPAM, Orchestration, Reference data, and more.
 
The CFS team is seeking a Site Reliability Engineer (SRE) Lead who combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. The candidate will build creative engineering solutions to operations problems, including optimizing existing systems, building infrastructure and eliminating work through automation. The candidate will work with various cross-functional teams, and must be able to work in a global team setting and adapt to dynamic requirements. As SREs are responsible for the big picture of how our systems relate to each other, the candidate will use a breadth of tools and approaches to solve a broad spectrum of problems.
 
Key responsibilities:
  • Engage in and improve the whole lifecycle of services—from inception and design, through deployment, operation and refinement.
  • Support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning and launch reviews.
  • Maintain services once they are live by measuring and monitoring availability, latency and overall system health.
  • Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity.
  • Practice sustainable incident response and blameless postmortems.
  • Be part of a dynamic team that provides round-the-clock coverage to ensure service uptime and stability.
  • Thrive in a team environment with strong interpersonal skills. Collaborate and build relationships with engineers, development teams, architects, operations partners, and business clients.
  • Establish and regularly update a multi-phase delivery roadmap.
Experience: 
  • Experience in one or more of the following: Java, Python, Perl, C/C++ or Ruby (7+ years).
  • Solid knowledge of Linux, Unix and/or Windows operating systems (7+ years).
  • Solid knowledge of SQL and scripting (7+ years).
  • Solid knowledge of configuration management tools (Puppet, Chef, etc.) and monitoring tools (7+ years).
  • Knowledge of cloud and virtualization technology a plus.
  • Experience with algorithms, data structures, complexity analysis and software design.
  • Experience in Agile SDLC practices with a strong focus on continuous integration and continuous delivery pipeline automation and tooling, DevOps, distributed version control systems, and agile methodologies.
  • Interest in designing, analyzing and troubleshooting large-scale distributed systems.
  • Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive.
  • Ability to debug and optimize code and automate routine tasks.
  • Proven ability in developing relationships with stakeholders, communicating project/program status, and understanding detailed business requirements across multiple project initiatives.
  • Excellent interpersonal and communication skills, including the ability to negotiate compromise, demonstrate diplomacy in sensitive situations, and interact effectively with peers and management across diverse cultures.

Other Information

Apply Using LinkedIn

You can also apply using your LinkedIn® profile. It may save you some time because your information will be automatically transferred into our system. Just click on the LinkedIn logo when you get to the application screen and follow the directions.

Submit an Updated Résumé

During the application process, be sure you have an up-to-date copy of your résumé, your cover letter, and any other documentation you would like to submit.