Tier-2 Site Reliability Engineer – Cisco ROBOT

at Bank of America Corporation
Published February 5, 2024
Location Richmond, VA
Category Default  
Job Type Full-time  

Description

Job Description:

About Us:

At Bank of America, we are guided by a common purpose to help make financial lives better through the power of every connection. Responsible Growth is how we run our company and how we deliver for our clients, teammates, communities, and shareholders every day.

One of the keys to driving Responsible Growth is being a great place to work for our teammates around the world. We're devoted to being a diverse and inclusive workplace for everyone. We hire individuals with a broad range of backgrounds and experiences and invest heavily in our teammates and their families by offering competitive benefits to support their physical, emotional, and financial well-being.

Bank of America believes both in the importance of working together and offering flexibility to our employees. We use a multi-faceted approach for flexibility, depending on the various roles in our organization.

Working at Bank of America will give you a great career with opportunities to learn, grow and make an impact, along with the power to make a difference. Join us!

Position Summary

Bank of America Network Services is recruiting a Software Site Reliability Engineer (SRE) to support production operations of network automation solutions. Cisco Robot is one such solution. It is composed of three key Cisco automation tools: Network Service Orchestrator (NSO), Business Process Automation (BPA), and Configuration Workflow Manager (CWM).

The technology areas of focus for the Level 2 Cisco Robot SRE include:

* Cisco CMSP, IBM Watson and Prometheus monitoring tools for solution component monitoring and event management

* Usage of BMC Remedy change and incident management system

* Strong familiarity with network routing and switching protocols, knowledge of data center and access network solutions, and an understanding of network security technologies

* Strong working knowledge of the following Cisco software products: Network Service Orchestrator (NSO), Business Process Automation (BPA), Configuration Workflow Manager (CWM).

* Working knowledge of microservices based software architecture

* Working knowledge of Kubernetes and OpenShift

* Knowledge of Virtualization & Cloud (VMware, OpenStack) and database (MongoDB, Postgres) technologies

* Hands-on experience with Python programming language

* Knowledge of software integration using SOAP and RESTful APIs

* Hands-on experience with network and software configuration tools such as Ansible, Chef, and Puppet

* Orchestration skills and a foundational knowledge of cloud computing, virtualization, and storage solutions are desirable. The work is always aligned with the current approved Network Services standards, Incident and Problem Management policies and procedures, and the governance and management policies set forth by the firm.

* This position will interface directly with internal stakeholders and external suppliers/providers, architecture, product engineering, product management, and business management. At times, the post holder might be required to interface with various levels of senior management. Strong communication and problem-solving skills are essential.

* The candidate must be able to work independently and also contribute in team settings of various sizes and locations. Adherence to and use of standards, product sets, templates, systems, and artifacts are important to the success of the individual, the department, and the firm at large. The ROBOT Support Engineer will be considered a subject matter expert in their field and is expected to stay current with various technologies, organizational goals, and industry trends to drive end-to-end value.

Primary Skill

* Virtualization

Required Skills:

* Ensure that Cisco Robot and other network automation production systems are operational in accordance with stated service objectives.

* Perform continuous monitoring and event management of Robot production systems

* Manage Robot incidents including solving problems, triaging complex incidents, and managing end-user incident-related communications

* Write operational playbooks to improve monitoring posture and resolve issues. Feed more complex requirements to the DevOps teams

* Support business continuity tabletop exercises

* Level 2 escalation point for operational support of the end-user access network (Cisco wired/wireless LAN, Palo Alto CloudGenix SD-WAN)

* Technical areas of focus include but are not limited to end-user WAN, LAN, WLAN, SD-WAN, MPLS

* Proactive network reviews, including routine testing of disaster recovery scenarios and identification of vulnerabilities and opportunities to improve observability across the network stack

* Mentorship of Production Services Specialists and technical leadership within the team

* Work with senior team members to validate impacts and communicate technical status updates to all stakeholders

* Participate in documenting application flows, upstream/downstream impacts during outages, the customer experience in failure scenarios, and contacts for various support needs, and ensure that appropriate runbooks and wikis are up to date and available for use during triage

* Work on ad hoc reports and offline incidents at the direction of senior team members or leadership

* Promote and enforce production governance during triage, testing, and fix efforts, and exercise judgment within defined procedures and practices to determine appropriate action

* Adhere to design standards and global design authority processes and procedures

* Assemble professional documents based on existing templates and provide accurate work descriptions with assumptions and caveats

Desired Skills:

* Foundational knowledge of routing and switching protocols

* Foundational knowledge of Industry Data Center and Enterprise access network solutions

* Foundational knowledge of Cisco Data Center Compute platforms, such as UCS Blade & Rack Servers

* Foundational knowledge of Cisco Data Center platforms, including Cisco Nexus and Catalyst switches and ASR routers

* Broad understanding of and/or experience with L2-L3 networking, data center, and security technology, sufficient to understand customer solutions, topologies, and interactions with higher networking layers

* 3+ years of experience with other network technologies: WAN, MAN, LAN, Optical, Routing, Switching, Firewall, Proxy/Threat Prevention, DDI, Load Balancing, and AAA

* 2+ years of Cloud or SDN knowledge and experience

* Experience with SDN: Cisco ACI, VMware NSX, Arista CloudVision

* Experience with SD-WAN, preferably CloudGenix

* Ability to solve network issues and isolate problems

* Understanding of Incident & Change Management process

* Network automation/orchestration skills in frameworks and toolsets, preferably Tail-f NCS / NSO

* Network programmability skills in Software Defined Networking (SDN), REST APIs, NETCONF, YANG, JSON, and XML

* Foundational knowledge of Cisco and industry cloud computing (e.g., OpenStack, VMware, and AWS), data center, virtualization, storage, and networking solutions

* Understanding of programming in Python and exposure to microservices architecture

* Basic administration of MongoDB and/or Postgres

* Experience with container management

* Hands-on experience with the Linux operating system and scripting

Desired Skills:

* Experience in Networking-related disciplines within a design, implementation, or operations role

* Relevant Industry certifications in Network Technologies

* Experience working in an Agile environment

* Experience working in financial services (insurance, banking, investment banking)

Shift:

1st shift (United States of America)

Hours Per Week:

40
