HPC Grid Reliability Engineer

Morgan Stanley (Bengaluru, KA, India)

Posting Date: Feb 19, 2021

Primary Location: Non-Japan Asia-India-Karnataka-Bengaluru

Bachelor's Degree

Job: Production Management and Operational Support

Full Time
Associate



About Us:

Morgan Stanley is a global financial services firm that has been mobilizing capital for over 75 years to help governments, corporations, institutions and individuals around the world achieve their financial goals. Morgan Stanley is committed to maintaining the first-class service and high standards of excellence that have defined the firm. At its foundation are five core values (putting clients first, doing the right thing, leading with exceptional ideas, committing to diversity and inclusion, and giving back) that guide its 55,000+ employees in over 1,200 offices across 43 countries.

Technology/Department/Role at Morgan Stanley:

Technology & Data is the Group Technology organization of the Firm, engaging the world's top technology talent to drive its business and infrastructure. The Firm's multi-billion dollar investment in technology enables the development and delivery of high-quality, advanced, integrated solutions. Technology & Data operates with a vision: to provide highly reliable and commercial technology platforms that support Firm strategy, delivered by an innovative, world-class team of professionals.


RPE (Reliability and Production Engineering) is a strategic horizontal super-department within Tech & Data that supports the evolution of technology platforms and tools, manages and maintains the Firm's production plant, and ensures the implementation of and adherence to proper operational controls to manage risk.


RPE includes a horizontal Plant Management (PLM), Tools and Engineering practice area that complements its direct production activities. This covers operational plant management, grid computing, platform engineering, capacity management and production tooling functions. PLM is a global practice area operating out of New York, London, Montreal, Toronto, Shanghai, Tokyo and Bengaluru.



The PLM Grid Engineering & Operations group supports critical Enterprise Grid platforms that host the Firm's computational workload, running an array of analytical and compute-intensive processing (pricing, market/credit risk, capital allocation/funding optimization, back testing, regulatory risk reporting, etc.) across Fixed Income, Commodities, Business Risk Management (BRM), and Equities. The team works on a diverse set of engineering problems across these Grid environments and relies heavily on the IBM Symphony platform to manage them.


The RPE Plant Management Grid Engineering & Operations group is looking for an experienced, innovative and proactive technology professional who can act as an operational support professional and also assume technical ownership of automating and optimizing various Grid engineering functions.

Key expectations from this role are:



-Demonstrate technical and operational acumen in handling escalations from Application Support teams, troubleshooting and resolving incidents as a Grid & Engineering subject matter expert for the Asia Pacific and EMEA regions

-Effectively communicate and coordinate with relevant development and support teams, work stream leads & stakeholders

-Assume Design, Development & Test accountability for assigned automation projects

-Drive automated RFB (Ready for Business) checks for the various multi-tenant Grids so that potential issues are escalated and addressed in a timely manner

-Support relevant Work stream Leads within Grid Engineering in identification and resolution of performance and capacity bottlenecks, risk and control gaps and infrastructural upgrades, as and when required

-Demonstrate understanding of Production Management methodologies such as Incident Management, Problem Management, and Event Management

-Develop strong working relationships with Global PLM staff, Application Technology teams & QAPM Leads

This role presents a unique opportunity to develop both Business & Technology perspectives:

-Exposure to how financial firms leverage grid technologies for analytical processing that supports risk management, back-testing, trading, portfolio analytics, and pricing of complex derivatives

-Exposure to working with many business/application teams across all asset classes: High-Frequency Trading, Quants, Rates, Credit, Equities.

-Exposure to leading-edge technology focused on grid computing, which many financial firms use for large-scale computation and big-data processing.

-Opportunity to learn the leading product in this space - IBM Platform Symphony.


This role will support the growth of the Grid Engineering & Operations capability within the India PLM Tooling space. It has the potential to influence the onboarding of various HPC applications onto multi-tenant environments, and so offers exposure to diverse competencies and functions of Grid Computing.

The role involves interaction with Business, Quants/Strats, Development, QA and Production Management teams.

No specific business domain knowledge is required to start in this role.



* 4-6 years of relevant experience in Grid Engineering & Operations, preferably in an IBM Spectrum Symphony environment

* Understanding and experience of service management using ITIL

* Experience in scripting languages (Unix shell scripting, Python, Perl, etc.)

* Linux/Unix Operating system fundamentals

* Knowledge of networking and distributed computing architecture

Desired Technical Skills:

* Exposure to cloud technologies

* Analytics & Visualization tools (Splunk, etc.)

* Experience in Kx Systems kdb

Non-technical Skills:

* Drive to learn, self-starter

* Strong verbal and written communications

* Strong Analytical and Troubleshooting skills

* Ability to work independently or in a team

* Ability to pick up new technologies & innovate
