Lead Database Engineer

Thermo Fisher Scientific (Manila, Philippines), posted 9 hours ago

Full-time, Client-focused, Advanced SQL, Data Lakes, Data Processing, Shell

Work Schedule

Other

Environmental Conditions

Office

Job Description

Summarized Purpose:

We are seeking a Lead Database Engineer to design, build, optimize, and support AWS-based data lake, data warehouse, and database platforms. This role will lead database architecture, performance tuning, data quality, lineage, source-to-target mapping, production support, and technical delivery across PostgreSQL, Redshift, Athena, DynamoDB, SQL Server, and related AWS services.

Education/Experience:

Bachelor's degree or equivalent experience and relevant formal academic/vocational qualification.
7+ years of experience in database engineering, data architecture, AWS data platforms, SQL development, performance tuning, and production support, or an equivalent combination of education, training, and experience.

Major Job Responsibilities:

Design, build, and maintain AWS-based data lake and warehouse architecture using S3, Redshift, DynamoDB, RDS, Athena, and related cloud data services.
Optimize PostgreSQL, Redshift, Athena, DynamoDB, SQL Server, and other SQL-based workloads for performance, reliability, scalability, and cost.
Lead data integrations, ETL processes, database operations, and source-to-target mapping based on stakeholder and collaborator requirements.
Maintain and improve existing integrations while adding new EMR, clinical, operational, and enterprise data integrations.
Implement data quality frameworks, automated validation, reconciliation, monitoring, and alerting to ensure accurate warehouse and lakehouse loads.
Normalize EMR, claims, clinical, operational, and flat-file data into data warehouse, analytical, and downstream reporting structures.
Develop and maintain mapping tables, data dictionaries, source-to-target mappings, metadata, lineage, and technical documentation for analysts and downstream systems.
Link and transform patient activity, operational, and business data into standardized outputs for analytics and reporting.
Implement backup, recovery, replication, retention, and operational readiness requirements for production databases and data platforms.
Maintain database documentation describing data elements, transformations, lineage, interfaces, ownership, and usage patterns.
Develop new data architecture and database processes to improve performance using AWS analytical services and lakehouse architecture patterns.
Use Python, PySpark, SQL, and automation scripts to support data processing, validation, migration, and operational workflows.
Ensure database security and compliance with HIPAA and GDPR, including access control, auditability, encryption, and data governance requirements.
Manage production operations including recurring reports, pipeline support, issue triage, change management, and release coordination.
Communicate with stakeholders, mentor engineers, perform code reviews, and maintain strong relationships with cross-functional collaborators.

Knowledge, Skills, and Abilities:

Strong understanding of data lake, data warehouse, and lakehouse architecture patterns on AWS.
Client-focused approach with strong interpersonal, documentation, communication, and technical leadership skills.
Ability to multitask, prioritize, manage production issues, and maintain attention to detail in complex data environments.
Strong logical and analytical thinking, root-cause analysis, and problem-solving capabilities.
Proficiency in Python, PySpark, SQL scripting, Shell scripting, and automation for database and data platform operations.
Expert-level SQL and relational database design including schema design, normalization, dimensional modeling, and query optimization.
Deep knowledge of PostgreSQL performance tuning, indexing, partitioning, stored procedures/functions, replication, backup, and recovery.
Strong hands-on experience with PostgreSQL, Redshift, SQL Server, Athena, DynamoDB, RDS, S3, and AWS data services.
Strong performance tuning experience across PostgreSQL, Redshift, Athena, DynamoDB, SQL Server, and other SQL-based workloads.
Familiarity with server environments, database connectivity, data movement, security, governance, and compliance standards.
Experience with validated systems, healthcare data, EMR integrations, clinical trial data, and regulated production environments preferred.
Project management, Agile delivery, GitHub workflows, Jira tracking, documentation, code review, and leadership experience.

Must Have Skills:

Expert-level SQL and relational database design experience.
Advanced SQL Server experience including database development, optimization, stored procedures, SSIS, SSRS, and operational support.
Strong hands-on PostgreSQL and Redshift experience including administration, tuning, data modeling, backup, recovery, and production operations.
Experience building AWS data lake, data warehouse, and cloud BI solutions using S3, Redshift, Athena, DynamoDB, RDS, and related services.
Experience with data architecture, scalable data processing frameworks, ETL patterns, lakehouse design, and production-grade data pipelines.
Data modeling, database design, source-to-target mapping, data lineage, data dictionary, and technical documentation experience.
Strong problem-solving, analytical, troubleshooting, production support, and incident management skills.
Excellent documentation, communication, stakeholder management, and cross-functional collaboration abilities.
Performance tuning and query optimization across PostgreSQL, Redshift, Athena, SQL Server, and large SQL workloads.
Leadership, mentoring, code review, database standards, and technical decision-making capabilities.
Jira, GitHub, Agile methodology, CI/CD awareness, release management, and delivery tracking experience.
Experience implementing data quality frameworks, pipeline monitoring, alerting, reconciliation, and production support processes.

Good to Have Skills:

Experience with MySQL, Oracle, Aurora PostgreSQL, RDS, and additional relational or NoSQL database platforms.
Scripting and automation skills using Python, Shell, SQL automation, or AWS SDKs.
Validated system experience and regulated SDLC documentation practices.
Familiarity with Tableau, Power BI, or reporting and BI consumption patterns.
Familiarity with Databricks, Snowflake, Glue, Lake Formation, Step Functions, Lambda, or modern lakehouse tooling.
Experience with machine learning, NLP, LLMs, AI-assisted documentation, mapping automation, vector databases, embeddings, or LLM-enabled data quality.
Healthcare, Electronic Medical Records, claims data, clinical trial data, HIPAA, GDPR, and patient data domain experience.

Working Hours: