Work Schedule
Other
Environmental Conditions
Office
Job Description
Summarized Purpose:
We are seeking a Lead Database Engineer to design, build, optimize, and support AWS-based data lake, data warehouse, and database platforms. This role will lead database architecture, performance tuning, data quality, lineage, source-to-target mapping, production support, and technical delivery across PostgreSQL, Redshift, Athena, DynamoDB, SQL Server, and related AWS services.
Education/Experience:
- Bachelor's degree or equivalent experience and relevant formal academic/vocational qualification.
- 7+ years of experience in database engineering, data architecture, AWS data platforms, SQL development, performance tuning, and production support, or an equivalent combination of education, training, and experience.
Major Job Responsibilities:
- Design, build, and maintain AWS-based data lake and warehouse architecture using S3, Redshift, DynamoDB, RDS, Athena, and related cloud data services.
- Optimize PostgreSQL, Redshift, Athena, DynamoDB, SQL Server, and other SQL-based workloads for performance, reliability, scalability, and cost.
- Lead data integrations, ETL processes, database operations, and source-to-target mapping based on stakeholder and collaborator requirements.
- Maintain and improve existing integrations while adding new EMR, clinical, operational, and enterprise data integrations.
- Implement data quality frameworks, automated validation, reconciliation, monitoring, and alerting to ensure accurate warehouse and lakehouse loads.
- Normalize EMR, claims, clinical, operational, and flat-file data into data warehouse, analytical, and downstream reporting structures.
- Develop and maintain mapping tables, data dictionaries, source-to-target mappings, metadata, lineage, and technical documentation for analysts and downstream systems.
- Link and transform patient activity, operational, and business data into standardized outputs for analytics and reporting.
- Implement backup, recovery, replication, retention, and operational readiness requirements for production databases and data platforms.
- Maintain database documentation describing data elements, transformations, lineage, interfaces, ownership, and usage patterns.
- Develop new data architecture and database processes to improve performance using AWS analytical services and lakehouse architecture patterns.
- Use Python, PySpark, SQL, and automation scripts to support data processing, validation, migration, and operational workflows.
- Ensure database security and compliance with HIPAA, GDPR, access control, auditability, encryption, and data governance requirements.
- Manage production operations including recurring reports, pipeline support, issue triage, change management, and release coordination.
- Communicate with stakeholders, mentor engineers, perform code reviews, and maintain strong relationships with cross-functional collaborators.
Knowledge, Skills, and Abilities:
- Strong understanding of data lake, data warehouse, and lakehouse architecture patterns on AWS.
- Client-focused approach with strong interpersonal, documentation, communication, and technical leadership skills.
- Ability to multitask, prioritize, manage production issues, and maintain attention to detail in complex data environments.
- Strong logical and analytical thinking, root-cause analysis, and problem-solving capabilities.
- Proficiency in Python, PySpark, SQL scripting, Shell scripting, and automation for database and data platform operations.
- Expert-level SQL and relational database design including schema design, normalization, dimensional modeling, and query optimization.
- Deep knowledge of PostgreSQL performance tuning, indexing, partitioning, stored procedures/functions, replication, backup, and recovery.
- Strong hands-on experience with PostgreSQL, Redshift, SQL Server, Athena, DynamoDB, RDS, S3, and AWS data services.
- Strong performance tuning experience across PostgreSQL, Redshift, Athena, DynamoDB, SQL Server, and other SQL-based workloads.
- Familiarity with server environments, database connectivity, data movement, security, governance, and compliance standards.
- Experience with validated systems, healthcare data, EMR integrations, clinical trial data, and regulated production environments preferred.
- Project management, Agile delivery, GitHub workflows, Jira tracking, documentation, code review, and leadership experience.
Must Have Skills:
- Expert-level SQL and relational database design experience.
- Advanced SQL Server experience including database development, optimization, stored procedures, SSIS, SSRS, and operational support.
- Strong hands-on PostgreSQL and Redshift experience including administration, tuning, data modeling, backup, recovery, and production operations.
- Experience building AWS data lake, data warehouse, and cloud BI solutions using S3, Redshift, Athena, DynamoDB, RDS, and related services.
- Experience with data architecture, scalable data processing frameworks, ETL patterns, lakehouse design, and production-grade data pipelines.
- Data modeling, database design, source-to-target mapping, data lineage, data dictionary, and technical documentation experience.
- Strong problem-solving, analytical, troubleshooting, production support, and incident management skills.
- Excellent documentation, communication, stakeholder management, and cross-functional collaboration abilities.
- Performance tuning and query optimization across PostgreSQL, Redshift, Athena, SQL Server, and large SQL workloads.
- Leadership, mentoring, code review, database standards, and technical decision-making capabilities.
- Jira, GitHub, Agile methodology, CI/CD awareness, release management, and delivery tracking experience.
- Experience implementing data quality frameworks, pipeline monitoring, alerting, reconciliation, and production support processes.
Good to Have Skills:
- Experience with MySQL, Oracle, Aurora PostgreSQL, RDS, and additional relational or NoSQL database platforms.
- Scripting and automation skills using Python, Shell, SQL automation, or AWS SDKs.
- Validated system experience and regulated SDLC documentation practices.
- Familiarity with Tableau, Power BI, or reporting and BI consumption patterns.
- Familiarity with Databricks, Snowflake, Glue, Lake Formation, Step Functions, Lambda, or modern lakehouse tooling.
- Experience with machine learning, NLP, LLMs, AI-assisted documentation, mapping automation, vector databases, embeddings, or LLM-enabled data quality.
- Healthcare, Electronic Medical Records, claims data, clinical trial data, HIPAA, GDPR, and patient data domain experience.
Working Hours:
- India: 05:30 PM to 02:30 AM IST
- Philippines: 08:00 PM to 05:00 AM PHT