Data, automation and advanced analytics technologies are transforming industrial manufacturers, moving them beyond point process automation toward systemic, highly contextualized, data-driven systems.
Corning is building the foundational digital infrastructure for these company-wide efforts, and is looking for passionate, hard-working, and talented staff-level software engineers who will develop and enhance the data ingestion pipelines that are crucial to these efforts.
The Data Engineer, Advanced Analytics platforms will work with our core platform development team as well as domain experts, application developers, controls engineers and data scientists. Their primary responsibility will be to develop reliable and instrumented data ingestion pipelines that land inbound data from multiple process and operational data stores throughout the company to on-premise and cloud-based data lakes. These pipelines will require data validation and data profiling automation along with version control and CI/CD to ensure ongoing resiliency and maintainability of the inbound data flows supporting our advanced analytics projects.
As a data engineer for our advanced analytics platforms, your main responsibilities will be:
Design, test, deploy and maintain production big-data ingestion pipelines using established frameworks, patterns of practice, agile software development and CI/CD practices, working closely with the Principal Software Engineer – Data Ingestion
Work with cross-organizational data source teams to define data ingestion requirements for structured, unstructured and semi-structured data, pilot their implementation, and ensure that the data source teams accept the resulting landed data as valid
Define and implement automated validation and profiling capabilities needed to ensure reliable data delivery, using agile software development and CI/CD practices
Work with data source teams, domain experts and data scientists to define data cleansing and data enrichment requirements for landed data
Implement data cleansing and enrichment code using established patterns of practice
Work with data source teams, domain experts and data scientists to validate landed, cleansed and enriched data, using agile software development and CI/CD practices, while ensuring that the final datasets are directly usable by them without additional processing effort
Actively participate in code reviews and technical information sharing with your team members and the broader software engineering community at Corning
Stay up to date with industry standards and technological advancements that will improve the quality, productivity and performance of your work
Provide support in a DevOps environment to monitor tokens, jobs and overall system performance
Bachelor's degree in computer science, engineering, mathematics, or a related technical discipline
5-7+ years of experience in big data engineering roles, developing and maintaining ETL and ELT pipelines for data warehousing, on-premise and cloud data lake environments
5-7+ years of demonstrated production programming proficiency in at least one modern JVM language such as Java, Scala or Kotlin, as well as an interpreted programming language such as Python
3+ years of experience developing batch, micro-batch and streaming ingestion pipelines using high-level Apache Spark APIs (PySpark, SparkR, Spark SQL and Scala)
3+ years of production experience using SQL and DDL
2+ years of DevOps experience with AWS platform services, including S3, EC2, Database Migration Service (DMS), RDS, EMR, Redshift, Lambda, DynamoDB, CloudWatch and CloudTrail
Strong, hands-on technical familiarity with Apache Spark architecture, S3, Parquet and Delta Lake architecture, technologies and tools
Expert-level proficiency with both traditional relational and polyglot persistence technologies
Expert-level proficiency with agile software development and continuous integration/continuous deployment (CI/CD) methodologies, along with supporting tools such as Git (GitLab), Jira, Terraform and New Relic
Strong, hands-on familiarity with notebook environments including JupyterHub
Proven success in communicating with users, other technical teams, and senior management to collect requirements, describe data modeling decisions and data engineering strategy
Prior full-stack app development experience (front-end, back-end, microservices)
Familiarity with the following tools and technology practices:
Oracle, Microsoft SQL Server, SSIS, SSRS
Established enterprise ETL and integration tools including Informatica, MuleSoft
Established open-source data integration and DAG tools including NiFi, StreamSets, Airflow
Data sources and integration solutions commonly used in manufacturing enterprises, including PI Integrator, Maximo
Reporting and analysis tools including Power BI, Tableau, SAS JMP
What sets us apart? Corning’s unwavering commitment to Diversity. Diversity is integral to Corning’s belief in the fundamental dignity of the individual – one of Corning’s seven Values. We are committed to providing an environment where all employees can thrive. This begins with an understanding that our global workforce consists of a rich mixture of diverse people. This diversity will continue to be a source of our strength as well as a competitive advantage.
If you have a passionate belief in the power of innovation to change the world, and if you are up to the challenge of working for a world-class organization that makes real, profitable advanced materials, then visit Corning’s website at www.corning.com.
This position does not support immigration sponsorship.