Data Engineer

Innodata Inc. (Taguig, Philippines) 2 days ago

Full-time Analytics AI Data Ingestion Vector Data Engineering

We are looking for a Data Engineer to join our AI/LLM Delivery Unit and build the scalable data pipelines and infrastructure that power AI and machine learning solutions.

This role plays a critical part in enabling LLM-based applications, data workflows, and AI model lifecycle management. The ideal candidate has strong experience in data engineering, cloud platforms, and pipeline automation, with exposure to AI/ML environments.

Key Responsibilities

1. Data Pipeline Development

Design, build, and maintain scalable data pipelines (ETL/ELT) for structured and unstructured data
Ensure reliable ingestion, transformation, and delivery of high-quality datasets
Optimize pipelines for performance, cost, and scalability

2. AI / LLM Data Infrastructure

Support data workflows for AI/ML and LLM systems, including training, fine-tuning, and evaluation datasets
Build data pipelines for:

o Text corpora and unstructured datasets

o Embeddings and vector databases

o Retrieval-Augmented Generation (RAG) systems

Enable efficient data access for Data Scientists and ML Engineers

3. Data Processing & Automation

Automate data extraction, transformation, and validation processes
Implement batch and real-time data processing solutions
Improve operational efficiency through data automation (aligned with process optimization use cases)

4. Data Quality & Governance

Implement data validation, monitoring, and quality checks
Ensure data integrity, consistency, and compliance with security standards
Maintain data documentation and lineage tracking

5. Collaboration & Delivery

Work closely with Data Scientists, ML Engineers, and Delivery teams
Translate business and AI requirements into scalable data architectures
Support the end-to-end AI delivery lifecycle, from data ingestion to deployment

Qualifications

Education

Bachelor’s degree in Computer Science, Data Engineering, Information Systems, or a related field
An advanced degree is a plus

Experience

3–7+ years of experience in data engineering or related roles
Experience supporting AI/ML or analytics platforms
Exposure to AI/LLM-related data pipelines is a strong advantage

Technical Skills

Core Skills

· Strong programming skills in Python and/or Scala

· Expertise in SQL and database design

· Experience building ETL pipelines with an orchestration tool (Airflow, Dagster, or similar)

Data & Platform Skills

· Experience with:

o Data warehouses (Snowflake, BigQuery, Redshift)

o Distributed data processing (Spark)

o APIs and data integration

· Familiarity with streaming tools (Kafka, Kinesis) is a plus

AI/LLM-Related Skills

· Experience working with unstructured data pipelines (text, NLP datasets)

· Familiarity with:

o Vector databases (Pinecone, FAISS, Weaviate)

o Embeddings pipelines

o RAG architectures

Cloud & DevOps

· Hands-on experience with AWS, Azure, or GCP

· Knowledge of:

o Docker / containerization

o CI/CD pipelines

o Infrastructure-as-Code (Terraform is a plus)

---

Core Competencies

· Strong data modeling and system design skills

· Attention to detail and data quality

· Problem-solving and analytical thinking

· Effective communication with both technical and non-technical stakeholders

· Ability to work in fast-paced, delivery-oriented environments

---

Nice-to-Have

· Experience in AI/LLM or Generative AI projects

· Familiarity with annotation pipelines or data labeling workflows

· Exposure to MLOps frameworks

· Experience in high-scale or enterprise data environments

---

What Success Looks Like

· Builds robust, scalable data pipelines supporting AI/LLM projects

· Improves efficiency and reliability of data workflows

· Enables faster model development through high-quality datasets

· Supports successful delivery of client-facing AI solutions


