Job Summary
GrabOn is looking for a highly skilled Python Developer with deep expertise in large-scale web scraping, browser automation, and distributed systems. The ideal candidate will design and maintain high-resilience, fault-tolerant scraping infrastructure capable of operating reliably against modern anti-bot defenses.
You will work on distributed, cloud-native scraping pipelines, collaborate on agentic and autonomous systems, and continuously optimize for scale, cost, and reliability.
Key Responsibilities
Scraping & Automation
Design, build, and maintain high-resilience web scraping systems at scale
Implement advanced Selenium / Playwright automation (headless, stealth, browser fingerprint control)
Handle anti-bot mechanisms, including:
IP rotation & proxy orchestration
CAPTCHA detection & mitigation
Browser fingerprinting detection and evasion strategies
Distributed Systems & Cloud
Architect and maintain distributed scraping pipelines using:
AWS Lambda
SQS
EC2 (auto-scaling worker fleets)
Build retry-safe, idempotent, and fault-tolerant pipelines
Ensure graceful handling of failures, throttling, and partial outages
Performance, Monitoring & Optimization
Monitor scraping throughput, failure rates, and infrastructure health
Debug production issues across distributed workers
Optimize AWS cost, latency, and system throughput
Agentic & Autonomous Systems
Collaborate on agent-based architectures for:
Autonomous task execution
Decision-making workflows
Self-healing data pipelines
Engineering Practices
Write production-grade Python code (clean, testable, maintainable)
Work comfortably in Linux server environments
Follow best practices for concurrency, rate-limiting, and queue-based systems
Mandatory Requirements (Non-Negotiable)
Strong Python expertise (production systems, not notebooks or scripts)
Deep hands-on experience in web scraping at scale
Expert-level Selenium and/or Playwright knowledge
Proven experience with AWS Lambda, SQS, and EC2
Strong understanding of:
Concurrency & parallelism
Queues and distributed workflows
Retries, backoff strategies, and rate limits
Comfort with managing and debugging Linux servers and cloud infrastructure