Software Engineer - Data Engineering
Clarivate
We are hiring a Software Engineer with strong AI, LLM, and prompt engineering capabilities to support the development of advanced analytics and healthcare intelligence products within the DIA (Disease Intelligence & Analytics) team. This is a hands-on engineering role where you will work across Python, LLM frameworks, prompt optimization, PySpark, AWS, and ETL systems to build AI-ready data and insight-generation platforms. You will collaborate with a 10–15 member cross-functional product team (engineers, data scientists, analysts, QA, and product leaders) to build production-grade solutions that transform healthcare data into actionable intelligence.
About You - Experience, Skills & Qualifications
0-2 years of experience with Python and/or PySpark
Hands-on experience with LLMs (OpenAI, Claude, Llama, etc.) and prompt engineering techniques
Strong ETL development experience with automated data workflows
Experience integrating AI/LLM-based components into production systems
Knowledge of RAG pipelines, embeddings, vector stores, fine-tuning, and evaluation frameworks
Experience designing scalable data pipelines using SQL Server, Postgres, or similar databases
Ability to optimize data systems for performance, reliability, and high-volume workloads
Exposure to AWS cloud services (S3, Lambda, RDS, ECS; Bedrock preferred)
Strong analytical, problem-solving, and communication skills
Passion for AI-driven engineering, automation, and innovation
What will you be doing in this role?
Build AI-enabled data pipelines optimized for analytics, ML insights, and automated decision support
Design and refine prompts, system instructions, and AI workflows to improve accuracy and output quality
Implement RAG workflows, model orchestration, and intelligent data retrieval systems
Develop cloud-based distributed data systems using AWS (S3, RDS, Lambda, IAM, Step Functions)
Build microservices and utilities that leverage LLMs for summarization, classification, pricing logic, and insight generation
Enable AI-ready data layers to support predictive and prescriptive analytics
Implement monitoring for pipeline performance, model drift, and AI reliability
Ensure data governance, lineage, security, and compliance across the product ecosystem
Products you will be developing
DIA develops AI-augmented Disease Intelligence platforms that turn raw healthcare data into insights. You will contribute to systems that enable automated reporting, prediction modeling, pricing analytics, strategic recommendations, and real-time intelligence. Your code will not just run workflows—it will make systems think, reason, and provide insights at scale.
About the Team
The DIA (Disease Intelligence & Analytics) team consists of 10–15 highly collaborative members, including developers, QA engineers, analysts, and product leaders. We build multiple healthcare products designed for advanced analytics, automated insights, and intelligent reporting.
Hours of Work
Full-time permanent role (45 hours/week) with a hybrid working model and standard business hours.
At Clarivate, we are committed to providing equal employment opportunities for all qualified persons with respect to hiring, compensation, promotion, training, and other terms, conditions, and privileges of employment. We comply with applicable laws and regulations governing non-discrimination in all locations.