Key Responsibilities:
Model Development & Fine-Tuning:
- Fine-tune and adapt large language models and vision-language models for data extraction from unstructured and semi-structured sources.
- Orchestrate fine-tuning workflows using tools such as Google Vertex AI, OpenAI fine-tuning APIs, and Hugging Face.
- Automate model lifecycle management, including training triggers, artifact versioning, promotion between environments, and rollback strategies.
- Implement CI/CD pipelines for LLMs, including automated testing, evaluation gates, and safe production releases.
Collaboration with AI Engineers:
- Work closely with AI Engineers to turn their prompting strategies and fine-tuning approaches into repeatable, scalable production workflows.
- Partner on prompt and model versioning strategies to ensure reproducibility and auditability.
- Translate experimental wins into robust, production-ready systems.
Evaluation & Data Quality:
- Design and implement evaluation frameworks to measure model performance, reliability, and downstream impact.
- Build regression testing pipelines to detect accuracy drops as data or models change.
- Create and maintain live dashboards tracking model accuracy, drift, latency, and cost.
- Establish alerting and quality thresholds to proactively catch performance degradation.
Knowledge Graph & Data Mapping:
- Map extracted entities and relationships into graph-based knowledge representations.
- Collaborate on schema design and entity resolution strategies to support scalable knowledge bases.
ML Ops & Production Systems:
- Build and maintain ML Ops pipelines, including model deployment, monitoring, versioning, and retraining.
- Maintain full lineage across datasets, prompts, model versions, and deployments.
- Support auditability and reproducibility requirements critical to financial workflows.
Cross-Functional Collaboration:
- Work closely with product managers, researchers, and engineers to translate business and domain requirements into effective AI solutions.
- Contribute to architectural discussions and technical decision-making.
Qualifications:
Experience:
- 5–8+ years of experience in ML Ops, platform engineering, or applied machine learning roles.
- Prior hands-on experience in ML Ops is required, including deploying, monitoring, and maintaining ML models in production.
- Prior experience working with LLMs via APIs (e.g., OpenAI or Hugging Face).
Technical Skills:
- Strong proficiency in Python and modern LLM frameworks and platforms (e.g., LangGraph, PydanticAI, OpenAI API, Vertex AI).
- Hands-on experience fine-tuning LLMs and/or vision models in production settings.
- Practical experience with ML Ops, including deployment and monitoring of models.
- Solid understanding of model evaluation, data quality, and performance trade-offs.
- Experience working with knowledge graphs, graph databases, or entity resolution systems.
- Familiarity with multimodal models, document processing, or OCR pipelines.
- Prior experience in AI research, applied research, or high-growth startups.
Nice-to-Haves:
- Experience with structured output validation and extraction-style LLM tasks.
- Familiarity with RAG systems, prompt versioning, or adapter-based fine-tuning (e.g., LoRA).
- Experience operating ML systems in regulated or high-accuracy domains (finance, legal, healthcare).