Robust Multi-Human Pose Estimation
Description
I’m looking to develop a research-oriented model that can accurately recover every joint in crowded, static images where people overlap, hide behind objects, or appear only partially visible. The emphasis is on distribution-robust performance under severe occlusion. I am mainly interested in large models, e.g. Mamba or hybrid Transformer architectures for human pose estimation.
The approach I have in mind mixes large Transformer backbones with geometric priors, such as part-affinity refinements, kinematic graph constraints, or similar. Frameworks such as PyTorch, TensorFlow, Detectron2, or MMPose are all fine, as long as the pipeline stays fully reproducible on a single GPU.
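To make the "kinematic graph constraints" idea concrete, here is a minimal sketch of one such prior: a bone-length consistency penalty over a (hypothetical, simplified) skeleton graph. The edge list, reference lengths, and function name are illustrative assumptions, not part of any existing framework; a real implementation would use the dataset's skeleton definition and a differentiable tensor version of this loss.

```python
import numpy as np

# Hypothetical toy skeleton: edges as (parent, child) joint indices.
SKELETON_EDGES = [(0, 1), (1, 2), (2, 3)]

def bone_length_penalty(pred_joints, ref_lengths, eps=1e-8):
    """Penalize deviation of predicted bone lengths from reference
    lengths (e.g. per-dataset means) -- a simple kinematic-graph prior.

    pred_joints: (J, 2) array of 2D joint coordinates.
    ref_lengths: sequence of reference lengths, one per skeleton edge.
    """
    penalties = []
    for (parent, child), ref in zip(SKELETON_EDGES, ref_lengths):
        length = np.linalg.norm(pred_joints[child] - pred_joints[parent])
        # Relative squared error so long and short bones weigh equally.
        penalties.append((length - ref) ** 2 / (ref ** 2 + eps))
    return float(np.mean(penalties))
```

Added to the keypoint loss with a small weight, a term like this discourages anatomically implausible poses when joints are occluded and the appearance evidence is weak.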
Deliverables
- An occlusion-aware pose estimation model with source code and training scripts
- Pre-trained weights plus a clear README on how to reproduce results end-to-end
- A concise tech report detailing architecture choices, training schedule, metrics, and ablations
- A full article draft with structured results and comparisons
Acceptance criteria
- At least the agreed acceptance % AP boost over my current baseline on a hidden test split containing 30 % occluded joints; the CrowdPose dataset can be used
- Inference speed ≥ 8 FPS on an RTX 3090 (batch size = 1)
- 100 % open-source dependencies, no paid libraries or closed models
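For the throughput criterion, a simple wall-clock harness is enough to verify the ≥ 8 FPS target at batch size 1. This is a sketch under assumptions: `infer_fn` stands in for whatever model wrapper the delivered pipeline exposes, and warmup iterations are included to exclude one-time setup (and, on GPU, kernel compilation) from the measurement.

```python
import time

def measure_fps(infer_fn, inputs, warmup=5, runs=50):
    """Rough frames-per-second measurement at batch size 1.

    infer_fn: callable running one forward pass (hypothetical wrapper).
    inputs: list of sample inputs, cycled through across iterations.
    """
    # Warmup passes are excluded from timing.
    for i in range(warmup):
        infer_fn(inputs[i % len(inputs)])
    start = time.perf_counter()
    for i in range(runs):
        infer_fn(inputs[i % len(inputs)])
    elapsed = time.perf_counter() - start
    return runs / elapsed
```

On a real GPU pipeline the harness would additionally synchronize the device (e.g. `torch.cuda.synchronize()`) before reading the clock, since CUDA launches are asynchronous.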
Any open-source datasets, such as CrowdPose, COCO, etc., can be used.
Budget: USD 250–750
Skills: Machine Learning (ML), Software Development, Image Processing, Open Source, Research and Development, Computer Vision, Deep Learning, Technical Documentation