Pose Estimation
Overview
Given one or more lidar range images and associated camera images, predict 3D keypoints for pedestrians and cyclists in the scene, up to 25m from the ADV.
You may use Perception Dataset v1.4.2 or v2.0.0 for this Challenge. v2.0.0 contains the same keypoint data as v1.4.2, but in a modular format that enables selective download of the components you need. This Challenge uses the current training, validation, and test set splits.
Leaderboard
Note: the rankings displayed on this leaderboard may not accurately reflect the final rankings for this Challenge.
Submit
To submit your entry to the leaderboard, upload your file in the binary proto format specified by the LidarHumanKeypointsSubmission message. The uploaded file should have the extension ".binproto".
To be eligible to participate in the Challenge, each individual participant and every team member must read and agree to be bound by the WOD Challenges Official Rules.
You can only submit against the Test Set 3 times every 30 days. (Submissions that error out do not count against this total.)
Metrics
We provide a Python library that computes a number of different metrics, which are reported on the results page. Users may want to inspect different subsets of them by selecting a group of keypoints and thresholds (for the OKS and PCK metrics). The primary metric for ranking submissions is the Pose Estimation Metric (PEM) computed over all keypoint types, which is a sum of the Mean Per Joint Position Error (MPJPE) over visible matched keypoints and a penalty for unmatched keypoints. We compute the PEM for every pair of predicted and ground truth instances in which at least one predicted keypoint lies within the ground truth box enlarged by 25 cm on each side. The final object assignment is selected using the Hungarian method to minimize:
\begin{equation*} \textbf{PEM}(Y,\hat{Y}) = \frac{\sum_{i\in M}\left\|y_{i} - \hat{y}_{i}\right\|_2 + C|U|}{|M| + |U|} \end{equation*}
Where:
\(M\) is the set of indices of matched keypoints;
\(U\) is the set of indices of unmatched keypoints (ground truth keypoints without matching predicted keypoints, or predicted keypoints for unmatched objects);
\(Y= \left\{y_i\right\}_{i \in M}\) and \(\hat{Y} = \left\{\hat{y}_i\right\}_{i \in M}\) are the sets of ground truth and predicted 3D coordinates of keypoints;
\(C=0.25\) is a constant penalty for an unmatched keypoint.
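The formula above can be illustrated with a minimal sketch for a single, already matched pair of ground truth and predicted objects. This is not the official metrics library: the function name pose_estimation_metric, the dictionary layout (keypoint type name mapped to an (x, y, z) tuple in meters), and the extra_unmatched parameter are all assumptions made for illustration. The sketch also assumes that every ground truth keypoint passed in is visible, and it does not perform the box filtering or the Hungarian assignment.

```python
import math


def pose_estimation_metric(gt, pred, penalty=0.25, extra_unmatched=0):
    """Illustrative PEM for one matched object pair (hypothetical helper).

    gt, pred: dicts mapping a keypoint type name to an (x, y, z) tuple in meters.
    penalty: the constant C applied to each unmatched keypoint.
    extra_unmatched: count of additional unmatched keypoints, for example
        predicted keypoints that belong to unmatched objects (an assumption
        about how such keypoints would be passed in).
    """
    # M: keypoint types present in both the ground truth and the prediction.
    matched = [name for name in gt if name in pred]
    # |U|: ground truth keypoints without a prediction, plus any extras.
    num_unmatched = (len(gt) - len(matched)) + extra_unmatched

    total = len(matched) + num_unmatched
    if total == 0:
        raise ValueError("PEM is undefined when there are no keypoints.")

    # Sum of Euclidean (L2) distances over matched keypoints.
    error_sum = sum(math.dist(gt[name], pred[name]) for name in matched)
    return (error_sum + penalty * num_unmatched) / total


if __name__ == "__main__":
    gt = {
        "nose": (0.0, 0.0, 0.0),
        "left_shoulder": (1.0, 0.0, 0.0),
        "right_shoulder": (0.0, 1.0, 0.0),
    }
    pred = {
        "nose": (0.0, 0.0, 0.1),           # 0.1 m error
        "left_shoulder": (1.0, 0.2, 0.0),  # 0.2 m error
        # "right_shoulder" is missing, so it counts as unmatched.
    }
    print(pose_estimation_metric(gt, pred))
```

In this example two keypoints are matched with errors of 0.1 m and 0.2 m, and one ground truth keypoint is unmatched, so the value is (0.1 + 0.2 + 0.25) / 3, roughly 0.183. A prediction that omits a keypoint is therefore charged 0.25 m for it, the same as predicting it 25 cm away from the ground truth.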
Rules Regarding Awards
Please see the Waymo Open Dataset Challenges 2023 Official Rules.