AIOZ AI / Nov 5th / 9 min read

Guide3D A Bi-planar X-ray Dataset for 3D Shape Reconstruction (Part 4)

Effectiveness Analysis and Ablation Studies

We evaluate our proposed dataset, Guide3D, through a structured experimental analysis, as follows: i) initially, we assess the dataset’s validity, focusing on reprojection errors and their distribution across the dataset to understand its accuracy; ii) we then explore the applicability of Guide3D in a 3D reconstruction task ; and iii) finally, we benchmark several segmentation algorithms to assess their performance on Guide3D, providing insights into the dataset’s utility.

Dataset Validation

Our analysis revealed a non-uniform distribution of reprojection errors across the dataset, with the highest variability and errors concentrated at the proximal end of the guidewire reconstructions. Figure 1 shows the reprojection error patterns for both Camera A and Camera B. For Camera A, mean errors increase from approximately 6 px to a peak of 20 px, with standard deviations rising from 5 px to 11 px, indicating growing inaccuracies and variability over time. Significant fluctuations around indices 25 to 27 highlight periods of particularly high error. For Camera B, mean errors exhibit an initial peak of 9 px at index 1, followed by fluctuations that decrease towards the end. The standard deviations for Camera B start high at 11 px and decrease over time, reflecting a pattern of high initial variability that stabilizes later. These patterns are consistent with the inherent flexibility of the guidewire, which can form complex shapes such as loops.

Figure 1. Guidewire Reconstruction Error Analysis: (Left) Illustrates the distribution of reprojection errors, noting higher variability and peak errors in the mid-sections and reduced errors at the extremities. (Right) Presents the results of reconstruction validation.

Furthermore, we conducted a validation procedure using CathSim, incorporating the aortic arch model described in next subsection and a guidewire of similar diameter and properties. For sampling, we employed the soft actor-critic (SAC) algorithm with segmented guidewires and kinematic data, producing realistic validation samples. Evaluation metrics included maximum Euclidean distance (MaxED) at 2.880 ± 0.640 mm, mean error in tip tracking (METE) at 1.527 ± 0.877 mm, and mean error related to the robot’s shape (MERS) at 0.001 ± 0.000. These results demonstrate the method’s precision.

Guidewire Prediction Results

We now demonstrate the capability of the introduced network and highlight the importance of the proposed dataset. We examine the network prediction in the following manner: 1) we first conduct an analysis between the predicted and reconstructed curve by employing piecewise metrics, and 2) we showcase the reprojection error.

Shape Prediction Errors: Table 1 presents the comparison of different metrics for shape prediction accuracy. We quantify the shape differences using the following metrics: 1) Maximum Euclidean Distance (MaxED), 2) Mean Error in Tip Tracking (METE), and 3) Mean Error in Robot Shape (MERS). For all the metrics, the shape of the guidewire, represented as a 3D curve $\mathbf{C}(u)$ , is sampled at equidistant $\Delta u$ intervals along the arclength parameter $u$ . Therefore, the metrics represent the pointwise discrepancies between the two shapes along the curve’s arclength.

The results indicate that the spherical representation consistently outperforms the Cartesian representation across all metrics. Specifically, the Maximum Euclidean Distance (MaxED) shows a lower error in the spherical representation (6.88 ± 5.23 mm) compared to the Cartesian representation (10.00 ± 4.64 mm). Similarly, the Mean Error in Tip Tracking (METE) is significantly lower in the spherical representation (3.28 ± 2.59 mm) than in the Cartesian representation (6.93 ± 3.94 mm). For the Mean Error in Robot Shape (MERS), the spherical representation also demonstrates a reduced error (4.54 ± 3.67 mm) compared to the Cartesian representation (5.33 ± 2.73 mm). Lastly, the Fréchet distance shows a smaller error for the spherical representation (6.70 ± 5.16 mm) compared to the Cartesian representation (8.95 ± 4.37 mm). These results highlight the advantage of using the spherical representation for more accurate shape prediction.

Table 1 Shape Comparison (mm).

Shape Comparison Visualization: Figure 2a showcases two 3D plots from different angles, comparing the ground truth guidewire shape to the predicted shape by the network. The network demonstrates its capability to accurately predict the guidewire shape, even in the presence of a loop and self-obstruction in the image. The predicted shape aligns closely with the actual configuration of the guidewire. Notably, the proximal end manifests a more substantial error relative to the nominal error seen at the distal end. Discrepancies from the authentic guidewire shape span from a mere 2 mm at the distal end to a noticeable 5 mm at the proximal end. Impressively, the network evidences its capability to accurately predict the guidewire’s shape using only consecutive singular plane images. Subsequently, the 3D points are reprojected onto the original images, as illustrated in Figure 2b.

Figure 2. The figure illustrates the reconstruction similarity of the guidewire when reprojected onto the images. It demonstrates the network’s capability to accurately predict the guidewire shape, even in the presence of noticeable angles, highlighting the robustness of the prediction model.

Segmentation Results

We demonstrate Guide3D’s potential to advance guidewire segmentation research by evaluating the performance of three state-of-the-art network architectures: UNet (learning rate: $1 \times 10^{-5}$ , 135 epochs), TransUnet (integrating ResNet50 and Vision Transformer (ViT-B-16), learning rate: 0.01, 199 epochs), and SwinUnet (Swin Transformer architecture, learning rate: 0.01, 299 epochs). Performance metrics included the Dice coefficient (DiceM), mean Intersection over Union (mIoU), and Jaccard index, detailed in Table 2. The results indicate that UNet achieved a DiceM of 92.25, mIoU of 36.60, and Jaccard index of 86.57. TransUnet outperformed with a DiceM of 95.06, mIoU of 41.20, and Jaccard index of 91.10. SwinUnet recorded a DiceM of 93.73, mIoU of 38.58, and Jaccard index of 88.55. These findings benchmark the dataset’s performance and suggest potential for future enhancements. Despite these promising results, the presence of loops and occlusions within the guidewire indicates that polyline prediction could significantly improve task utility.

Table 2 Segmentation Results.

Discussion and Conclussion

This paper introduces a new dataset, Guide3D, for segmentation and 3D reconstruction of flexible, curved endovascular tools. Extensive experiments demonstrate the dataset’s value; yet several limitations must be acknowledged. Firstly, our dataset lacks clinical real human data due to the complexity and regulatory challenges of acquiring such data. Our standardized platform, however, aims to enable further research, providing a stepping stone towards clinical practice.

Additionally, the dataset primarily focuses on synthetic and experimental scenarios, which may not fully capture the variability and unpredictability of real-world clinical environments. While this controlled setting aids initial algorithm development and benchmarking, further validation with clinical data is necessary to ensure the robustness and generalizability of the proposed methods.

Moreover, the guidewire’s flexibility and the presence of loops and occlusions present significant challenges for segmentation and reconstruction tasks. Our dataset includes these complexities to push the boundaries of current methodologies, but future work should explore more advanced techniques.

Our dataset accommodates both video and image-based approaches, providing a versatile resource to facilitate the translation of these technologies into clinical settings. Our objective is to bridge the disparity between research developments and clinical application by establishing a standardized framework for evaluating the efficacy of various methodologies. Our code and dataset will be made publicly available.