P0 was defined as a tumor invasion depth restricted to the M or SM1, and P1 was defined as an invasion depth deeper than the SM1. If gastric cancer was diagnosed as P0 before treatment, it was deemed suitable for endoscopic resection if the other clinical parameters met the guidelines. By contrast, a P1 lesion was not treated endoscopically regardless of the other parameters.
    Training algorithm
    Our artificial intelligence–based CNN-CAD system was developed through transfer learning leveraging a state-of-the-art CNN architecture, ResNet50, which was pretrained on the ImageNet database containing over 14 million images.
    Applying a CNN-CAD system to determine invasion depth for endoscopic resection Zhu et al
    Figure 1. Representative endoscopic images of P0 and P1 from the development dataset. A, P0 was defined as a tumor invasion depth restricted to the M or SM1. B, P1 was defined as a tumor invasion depth deeper than the SM1. M, Mucosa; SM1, submucosa.
The architecture of our CNN-CAD system is shown in Figure 2. A pretrained ResNet50 extracted 2048 features from each input image. All weights in ResNet50 were fixed during training to prevent overfitting, because our training dataset was relatively small compared with the total number of model weights. The extracted features were then used to train a 2-layer fully connected neural network for final classification as P0 or P1. The fully connected network was optimized by an Adam optimizer with a learning rate of .01. L2 regularization was applied to both layers, with a factor of .001 in the first layer and .0003 in the second layer. Hyperparameters were fine-tuned based on validation results. The softmax function was used to output the classification results of the fully connected network as 2 continuous values ranging from 0 to 1, indicating the probability of the input being classified as P0 or P1.
Each input was a 299 × 299 image with 3 dimensions corresponding to the red (R), green (G), and blue (B) colors, digitized to a 299 × 299 × 3 tensor. The input was gradually transformed by ResNet50 through 5 stages, with a total of 50 rectified linear unit activations providing the nonlinearity that is essential for CNNs. Between stages 2 and 5, a residual block structure was introduced to overcome vanishing and exploding gradients, which are notorious problems for deep CNN architectures. After 5 stages, the input was transformed to a 10 × 10 × 2048 tensor. Both height and width were greatly reduced as the number of dimensions increased from 3 to 2048, indicating that much more information was extracted beyond the original RGB pixel information. The tensor was then flattened to a 2048-element vector based on the 2048 features extracted by ResNet50.
The features extracted by ResNet50 in each stage are shown in Figure 3. In stage 1 the image was converted to a 150 × 150 × 64 tensor, and each dimension was visualized as a gray-scale block, with darker areas indicating higher values as the image was processed in deeper layers. As the stages went deeper, the number of dimensions became larger: in stage 5 there were 2048 blocks of size 10 × 10. Each dimension placed specific attention on the 2-dimensional input, allowing a unique visualization.
During training we randomly divided the development dataset into a training dataset (n = 632 images) and a validation dataset (n = 158 images) in a proportion of 8:2 to refine the model during training. To increase the size of the development dataset, we performed data augmentation (Fig. 4), in which each image was rotated and flipped to expand the amount of data 8-fold. Data augmentation is guided by expert knowledge,13 has been shown to be an effective method of improving the performance of image classification,14 and has been used in visual recognition studies for human diseases.15 Data augmentation was strictly performed only on the development dataset to improve the system’s classification performance; testing data were not augmented. After augmentation, the training and validation datasets increased to 5056 and 1264 images, respectively.
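The 8-fold expansion from rotations and flips corresponds to the 8 dihedral variants of each image (4 rotations, each with and without a mirror flip). A minimal NumPy sketch, with an illustrative function name not taken from the paper:

```python
import numpy as np

def augment_8fold(img: np.ndarray) -> list:
    """Return the 8 dihedral variants of an image:
    rotations of 0/90/180/270 degrees, each with and without a flip."""
    variants = []
    for k in range(4):
        rotated = np.rot90(img, k)           # rotate by k * 90 degrees
        variants.append(rotated)
        variants.append(np.fliplr(rotated))  # mirrored copy of each rotation
    return variants

# Matches the counts in the text: 632 x 8 = 5056 and 158 x 8 = 1264
```

Because the 8 variants carry the same label as the source image, this expands the dataset without any new annotation effort.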