MediaAgility Compares GCP Against AWS for Running ML Training on GPUs for a Medical Imaging Company

To make expensive medical imaging available and accessible on the move, the Customer produces portable ultrasound imaging devices. They believe efficiently engineered tools for ultrasound imaging coupled with ML help medical practitioners in diagnosis and quick responses to medical cases.

The Customer used ultrasound images to train their hundreds of ML/Deep Learning models on Amazon Web Services (AWS). They wanted to improve the training of their Deep Learning models’ on 3 parameters – speed of training, the price per run, and ease of use. Hence, they decided to evaluate Google Cloud Platform against AWS through Deep Learning model training experiments.

They chose MediaAgility as their digital consulting partner to carry out the extensive evaluation process aimed at benchmarking GCP against AWS for Deep Learning model trainings.

The Customer provided a representative model out of their model gallery. It was a multi-layered convolutional neural network built on Tensorflow. Also, training data and containerised training code was shared with the MediaAgility team. The team set up separate projects on GCP and AWS environments, ran the training experiments on 3 kinds of GPUs – Tesla K80s, P4, and V100, and also experimented with different machine types. 

The training routines ran in multiple iterations. Training time and cost performance on GCP were improved after each iteration by evaluating the impact of preemptible VMs, different machine types, different hardware accelerators (for example, GPUs/TPUs). A report recorded the time and cost performance benchmarks of moving the Deep Learning workloads to GCP.

The experiment results against the three parameters of ease of use, cost, and speed were as follows –

Ease of Use Improvements:

  • Ease of experiment management with one-click deployment of Kubeflow Pipelines.
  • Since, requesting a preemptible VM in GCP didn’t have to go through the bidding process as in AWS, the Customer could easily deploy preemptible VMs at a more predictable pricing.

Cost Advantage:

  • GCP’s resource-based pricing for Custom Machine Types allowed selecting only as much vCPU and RAM as required in contrast to AWS’s Fixed Machine Types.
  • Overall, the experiments saw, on average, 30% cost savings on similar hardware configurations moved to GCP. There was a list price difference between AWS and GCP, and GCP also applied Custom Machine Types, sustained usage discount, and committed usage discounts. 
  • As much as 75% savings was observed with preemptible VMs on GCP, as compared to AWS.

Speed Advantage:

  • In many cases, the time taken for each experiment was reduced at an average of 18% on GCP as compared to AWS. 

At the end of the engagement, MediaAgility successfully trained the Deep Learning model on GCP with improved speed and performance. The project on GCP showcased Google’s capabilities in automatic scaling of ML/Deep Learning training. 

Get in touch to get a free assessment of Deep Learning training deployment to estimate how much you can save on your cloud costs using GCP services. Contact us at [email protected]