Research VMs – Bitfusion

Draft Article

This article is still a draft. Its contents are subject to change, and following them may produce inconsistent results.

This article contains instructions for using Bitfusion on research virtual machines (VMs) to execute GPU compute jobs such as AI and machine learning workloads.

What is Bitfusion?

Bitfusion is a VMware product that allows remote machines to access GPUs for computing over a low-latency network. Many researchers can therefore share the same GPUs, which makes more efficient use of the hardware. More information about Bitfusion is available here: https://docs.vmware.com/en/VMware-vSphere-Bitfusion/index.html

Bitfusion vs. VACC

Bitfusion is intended for prototyping code that will eventually run on the VACC, or for running smaller jobs that do not need the scale offered by the VACC. Bitfusion jobs should not run for more than one day.

Using Bitfusion

Bitfusion is available on VMs that have been configured with access to a GPU. If you have a virtual machine that does not currently have GPU access, you can request it here: https://www.uvm.edu/it/research-computing/contact

To run a command or script with Bitfusion, you will need to use the following syntax:

bitfusion run -n 1 -p quota -- command_or_script_here

The switches used in the above command are:
-n : The number of GPUs to use, currently capped at one.
-p : The GPU quota to request, up to the quota you have been assigned (0.2, 0.4, 0.5, etc.)

An example command is:

bitfusion run -n 1 -p 0.4 -- /users/s/o/someuser/myScript.py

This runs the script ‘myScript.py’ using one GPU with a quota of 0.4.
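
For scripted or repeated runs, the same command line can be assembled programmatically. The following is a minimal Python sketch, not an official tool: the helper names build_bitfusion_command and run_with_bitfusion are hypothetical, and the one-day timeout is only one possible way to respect the run-time limit mentioned earlier. It uses only the -n and -p switches described in this article.

```python
import shlex
import subprocess

# Assumption: a client-side timeout is an acceptable way to honour the
# "no more than one day" guidance; the article does not prescribe this.
ONE_DAY_SECONDS = 24 * 60 * 60


def build_bitfusion_command(quota, command, num_gpus=1):
    """Return the argument list for `bitfusion run -n N -p QUOTA -- command`.

    `command` is a list such as ["/path/to/script.py", "--some-arg"].
    """
    if num_gpus != 1:
        raise ValueError("-n is currently capped at one GPU")
    if not 0 < quota <= 1:
        raise ValueError("quota must be a fraction between 0 and 1")
    return ["bitfusion", "run", "-n", str(num_gpus), "-p", str(quota), "--", *command]


def run_with_bitfusion(quota, command, timeout=ONE_DAY_SECONDS):
    """Run the command through Bitfusion and stop waiting after `timeout` seconds."""
    argv = build_bitfusion_command(quota, command)
    # check=True raises CalledProcessError if the job exits with a non-zero status.
    return subprocess.run(argv, check=True, timeout=timeout)


if __name__ == "__main__":
    argv = build_bitfusion_command(0.4, ["/users/s/o/someuser/myScript.py"])
    # Print the command instead of running it, so the sketch is safe to try anywhere.
    print(shlex.join(argv))
```

Passing the command as a list avoids shell quoting problems when script paths or arguments contain spaces.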

Quotas

Users who have a Bitfusion-enabled VM will be assigned a quota for GPU usage. This quota sets the amount of GPU memory (VRAM) available when running commands or scripts, expressed as a fraction of one card's memory. Currently, the GPUs available in the cluster are NVIDIA A100 40 GB cards, and the quotas available are:

0.2 – 8 GB
0.3 – 12 GB
0.4 – 16 GB
0.5 – 20 GB
0.6 – 24 GB

When running a Bitfusion command, you can choose any of the above quotas up to your assigned quota. For example, if you have an assigned quota of 0.4, you could enter a -p value of 0.2, 0.3, or 0.4.
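
Each entry in the quota table is the fraction multiplied by the card's 40 GB of VRAM (for example, 0.4 × 40 GB = 16 GB). A small Python sketch of that arithmetic, and of the rule that -p may not exceed the assigned quota, might look like the following. The function names are illustrative only and are not part of Bitfusion.

```python
# Values taken from the quota table in this article.
CARD_VRAM_GB = 40
AVAILABLE_QUOTAS = (0.2, 0.3, 0.4, 0.5, 0.6)


def quota_to_vram_gb(quota, card_vram_gb=CARD_VRAM_GB):
    """Convert a quota fraction to gigabytes of VRAM on one card."""
    # Rounding hides floating-point noise in products such as 0.3 * 40.
    return round(quota * card_vram_gb, 1)


def usable_quotas(assigned_quota):
    """List the -p values a user with `assigned_quota` may request."""
    # The small tolerance guards against floating-point comparison surprises.
    return [q for q in AVAILABLE_QUOTAS if q <= assigned_quota + 1e-9]


if __name__ == "__main__":
    for q in usable_quotas(0.4):
        print(f"-p {q} gives {quota_to_vram_gb(q)} GB of VRAM")
```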

Available Toolkit/Library Versions

The following toolkit and library versions are available with Bitfusion-enabled VMs.

CUDA – 11.2
cuDNN – 8.1.1
PyTorch – 1.8
TensorFlow – 2.4

If you have a library you would like to use that isn’t listed, please contact us.
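
To confirm which framework versions a particular Python environment actually provides, a quick check such as the sketch below may help. It assumes the frameworks were installed under the distribution names "torch" and "tensorflow", which may not hold for every VM (for example, a "tensorflow-gpu" package would not be found). It does not check CUDA or cuDNN, which are not Python packages. The helper names are hypothetical.

```python
from importlib import metadata

# Expected version prefixes, taken from the list in this article.
# Assumption: these are the installed distribution names.
EXPECTED = {"torch": "1.8", "tensorflow": "2.4"}


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def check_versions(expected=EXPECTED):
    """Map each package name to (installed version, matches expected prefix)."""
    report = {}
    for name, prefix in expected.items():
        found = installed_version(name)
        matches = found is not None and (
            found == prefix or found.startswith(prefix + ".")
        )
        report[name] = (found, matches)
    return report


if __name__ == "__main__":
    for name, (found, matches) in check_versions().items():
        status = "OK" if matches else "differs or missing"
        print(f"{name}: installed={found} expected={EXPECTED[name]}.x ({status})")
```

Reading package metadata avoids importing the frameworks themselves, so the check runs quickly and does not need a GPU.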

Updated on April 25, 2022
