NVVL (NVIDIA Video Loader) is a library to load random sequences of video frames from compressed video files to facilitate machine learning training. It uses FFmpeg's libraries to parse and read the compressed packets from video files and the video decoding hardware available on NVIDIA GPUs to off-load and accelerate the decoding of those packets, providing a ready-for-training tensor in GPU device memory. NVVL can additionally perform data augmentation while loading the frames. Frames can be scaled, cropped, and flipped horizontally using the GPUs dedicated texture mapping units. Output can be in RGB or YCbCr color space, normalized to [0, 1] or [0, 255], and in float, half, or uint8 tensors. Using compressed video files instead of individual frame image files significantly reduces the demands on the storage and I/O systems during training. Storing video datasets as video files consumes an order of magnitude less disk space, allowing for larger datasets to both fit in system RAM as well as local SSDs for fast access. During loading fewer bytes must be read from disk. Fitting on smaller, faster storage and reading fewer bytes at load time allievates the bottleneck of retrieving data from disks, which will only get worse as GPUs get faster. For the dataset used in our example project, H.264 compressed .mp4 files were nearly 40x smaller than storing frames as .png files.