Benchmark data

The benchmarks below measure how many inputs and outputs a single server instance can handle.

  • Capacity testing: each test finds the maximum number of inputs and outputs the instance can handle.
  • Processing details: each test has 3 variants with a different ratio of inputs to outputs, where every output renders 1, 2, or 4 inputs using the Tiles component.

You can see the most important benchmarks below. Visit our GitHub repository for more results and the exact implementation of the benchmarks.
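
As a rough illustration of how a result can be used for capacity planning, here is a minimal Python sketch. It assumes that a cell such as "48 / 12" reads as inputs / outputs (which matches the 4:1 ratio of that column) and that capacity scales linearly across instances. Neither assumption is stated by the benchmarks, and the helper names `parse_cell` and `instances_needed` are hypothetical.

```python
import math
from typing import NamedTuple


class Capacity(NamedTuple):
    inputs: int
    outputs: int


def parse_cell(cell: str) -> Capacity:
    """Parse a benchmark cell such as "48 / 12" into (inputs, outputs).

    Assumption: the first number is inputs, the second is outputs.
    """
    inputs, outputs = (int(part) for part in cell.split("/"))
    return Capacity(inputs, outputs)


def instances_needed(target_outputs: int, per_instance: Capacity) -> int:
    """Instances required for target_outputs, assuming linear scaling."""
    if per_instance.outputs <= 0:
        raise ValueError("per-instance output capacity must be positive")
    return math.ceil(target_outputs / per_instance.outputs)


# g4dn.xlarge, 720p24fps -> 720p24fps, 4:1 variant
cell = parse_cell("48 / 12")
print(cell.inputs // cell.outputs)  # inputs rendered by each output
print(instances_needed(40, cell))   # instances for 40 outputs at 4:1
```

Rounding up with `math.ceil` keeps the estimate on the safe side. Real deployments would likely want extra headroom on top of the benchmarked maximum.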

Full GPU pipeline

For any meaningful workload, a full GPU pipeline (Vulkan H264 decoding, GPU-based compositing, and Vulkan H264 encoding) is the most cost-efficient way to run Smelter. It avoids CPU/GPU memory transfers and removes CPU encoding as the bottleneck, so a GPU-equipped instance will deliver more throughput per dollar than any CPU-only instance at a comparable price point.

The tables below show the Vulkan results for both g4dn instance types. The results are almost the same because all the heavy work runs on the GPU, which is the same Nvidia T4 on both instances.

g4dn.xlarge

CPU: 4vCPU, Memory: 16GB, GPU: Nvidia T4

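| Input | Output | Ratio 1:1 | Ratio 2:1 | Ratio 4:1 |
| --- | --- | --- | --- | --- |
| 720p24fps | 720p24fps | 15 / 15 | 30 / 15 | 48 / 12 |
| 720p24fps | 1080p30fps | 7 / 7 | 14 / 7 | 24 / 6 |
| 1080p30fps | 1080p30fps | 6 / 6 | 12 / 6 | 24 / 6 |
| 1080p30fps | 1440p30fps | 3 / 3 | 8 / 4 | 12 / 3 |
| 1080p30fps | 2160p30fps | 1 / 1 | 2 / 1 | 4 / 1 |
| 2160p30fps | 2160p30fps | 1 / 1 | 2 / 1 | 4 / 1 |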

g4dn.2xlarge

CPU: 8vCPU, Memory: 32GB, GPU: Nvidia T4

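| Input | Output | Ratio 1:1 | Ratio 2:1 | Ratio 4:1 |
| --- | --- | --- | --- | --- |
| 720p24fps | 720p24fps | 15 / 15 | 30 / 15 | 56 / 14 |
| 720p24fps | 1080p30fps | 6 / 6 | 14 / 7 | 24 / 6 |
| 1080p30fps | 1080p30fps | 6 / 6 | 12 / 6 | 24 / 6 |
| 1080p30fps | 1440p30fps | 3 / 3 | 6 / 3 | 12 / 3 |
| 1080p30fps | 2160p30fps | 1 / 1 | 2 / 1 | 4 / 1 |
| 2160p30fps | 2160p30fps | 1 / 1 | 2 / 1 | 4 / 1 |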
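
To make "almost the same" concrete, the following Python sketch lists the cells that differ between the two g4dn tables. The values are copied from the tables above. The dictionary layout and the `differing_cells` helper are only an illustration and not part of the benchmark suite.

```python
RATIOS = ("1:1", "2:1", "4:1")

# (input, output) -> cells for the 1:1, 2:1 and 4:1 variants
G4DN_XLARGE = {
    ("720p24fps", "720p24fps"): ("15 / 15", "30 / 15", "48 / 12"),
    ("720p24fps", "1080p30fps"): ("7 / 7", "14 / 7", "24 / 6"),
    ("1080p30fps", "1080p30fps"): ("6 / 6", "12 / 6", "24 / 6"),
    ("1080p30fps", "1440p30fps"): ("3 / 3", "8 / 4", "12 / 3"),
    ("1080p30fps", "2160p30fps"): ("1 / 1", "2 / 1", "4 / 1"),
    ("2160p30fps", "2160p30fps"): ("1 / 1", "2 / 1", "4 / 1"),
}
G4DN_2XLARGE = {
    ("720p24fps", "720p24fps"): ("15 / 15", "30 / 15", "56 / 14"),
    ("720p24fps", "1080p30fps"): ("6 / 6", "14 / 7", "24 / 6"),
    ("1080p30fps", "1080p30fps"): ("6 / 6", "12 / 6", "24 / 6"),
    ("1080p30fps", "1440p30fps"): ("3 / 3", "6 / 3", "12 / 3"),
    ("1080p30fps", "2160p30fps"): ("1 / 1", "2 / 1", "4 / 1"),
    ("2160p30fps", "2160p30fps"): ("1 / 1", "2 / 1", "4 / 1"),
}


def differing_cells(a, b):
    """Return (row key, ratio, cell in a, cell in b) for every mismatch."""
    diffs = []
    for key, cells_a in a.items():
        for ratio, cell_a, cell_b in zip(RATIOS, cells_a, b[key]):
            if cell_a != cell_b:
                diffs.append((key, ratio, cell_a, cell_b))
    return diffs


diffs = differing_cells(G4DN_XLARGE, G4DN_2XLARGE)
for (src, dst), ratio, small, large in diffs:
    print(f"{src} -> {dst} at {ratio}: {small} vs {large}")
print(f"{len(diffs)} of {len(G4DN_XLARGE) * len(RATIOS)} cells differ")
```

With the numbers above, 3 of the 18 cells differ, and the larger instance is not ahead in every one of them.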

CPU vs GPU pipeline on g4dn instances

Even when a GPU is available, you can still run all or part of the pipeline on the CPU. The tables below compare the GPU pipeline (Vulkan H264 decode + encode) against CPU-only pipelines (FFmpeg H264 decode + encode at three preset speeds) on the two most common resolutions.

Vulkan H264 on Nvidia cards should produce quality roughly on par with x264’s medium preset. Of the presets benchmarked here, fast is the closest match. The ultrafast and veryfast presets trade noticeable quality for speed.

g4dn.xlarge

CPU: 4vCPU, Memory: 16GB, GPU: Nvidia T4

720p24fps → 720p24fps

| Decoder / Encoder | Ratio 1:1 | Ratio 2:1 | Ratio 4:1 |
| --- | --- | --- | --- |
| Vulkan H264 | 15 / 15 | 30 / 15 | 48 / 12 |
| FFmpeg H264 (ultrafast) | 6 / 6 | 10 / 5 | 12 / 3 |
| FFmpeg H264 (veryfast) | 3 / 3 | 6 / 3 | 8 / 2 |
| FFmpeg H264 (fast) | 2 / 2 | 4 / 2 | 4 / 1 |

1080p30fps → 1080p30fps

| Decoder / Encoder | Ratio 1:1 | Ratio 2:1 | Ratio 4:1 |
| --- | --- | --- | --- |
| Vulkan H264 | 6 / 6 | 12 / 6 | 24 / 6 |
| FFmpeg H264 (ultrafast) | 2 / 2 | 4 / 2 | 4 / 1 |
| FFmpeg H264 (veryfast) | 1 / 1 | 2 / 1 | - |
| FFmpeg H264 (fast) | - | 2 / 1 | - |

g4dn.2xlarge

CPU: 8vCPU, Memory: 32GB, GPU: Nvidia T4

720p24fps → 720p24fps

| Decoder / Encoder | Ratio 1:1 | Ratio 2:1 | Ratio 4:1 |
| --- | --- | --- | --- |
| Vulkan H264 | 15 / 15 | 30 / 15 | 56 / 14 |
| FFmpeg H264 (ultrafast) | 11 / 11 | 18 / 9 | 20 / 5 |
| FFmpeg H264 (veryfast) | 8 / 8 | 14 / 7 | 20 / 5 |
| FFmpeg H264 (fast) | 4 / 4 | 10 / 5 | 12 / 3 |

1080p30fps → 1080p30fps

| Decoder / Encoder | Ratio 1:1 | Ratio 2:1 | Ratio 4:1 |
| --- | --- | --- | --- |
| Vulkan H264 | 6 / 6 | 12 / 6 | 24 / 6 |
| FFmpeg H264 (ultrafast) | 5 / 5 | 8 / 4 | 8 / 2 |
| FFmpeg H264 (veryfast) | 3 / 3 | 6 / 3 | 4 / 1 |
| FFmpeg H264 (fast) | 1 / 1 | 4 / 2 | 4 / 1 |
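
The gap between the two pipelines can be expressed as a simple ratio. The Python sketch below uses the 1:1 results for 720p24fps → 720p24fps from the tables above and divides the Vulkan output count by each FFmpeg preset's output count. The `vulkan_advantage` helper is a hypothetical name, and the ratio says nothing about output quality, which differs between presets.

```python
# Max outputs at 720p24fps -> 720p24fps, 1:1 variant, copied from the tables.
OUTPUTS_720P_1TO1 = {
    "g4dn.xlarge": {"Vulkan H264": 15, "ultrafast": 6, "veryfast": 3, "fast": 2},
    "g4dn.2xlarge": {"Vulkan H264": 15, "ultrafast": 11, "veryfast": 8, "fast": 4},
}


def vulkan_advantage(results):
    """Return Vulkan outputs divided by each FFmpeg preset's outputs."""
    vulkan = results["Vulkan H264"]
    return {
        preset: vulkan / outputs
        for preset, outputs in results.items()
        if preset != "Vulkan H264"
    }


for instance, results in OUTPUTS_720P_1TO1.items():
    for preset, factor in vulkan_advantage(results).items():
        print(f"{instance}: Vulkan vs FFmpeg ({preset}) = {factor:.2f}x")
```

On the smaller instance the factor for the fast preset is 15 / 2 = 7.5, while on the larger one it drops to 15 / 4 = 3.75, since the CPU results go up with the extra vCPUs and the Vulkan result stays at 15.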

CPU-only instances

Running Smelter on a CPU-only instance is significantly less cost-efficient than using a GPU instance. Smelter does compositing through GPU shaders, so on a machine without a GPU those shaders have to be emulated in software on the CPU, which is slow and competes for the same cores that are already busy with H264 decoding and encoding. As a result, even a relatively beefy c5 instance handles only a fraction of the workload a comparable g4dn instance can sustain.

c5.2xlarge

CPU: 8vCPU, Memory: 16GB

720p24fps → 720p24fps

| Decoder / Encoder | Ratio 1:1 | Ratio 2:1 | Ratio 4:1 |
| --- | --- | --- | --- |
| FFmpeg H264 (ultrafast) | 1 / 1 | 2 / 1 | - |
| FFmpeg H264 (veryfast) | 1 / 1 | 2 / 1 | - |
| FFmpeg H264 (fast) | 1 / 1 | 2 / 1 | - |

c5.4xlarge

CPU: 16vCPU, Memory: 32GB

720p24fps → 720p24fps

| Decoder / Encoder | Ratio 1:1 | Ratio 2:1 | Ratio 4:1 |
| --- | --- | --- | --- |
| FFmpeg H264 (ultrafast) | 3 / 3 | 4 / 2 | 4 / 1 |
| FFmpeg H264 (veryfast) | 3 / 3 | 4 / 2 | 4 / 1 |
| FFmpeg H264 (fast) | 2 / 2 | 4 / 2 | 4 / 1 |

1080p30fps → 1080p30fps

| Decoder / Encoder | Ratio 1:1 | Ratio 2:1 | Ratio 4:1 |
| --- | --- | --- | --- |
| FFmpeg H264 (ultrafast) | 1 / 1 | 2 / 1 | - |
| FFmpeg H264 (veryfast) | 1 / 1 | 2 / 1 | - |
| FFmpeg H264 (fast) | 1 / 1 | 2 / 1 | - |
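
One way to reason about the cost-efficiency claim without quoting prices is a break-even ratio. A GPU instance yields more outputs per dollar as long as its hourly price is less than the CPU instance's price multiplied by the ratio of their output counts. The Python sketch below applies this to the 720p24fps → 720p24fps, 1:1 results on this page. The function name is hypothetical, no real pricing is used, and the comparison ignores quality differences between encoders.

```python
def break_even_price_ratio(gpu_outputs: int, cpu_outputs: int) -> float:
    """Return how many times more the GPU instance may cost per hour
    before the CPU instance wins on outputs per dollar.

    Derived from: gpu_outputs / gpu_price > cpu_outputs / cpu_price
             <=>  gpu_price / cpu_price < gpu_outputs / cpu_outputs
    """
    if cpu_outputs <= 0:
        raise ValueError("cpu_outputs must be positive")
    return gpu_outputs / cpu_outputs


# 720p24fps -> 720p24fps, 1:1 variant, values copied from the tables.
G4DN_XLARGE_VULKAN = 15
C5_RESULTS = {
    "c5.2xlarge (ultrafast)": 1,
    "c5.4xlarge (ultrafast)": 3,
}

for name, cpu_outputs in C5_RESULTS.items():
    ratio = break_even_price_ratio(G4DN_XLARGE_VULKAN, cpu_outputs)
    print(f"g4dn.xlarge beats {name} while it costs less than {ratio:.1f}x as much")
```

Plugging in current on-demand prices for the two instance types then shows which side of the break-even point a given region falls on.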