One of the most time-consuming and frustrating tasks we encountered during our first 360 3D video productions was finding the optimal encoding settings for each of the currently available VR headsets. Each platform supports different resolutions, frame rates, codecs, and bitrates. This article explains the settings we started with and what we learned from analyzing some of the legends in the field (like Chris Milk and Felix & Paul), and then shares a simple yet powerful free tool we built to help you encode your VR video content with the best possible settings. Let’s go!
Our ignorant phase
We at Purple Pill VR film in stereoscopic 3D on all sides, which ultimately results in one video file containing 2 panoramas, one for your left eye and one for your right eye, stacked on top of each other. This file, a .mov or .avi, has an exotic 1:1 aspect ratio and a resolution of 4096×4096 pixels at 60 frames per second. Not a single VR headset is currently able to play this back smoothly, so we scaled the resolution down until we had smooth playback, which was around 2048×2048@60.
We knew that the playback resolution wasn’t great, and some viewers, even though they loved the VR experience, commented on this fact as well. We told them that our camera filmed in a much higher resolution, but that we’re only able to display 25% of the pixels that we shot due to hardware limitations of the display devices. We also told them how you lose a lot of resolution since you spread the pixels out over the surface of a sphere, and then on top of that look at it through two big lenses which magnify each pixel of your phone screen (in case of Gear VR).
All reasonably valid points, but still we felt we needed to do something about this. We started to scour the dark corners of the internet to find out who had the magic answer, but there were many conflicting opinions. Therefore we decided to analyze what some of the legendary VR video producers were doing to solve this problem…
Studying Chris Milk
One of the biggest VR filmmakers of this moment is Chris Milk from VRSE. We downloaded the VRSE app for both Cardboard and Gear VR for “study purposes”. When we opened their videos in Mediainfo, we noticed several things:
- h.264 Baseline profile, level 4.2
- Display resolution is 3840×2160@30 (20 – 30Mbps bitrate)
- Same resolution for Cardboard and Gear VR!
Chris Milk, like most of us lesser mortals, is using the h.264 codec in an MP4 file. However, the key is in the profile and level he uses. We encountered these settings often in our encoding tools, but usually didn’t pay a whole lot of attention to them. Now we know better.
The Baseline profile is primarily used for decoders with limited computing power, like mobile applications. Google also recommends the Baseline profile for h.264 video playback on Android. This makes it the go-to profile for mobile VR, like Google Cardboard and Gear VR.
The level describes the maximum resolution, frame rate, and bitrate a decoder has to handle. Each device supports videos up to a certain h.264 level. For example, iPhones only support videos up to level 3.1 (1920×1080@30 is the absolute max for iPhone 5, so pretty appalling for VR video playback), while most Android phones will play level 4.2 videos just fine.
The resolution of 3840×2160 is the official Ultra HD (UHD) standard for 4k videos, also referred to as 2160p. This resolution has a 16:9 aspect ratio, not the 1:1 aspect ratio you usually start with when shooting stereoscopic VR content. This means that the video resolution is compressed in the vertical direction. So in other words, these videos have more pixel density on the horizontal axis than on the vertical axis, and so everything in the scene looks short and fat. The following diagram might make things more clear:
The funny thing is that when you display this stretched/compressed video on the inside of a sphere for playback in a VR headset, everything is pushed back to the correct proportions again! And by using a resolution of 3840×2160 instead of 2048×2048 (what we initially used), Chris Milk has a whopping 98% more pixels to display (8,294,400 versus 4,194,304)! A massive improvement!
Amazingly, he even uses this UHD resolution for the VRSE Cardboard app, something which we thought would not be playable! However, my two-year-old Nexus 5 phone is more than able to smoothly play these 4k videos, resulting in an amazing experience, even when using a low-end Cardboard viewer. The downside of this resolution is that there is a limit of 30 frames per second. Higher frame rates simply won’t play on the current (mobile) VR devices at these resolutions.
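The pixel comparison is easy to check with a few lines of arithmetic. The sketch below is only a sanity check: the function name is ours, and the per-eye figure assumes the top-bottom stereo layout we described for our own files, where each eye gets half of the frame height.

```python
def pixels(width, height):
    """Total number of pixels in a single frame."""
    return width * height

source = pixels(4096, 4096)   # what our camera pipeline delivers
initial = pixels(2048, 2048)  # the first resolution that played smoothly for us
uhd = pixels(3840, 2160)      # the resolution found in the VRSE files

# Portion of the pixels we shot that 2048x2048 actually displays.
share_of_source = initial / source * 100

# Extra pixels that UHD offers compared to 2048x2048.
gain_over_initial = (uhd / initial - 1) * 100

# Assumption: top-bottom stereo, so each eye gets full width and half the height.
per_eye_uhd = (3840, 2160 // 2)

print(f"2048x2048 shows {share_of_source:.0f}% of the 4096x4096 source")
print(f"3840x2160 has {gain_over_initial:.1f}% more pixels than 2048x2048")
print(f"Per-eye panorama at UHD: {per_eye_uhd[0]}x{per_eye_uhd[1]}")
```

The gain works out to just under 98%, so UHD roughly doubles the number of pixels on screen.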
Funny sidenote
Chris Milk’s video The Displaced, a powerful story about refugees, is also used in the New York Times VR app
for Cardboard. But while VRSE plays this video at 3840×2160
(stereoscopic), the NYT VR app plays this same video at a mere 1920×1080
(monoscopic).
Felix & Paul & h.265
So, we just saw that using Chris Milk’s encoding settings can greatly increase your final video quality. Next to VRSE, the legendary Felix & Paul studio has been at the forefront of VR filmmaking for a long time, and a friend of ours from the Dutch production house Scopic mentioned that their Inside Impact video, featuring Bill Clinton, was one of the sharpest-looking VR videos he had seen so far. So what makes this video so much crisper than other VR content out there? Well, a number of ingenious things.
You may notice the rather strange 3840×1536 resolution of this video, which would imply a serious reduction in pixels compared to the 3840×2160 pixels of Chris Milk. However, the trick is revealed once we study a frame from the video:
In the screenshot you notice that you’re unable to see the bottom of Clinton’s desk. However, when viewing the video in the Felix & Paul Gear VR app, you can see the bottom of his desk. What kind of voodoo is this?! Well, Felix & Paul realized that only a limited number of pixels can be displayed as video on the current line of (mobile) VR devices, and they also realized that 95% of the action is happening at eye-height, not above or below you (this obviously varies with each piece of content). So their solution was brilliant: display only a certain band of pixels as video, and show the top and bottom parts as still images.
This masterful trick allowed them to effectively achieve 3840×3840 pixels in resolution, of which only 1536 pixels in the vertical dimension are displayed as video, while the rest of the pixels are still images. This reduced amount of video pixels also allowed them to push their production to 60 frames per second!
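To put numbers on the band trick, the sketch below compares the Felix & Paul video band with Chris Milk’s full UHD frame. It is plain arithmetic on the resolutions and frame rates quoted in this article; the function and variable names are ours, and pixels per second is only a rough stand-in for decoder load.

```python
def pixels_per_second(width, height, fps):
    """Raw pixel throughput the player has to decode and display."""
    return width * height * fps

band_height = 1536   # vertical pixels played as video
full_height = 3840   # effective height once still images fill top and bottom

# Share of the vertical field that is actual video.
video_share = band_height / full_height * 100

# Per-frame pixels of the band relative to a full UHD frame.
frame_ratio = (3840 * 1536) / (3840 * 2160) * 100

felix_paul = pixels_per_second(3840, 1536, 60)
chris_milk = pixels_per_second(3840, 2160, 30)

print(f"Video band covers {video_share:.0f}% of the effective height")
print(f"Each video frame has {frame_ratio:.0f}% of the pixels of a UHD frame")
print(f"Throughput vs UHD@30: {felix_paul / chris_milk:.2f}x")
```

Each frame is smaller than a UHD frame, but at 60 frames per second the total throughput still ends up higher than UHD at 30, so the band buys frame rate rather than an easier job for the decoder.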
While the resulting viewing experience is unparalleled, there are some serious limitations to this trick. First, it requires a crapton of extra work and technical expertise to create a custom video player app which plays hi-res 360 videos and switches between top/bottom images in sync. Also, if you film something from a drone for example, where the bottom of the video does move, using still images is not an option. Nevertheless, a very creative way to push this new medium to its limits! But Felix & Paul didn’t stop there.
While the video is 9.21 minutes long, has a super high resolution, and runs at 60 frames per second, the file size is “only” 825MB. Mediainfo teaches us that this is due to an, at first sight, rather low 12.3Mbps bitrate. Don’t let this bitrate fool you though, because instead of using the common h.264 codec, they encoded their files with the more powerful h.265/HEVC codec, which gives you roughly the same video quality at half the bitrate.
So their 12.3Mbps h.265 video is comparable in quality to a 24.6Mbps h.264 video, resulting in half the file size! An additional benefit of h.265 is that it supports higher resolutions, bitrates, and frames per second than h.264, making it an ideal candidate for future VR video players. The only major downside is that there are hardly any devices capable of playing back h.265 videos, but fortunately the Samsung Gear VR phones are (up to 3840×2160@30 for now)!
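The relation between bitrate and file size is simple enough to sketch. The helper below is ours, it assumes “9.21 minutes” means decimal minutes, and it treats the half-the-bitrate rule as a rule of thumb rather than a guarantee.

```python
def file_size_mb(bitrate_mbps, duration_s):
    """Approximate size in megabytes: megabits per second x seconds / 8 bits per byte."""
    return bitrate_mbps * duration_s / 8

duration_s = 9.21 * 60  # assumption: 9.21 decimal minutes

hevc_mb = file_size_mb(12.3, duration_s)
# Rule of thumb from the text: h.264 needs about double the bitrate for similar quality.
h264_equivalent_mb = file_size_mb(12.3 * 2, duration_s)

print(f"h.265 at 12.3Mbps: about {hevc_mb:.0f} MB")
print(f"h.264 at 24.6Mbps: about {h264_equivalent_mb:.0f} MB")
```

This lands at roughly 850 MB for h.265, in the same range as the 825MB we measured. The difference depends on how the duration is read and on how megabytes are counted.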
“Optimal” encoding settings
After studying how the big guys do it, we decided that the following encoding settings would be optimal for now:
Platform | Codec | Resolution | FPS | Avg. bitrate |
Gear VR | h.265 | 3840×2160 | 30 | 10 – 20Mbps |
Cardboard Android | h.264 (Baseline, level 4.2) | 3840×2160 | 30 | 20 – 30Mbps |
Cardboard iOS | h.264 (Baseline, level 3.1) | 1920×1080 | 30 | 10 – 14Mbps |
Oculus Rift | h.265 / h.264 | 4096×4096 | 60 | 40 – 60Mbps |
So, we use h.265 for Gear VR to get an optimal balance between video quality and file size. If you want to play your videos at 60fps, Samsung recommends 2048×2048.
For Cardboard Android we can use Chris Milk’s settings for the highest possible resolution. However, if you are concerned about file size, if you feel that the Cardboard experience is sub-optimal anyway, or if you want maximum compatibility with older smartphones, you can always reduce the resolution to 2k or even 1080p, like the New York Times does.
As mentioned before, iOS has put hardware restrictions on the maximum video size you can play on their devices, which is set to 1080p at 30fps for iPhone 5 and 1080p at 60fps for iPhone 6. Because of this terribly low resolution and because developing for iOS is a torturous process, we are not massive fans of VR for iPhone. However, many clients request it, so that’s why we included it in the list.
For Oculus it is hard to determine what settings to use, because everyone’s computer will be different. However, based on Oculus’ minimum system requirements, it should in theory be possible to play high res, high frame rate h.265 videos smoothly, provided that the correct codecs are installed on the system and a powerful 360 video player is used. For now though, we use h.264 and UHD resolution for maximum compatibility since even our Mac Pro has difficulty playing h.265 files.
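As an illustration of how the table can be turned into encoder settings, here is a minimal sketch that maps each row to an FFmpeg argument list. It is not our actual script. The preset names, the `Preset` class, and the single bitrate picked from each range are assumptions made for the example, and it assumes an FFmpeg build that includes the `libx264` and `libx265` encoders. The Oculus row uses h.264 here, following the compatibility note above. Encoders may warn when a frame size exceeds what the requested level officially allows.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass(frozen=True)
class Preset:
    encoder: str                   # FFmpeg encoder name
    width: int
    height: int
    fps: int
    bitrate_mbps: int              # one value picked from the range in the table
    profile: Optional[str] = None  # only set for the h.264 mobile presets
    level: Optional[str] = None

# Hypothetical preset names; values follow the table above.
PRESETS = {
    "gear_vr": Preset("libx265", 3840, 2160, 30, 15),
    "cardboard_android": Preset("libx264", 3840, 2160, 30, 25, "baseline", "4.2"),
    "cardboard_ios": Preset("libx264", 1920, 1080, 30, 12, "baseline", "3.1"),
    "oculus_rift": Preset("libx264", 4096, 4096, 60, 50),
}

def ffmpeg_args(src: str, dst: str, preset: Preset) -> List[str]:
    """Build the FFmpeg command line for one source file and one preset."""
    args = [
        "ffmpeg", "-y", "-i", src,
        "-c:v", preset.encoder,
        "-vf", f"scale={preset.width}:{preset.height}",
        "-r", str(preset.fps),
        "-b:v", f"{preset.bitrate_mbps}M",
        "-pix_fmt", "yuv420p",
    ]
    if preset.profile:
        args += ["-profile:v", preset.profile]
    if preset.level:
        args += ["-level", preset.level]
    args += ["-c:a", "aac", dst]
    return args

# Print the command instead of running it, so nothing is encoded by accident.
print(" ".join(ffmpeg_args("input/clip.mov",
                           "output/clip_cardboard_android.mp4",
                           PRESETS["cardboard_android"])))
```

Keeping the presets as data makes it easy to adjust a bitrate or add a platform without touching the command-building code.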
Monoscopic
For monoscopic videos (360 2D) the resolutions are slightly different. For example, the Gear VR can play 4096×2048@30 in h.264. Using Chris Milk’s settings, you should also be able to play this resolution for Cardboard on most newer Android phones. However, on Gear VR you could also opt for h.265, in which case the maximum resolution we managed to play smoothly was 3840×1920@30. This was on a Note 4 though. If you want 60fps, Samsung recommends 2880×1440. We did not test this resolution in h.265 yet.
UPDATE
It has been more than a year since we wrote this post, and while most of the info is still accurate, some things have changed. For example, we now prefer to use Google’s VP9 codec instead of h.265, since it offers nearly the same compression as h.265, but also runs smoothly on Windows, although the new 1070 and 1080 Nvidia cards are somewhat capable of playing back h.265 now. We also recommend UHD (3840×2160) for Desktop VR now, not 4096×4096. Another interesting development is that some of the latest high-end phones, like the Google Pixel XL, are actually capable of playing UHD h.265 and VP9 at 60fps! Unfortunately we will still have to keep using 30fps for a while to ensure the majority of phones are capable of playing back the video.
Encoding tools
These resolutions might not turn out to be optimal, but for us they work very well for now. But what tools do you need to encode your videos to these resolutions and codecs? We started out with MPEG Streamclip, then tried Compressor in combination with Final Cut Pro, and finally Handbrake. While they all sort of worked, none of them was able to do everything we needed and we kept having to re-render our files because of a missed setting. So we ended up at the most reliable, most powerful, yet most user-unfriendly encoding tool ever created: FFmpeg!
This open-source command line tool has full support for h.265 and is used to power most online video encoding services. It is great, except for the fact that learning the right commands and typing them over and over again is more frustrating than mowing the lawn with a nail clipper. That is why we decided to write a little Python script to make our life a bit easier.
We now simply copy our hi-res, uncompressed files to an input folder, run the program, select which VR platforms we would like to render (Gear VR, Oculus etc.), press “Start encoding” and then sit back to see our output folder fill up with perfectly rendered files ready for playback in the respective VR headsets!
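The workflow described above can be sketched roughly as follows. This is not the actual tool, only an outline of the idea: the folder names, platform names, and helper functions are hypothetical, and the command builder only picks an encoder, without the resolution and bitrate settings a real preset would carry. It defaults to a dry run that prints the FFmpeg commands instead of executing them.

```python
import subprocess
from pathlib import Path

SOURCE_EXTENSIONS = {".mov", ".avi"}

def find_sources(input_dir):
    """Return the source videos in the input folder, or an empty list if it is missing."""
    folder = Path(input_dir)
    if not folder.is_dir():
        return []
    return sorted(p for p in folder.iterdir() if p.suffix.lower() in SOURCE_EXTENSIONS)

def plan_jobs(sources, platforms, output_dir):
    """Pair every source file with every selected platform and an output path."""
    out = Path(output_dir)
    return [
        (Path(src), platform, out / f"{Path(src).stem}_{platform}.mp4")
        for src in sources
        for platform in platforms
    ]

def basic_command(src, dst, platform):
    """Hypothetical, stripped-down command builder: only selects the encoder."""
    encoder = "libx265" if platform == "gear_vr" else "libx264"
    return ["ffmpeg", "-y", "-i", str(src), "-c:v", encoder, str(dst)]

def encode_all(jobs, build_command, dry_run=True):
    """Run (or just print) one FFmpeg command per planned job."""
    for src, platform, dst in jobs:
        cmd = build_command(src, dst, platform)
        if dry_run:
            print(" ".join(cmd))
        else:
            dst.parent.mkdir(parents=True, exist_ok=True)
            subprocess.run(cmd, check=True)

jobs = plan_jobs(find_sources("input"), ["gear_vr", "cardboard_android"], "output")
encode_all(jobs, basic_command, dry_run=True)
```

Separating the planning step from the encoding step means you can review the full list of output files before committing to hours of rendering.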