Subreddit for AV1 video codec
AOMedia, all things related to the Alliance for Open Media, the group responsible for creating the AV1 codec.
WebM, for posting your encoded content in the WebM container.
VP9, subreddit for all things related to the VP9 encoder.
HEVC, subreddit for all things related to the H.265 encoder.
Opus, most efficient audio codec currently available.
Problem first: I have about 5000 hours of video that needs to be transcoded. The video is mostly 1080p or 720p, with some very minor parts being 480p. The main goal of the operation is to reduce storage size, and HEVC is not really an option because the video needs to be streamable in a browser without transcoding. Currently, the 1080p content, for example, is encoded with H.264 at about 10-15 Mbps. I found that the quality is reasonably good with a 2-pass encode using ffmpeg and the libaom AV1 codec at around 1000-1500 kbps, so I am planning to give it a go at about 2000 kbps. That should reasonably reduce the storage size needed while keeping quality at the required level. Initially I looked at cloud transcoding services, but I dismissed them very quickly after seeing that they'd charge me upwards of 50 grand to have all that video transcoded, even leaving out the bandwidth requirement for the task at hand.
The alternative I am planning to go with is to transcode the files in-house. However, my current desktop (i5-11600K) does a nice 40 fps on the first pass but a measly 1.6 fps at cpu-used 4 on the second, actual encode pass. I note that it doesn't seem to fully utilize the CPU, only about half of what's there, so if I do 2 files in parallel it should probably scale up to about 3 fps. Shelling out some money isn't really a problem, so I was considering building a dedicated encoding station. Is there any reliable benchmark advising on which CPU to buy? I was going to just get a 13th gen i7 + mainboard + RAM; PSUs I have lying around, I don't really need a case, and storage can be provided via network. How much performance can I expect? (PS: This will be done at my business site, where power is "free" in the sense that I have a constant usage budget included in the rental contract of the premises, and there is unused capacity that would allow for about 500-1000 W permanently.)
Are there any other options at hand? Any other configurations that would be achievable? Any option to make libaom use the full CPU? I generally don't mind it taking a year or two to get everything fully transcoded, even when factoring in that, due to improvements in the implementation, I could expect about a 20% performance gain over that timeframe in the codec alone.
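The figures in this post can be turned into a rough back-of-the-envelope estimate. The sketch below is only illustrative: the function names are made up, the 12,500 kbps "before" bitrate is simply the midpoint of the quoted 10-15 Mbps, and the 25 fps source frame rate is an assumption because the post does not state one.

```python
SECONDS_PER_HOUR = 3600
SECONDS_PER_DAY = 86400


def storage_tb(hours: float, bitrate_kbps: float) -> float:
    """Storage in terabytes (10**12 bytes) for `hours` of video at an average bitrate."""
    total_bits = hours * SECONDS_PER_HOUR * bitrate_kbps * 1000
    return total_bits / 8 / 1e12


def encode_days(hours: float, source_fps: float, encode_fps: float) -> float:
    """Wall-clock days needed to encode `hours` of footage at `encode_fps`."""
    total_frames = hours * SECONDS_PER_HOUR * source_fps
    return total_frames / encode_fps / SECONDS_PER_DAY


HOURS = 5000
ASSUMED_SOURCE_FPS = 25  # assumption: not stated in the post

print(f"H.264 at ~12.5 Mbps: {storage_tb(HOURS, 12_500):.1f} TB")
print(f"AV1 at 2000 kbps:    {storage_tb(HOURS, 2_000):.1f} TB")
for speed in (1.6, 3.0):  # second-pass speeds mentioned in the post
    days = encode_days(HOURS, ASSUMED_SOURCE_FPS, speed)
    print(f"{speed} fps encode speed: {days:.0f} days")
```

Changing `ASSUMED_SOURCE_FPS` or the encode speed shows how strongly the total time depends on the second-pass frame rate.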
Another thing that I was curious about: right now it seems there are only 2 options when it comes to video encoding: (a) use a GPU encoder to get a quick and dirty job, good for live streaming but bad for reducing file size, and (b) use a CPU encoder to get a high-quality, low-bandwidth output at the expense of vastly more time.
However, given that video encoding should be mostly matrix transforms on the codec side, is there a "GPU-accelerated" option that uses the CPU encoder but ships off some of those matrix calculations to the GPGPU part of the GPU (as in CUDA/OpenCL etc., not as in the hardware encoding unit)? As in a hybrid form of encoding?
I am trying to optimize the size of my photo archive and noticed one thing:
I have an image (link) (3.37 MB) in .jpg format. I am trying to do lossless JPEG recompression to .avif (so the same quality but a better codec). I tried ImageMagick on Windows 10 (with the command: magick mogrify -quality 100 -auto-orient -format avif *.jpg) and GIMP with "lossless" export, but in both cases the size of the resulting image was 6.5 MB or more. (And the same thing happens with .jxl.)
I can't understand how this works and why the target .avif file is bigger than the .jpg source. Could someone please tell me what articles I should read to understand this process? Many thanks!
Thank you all who responded!
Cross-posting from r/ffmpeg in the hopes of getting some kind of answer:
I upgraded ffmpeg from the latest stable build (5.1.2) to one of the more recent git builds (2023-02-02-git-7d49fef8b4) by downloading the latest pre-compiled binary for Windows from gyan.dev. However, after doing so, it seems that two-pass encoding (I have tested with both libvpx-vp9 and libaom-av1) is no longer providing status updates for the first pass. My machine works as if it is performing its first pass, but it doesn't show any frame progress: fps remains at 0.0, the time shows as "577014:32:33.77", and the speed as N/A. Once it "finishes", it proceeds to the second pass, which provides status updates just fine. I downloaded the pre-compiled binary from BtbN and it does the same thing. I have tried a variety of command line configurations for both libvpx-vp9 and libaom-av1 and get the same result.
I have gone back to the latest stable version of ffmpeg for now, but is this the way that ffmpeg's status updates for first pass encoding are supposed to work going forward, or is this a bug? If this is in fact a bug, can anyone else reproduce it?
First of all, I have to say that English is not my native language, so you are going to see a lot of mistakes. I'm sorry.
If you are reading this and have no idea about VP9 or AV1 encoding, please read Google's VP9 encoding guide and BlueSwordM's encoder tuning guides.
All the values used here are for encoding a file with the intention of archiving it for my own use, so the recommended values are the ones corresponding to VOD encoding using constrained quality. The bit rate values discussed are the mean, or target, bit rate values. For constrained quality encoding, according to Google's guide, once the mean bit rate is selected, the minimum and maximum bit rates are calculated by multiplying it by factors of 0.5 and 1.45 respectively.
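The min/max rule described above can be written down as a tiny helper. This is only a sketch: the function name is invented for illustration, and the 0.5 and 1.45 factors are taken from this post's reading of Google's guide rather than checked against it.

```python
MIN_FACTOR = 0.5   # minimum bit rate = 0.5 x target, as described in the post
MAX_FACTOR = 1.45  # maximum bit rate = 1.45 x target, as described in the post


def constrained_quality_window(target_kbps: float) -> tuple[int, int]:
    """Return (minrate, maxrate) in Kb/s for a given target (mean) bit rate."""
    return round(target_kbps * MIN_FACTOR), round(target_kbps * MAX_FACTOR)


minrate, maxrate = constrained_quality_window(1800)
print(f"target 1800 Kb/s -> minrate {minrate} Kb/s, maxrate {maxrate} Kb/s")
```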
The first question is about AV1. I have not found information about recommended CRF and bit rate values for AV1, so I'm assuming that, as AV1 is the successor of VP9, it inherits the CRF values (as H.265 inherits the H.264 ones). For the bit rate, I have read articles that say AV1 uses up to 30% lower bit rate, so I'm applying a 0.75 factor (25% lower bit rate). If anyone knows the recommended values or uses another (justified) factor, the information would be welcome. Are those assumptions correct, or at least not too wrong?
The second question comes from the bit rate. In Google's VP9 guide, a bit rate of 1,800 Kb/s is recommended for 1920x1080 resolution. But they recommend the same bit rate for the whole range between 24 and 30 fps (there are other recommendations for other resolutions, but let's start from this point). I'm assuming that, if the rest of the parameters that control quality stay the same, a good reference for describing quality is the number of bits per pixel (bpp from now on). By that measure Google's recommendation is not very precise: it gives a higher bpp to lower frame rates than to higher ones.
So here comes the idea that sets off all the mess I have in my brain. Let's start with some maths:
Given a W x H resolution, a bit rate of BR Kb/s, and a frame rate of F frames per second, we have:
bpp = (BR Kb/s) * (1/F s/frame) * (1/(W*H) frame/pixel) * 1000 b/Kb = ((BR * 1000) / (F * W * H)) b/pixel
If we have a second video with resolution w x h and frame rate f, and we want it to have the same quality as the first one, we should give it the same bpp as the first one. That gives an equation with one unknown, br, the bit rate of the second video:
((br * 1000) / (f * w * h)) = ((BR * 1000) / (F * W * H)) ---> br = BR * (f/F) * (w/W) * (h/H)
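The two formulas above translate directly into code. The following is a minimal sketch with made-up function names; it assumes bit rates in Kb/s with 1 Kb = 1000 bits, as in the formulas.

```python
def bits_per_pixel(bitrate_kbps: float, width: int, height: int, fps: float) -> float:
    """bpp = (BR * 1000) / (F * W * H)."""
    return bitrate_kbps * 1000 / (fps * width * height)


def equivalent_bitrate(ref_kbps: float, ref: tuple, new: tuple) -> float:
    """Bit rate giving `new` the same bpp as `ref`.

    `ref` and `new` are (width, height, fps) tuples; this is
    br = BR * (f/F) * (w/W) * (h/H) rearranged into a single division.
    """
    W, H, F = ref
    w, h, f = new
    return ref_kbps * f * w * h / (F * W * H)


ref = (1920, 1080, 24)
new = (1920, 1080, 30)
br = equivalent_bitrate(1800, ref, new)
print(f"{br:.0f} Kb/s at 30 fps matches 1800 Kb/s at 24 fps")
print(f"bpp: {bits_per_pixel(1800, *ref):.4f} vs {bits_per_pixel(br, *new):.4f}")
```

Both bpp values printed at the end should agree, which is a quick sanity check that the rearranged equation holds.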
Going back to Google's recommendation, I assume that if 1,800 Kb/s is a good bit rate for both 24 and 30 fps, there must be an acceptable range of bit rates: 1,800 Kb/s must be the highest acceptable value for 24 fps and also the lowest acceptable value for 30 fps. That way the recommendation of 1,800 Kb/s is correct for both frame rates.
If we take 1920x1080 at 24 fps with 1,800 Kb/s as the reference, where 1,800 Kb/s is the highest acceptable bit rate, the equation above gives the highest acceptable bit rate for 30 fps:
br = BR * (f/F) * (w/W) * (h/H) = 1800 * (30/24) * (1920/1920) * (1080/1080) = 1800 * (5/4) = 2250 Kb/s
And using 1920x1080 at 30 fps as the reference, with 1,800 Kb/s as the lowest acceptable bit rate, the same equation gives the lowest acceptable bit rate for 1920x1080 at 24 fps:
br = BR * (f/F) * (w/W) * (h/H) = 1800 * (24/30) = 1800 * (4/5) = 1440 Kb/s
So far I have established that, for constrained quality encoding with VP9, the recommended bit rate for a 1920x1080 video at 30 fps ranges from a minimum of 1,800 Kb/s to a maximum of 2,250 Kb/s. Those will be the reference values for VP9, and the bit rates for AV1 will be those values multiplied by 0.75.
For any other W x H resolution with a frame rate of F fps, the lower recommended bit rate, LBR, and the greater recommended bit rate, GBR, will be (for VP9):
LBR = 1800 * (F/30) * (W/1920) * (H/1080)
GBR = 2250 * (F/30) * (W/1920) * (H/1080)
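As a sketch of how the LBR/GBR formulas could be applied, the helper below scales the 1920x1080 at 30 fps reference range to another resolution and frame rate. The function and constant names are invented for illustration, and the 0.75 AV1 factor is the post's own working assumption, not an established figure.

```python
REF_W, REF_H, REF_FPS = 1920, 1080, 30     # reference video from the post
REF_LBR_KBPS, REF_GBR_KBPS = 1800, 2250    # VP9 reference range from the post
AV1_FACTOR = 0.75                          # the post's assumed AV1 saving


def recommended_range(width: int, height: int, fps: float, codec: str = "vp9"):
    """Return (LBR, GBR) in Kb/s by keeping bpp equal to the reference video."""
    scale = (fps * width * height) / (REF_FPS * REF_W * REF_H)
    factor = AV1_FACTOR if codec == "av1" else 1.0
    return REF_LBR_KBPS * scale * factor, REF_GBR_KBPS * scale * factor


for codec in ("vp9", "av1"):
    lbr, gbr = recommended_range(1280, 720, 30, codec)
    print(f"{codec} 1280x720 at 30 fps: {lbr:.0f} to {gbr:.0f} Kb/s")
```

Because the scaling is purely proportional to pixels per second, it carries the same limitation the post goes on to describe: lower resolutions appear to need a higher bpp than this gives them.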
I found that the upper bound is correct for almost any Full HD widescreen 1920xH video (H in the 800-1080 range) and for the Full HD 4:3 resolution, 1440x1080. There are movies that have problems with some frames in the middle of high-motion scenes, but those are only a very small number of consecutive frames. In "Jurassic World Dominion", when the guy falls into the frozen lake after the ice breaks, all the bubbles in the water become blocks with wave patterns (AV1). In "The Wolf of Wall Street", the scene in which the strippers run into the office, with the lights flashing and the confetti, is a blocky mess. I don't remember the name of the guy (I think he is British), but if you search for "confetti ruins video" on YouTube, there is a video that tries to explain inter and intra compression. I guess that more bit rate is needed for encoding those scenes, but I'm not going to sacrifice disk space to improve the quality of a few milliseconds of video. The rest of the film has very good quality.
The lower bound causes no problems when used for encoding animated films.
All the above ideas work well for Full HD content, and even for HD content. But looking at Google's recommended values (and experimenting, if you don't believe what they say), a little nightmare begins. Lower resolutions need higher bpp values, but higher resolutions seem to follow the rule for the bit rate (the recommended values for both the 24-30 and the 50-60 fps ranges fall within the range that the above equations give for 50-60 fps).
What happens with lower resolutions? Why do they need higher bpp? Do greater resolutions also need higher bpp, but "just work well enough" with the recommended values? Are those values experimental, and will they become more accurate with time?
Sorry, I forgot to mention the encoding strings that I use for 1080p content:
ffmpeg -nostdin -i ../video_1080p_ChunkIndex.mkv -map 0:0 -pix_fmt yuv420p10le -colorspace bt709 -color_primaries bt709 -color_trc bt709 -b:v (0.75*GBR) -minrate 0.5 * (0.75*GBR) -maxrate 1.45 * (0.75*GBR) -tile-columns 1 -tile-rows 1 -row-mt 1 -threads 8 -frame-parallel 1 -g 10 * fps -auto-alt-ref 1 -lag-in-frames 48 -cpu-used 3 -aq-mode 1 -crf 31 -passlogfile AV1_CQ31_(0.75*GBR)_C3_ChunkIndex -pass 1 -c:v libaom-av1 -y video-AV1_CQ31_(0.75*GBR)_C3_ChunkIndex.mkv && \
ffmpeg -nostdin -i ../video_1080p_ChunkIndex.mkv -map 0:0 -pix_fmt yuv420p10le -colorspace bt709 -color_primaries bt709 -color_trc bt709 -b:v (0.75*GBR) -minrate 0.5 * (0.75*GBR) -maxrate 1.45 * (0.75*GBR) -tile-columns 1 -tile-rows 1 -row-mt 1 -threads 8 -frame-parallel 1 -g 10 * fps -auto-alt-ref 1 -lag-in-frames 48 -cpu-used 3 -aq-mode 1 -crf 31 -passlogfile AV1_CQ31_(0.75*GBR)_C3_ChunkIndex -pass 2 -c:v libaom-av1 -y video-AV1_CQ31_(0.75*GBR)_C3_ChunkIndex.mkv
ffmpeg -nostdin -i ../video_1080p_ChunkIndex.mkv -map 0:0 -pix_fmt yuv420p10le -colorspace bt709 -color_primaries bt709 -color_trc bt709 -b:v GBR -minrate 0.5 * GBR -maxrate 1.45 * GBR -tile-columns 1 -tile-rows 1 -row-mt 1 -threads 8 -frame-parallel 1 -g 10 * fps -enable-tpl 1 -auto-alt-ref 6 -lag-in-frames 25 -quality good -speed 0 -aq-mode 1 -crf 31 -passlogfile VP9_CQ31_GBR_V0_ChunkIndex -pass 1 -c:v libvpx-vp9 -y video-VP9_CQ31_GBR_V0_ChunkIndex.mkv && \
ffmpeg -nostdin -i ../video_1080p_ChunkIndex.mkv -map 0:0 -pix_fmt yuv420p10le -colorspace bt709 -color_primaries bt709 -color_trc bt709 -b:v GBR -minrate 0.5 * GBR -maxrate 1.45 * GBR -tile-columns 1 -tile-rows 1 -row-mt 1 -threads 8 -frame-parallel 1 -g 10 * fps -enable-tpl 1 -auto-alt-ref 6 -lag-in-frames 25 -quality good -speed 0 -aq-mode 1 -crf 31 -passlogfile VP9_CQ31_GBR_V0_ChunkIndex -pass 2 -c:v libvpx-vp9 -y video-VP9_CQ31_GBR_V0_ChunkIndex.mkv
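The placeholders in the libaom-av1 command line (target bit rate, min/max rates, GOP length) could be filled in by a small script. The sketch below only builds the two argument lists and does not run ffmpeg. All flags are copied from the command above; the function name, its parameters, the example file names, and the choice to pass bit rates with a "k" suffix are assumptions made for illustration.

```python
import shlex


def two_pass_commands(src: str, out: str, target_kbps: float, fps: float, passlog: str):
    """Build [pass 1, pass 2] ffmpeg argument lists for libaom-av1."""
    common = [
        "ffmpeg", "-nostdin", "-i", src, "-map", "0:0",
        "-pix_fmt", "yuv420p10le",
        "-colorspace", "bt709", "-color_primaries", "bt709", "-color_trc", "bt709",
        "-b:v", f"{round(target_kbps)}k",             # assumed unit: Kb/s
        "-minrate", f"{round(0.5 * target_kbps)}k",   # 0.5 x target
        "-maxrate", f"{round(1.45 * target_kbps)}k",  # 1.45 x target
        "-tile-columns", "1", "-tile-rows", "1", "-row-mt", "1",
        "-threads", "8", "-frame-parallel", "1",
        "-g", str(round(10 * fps)),                   # one keyframe per 10 seconds
        "-auto-alt-ref", "1", "-lag-in-frames", "48",
        "-cpu-used", "3", "-aq-mode", "1", "-crf", "31",
        "-passlogfile", passlog,
    ]
    return [common + ["-pass", str(n), "-c:v", "libaom-av1", "-y", out] for n in (1, 2)]


# Example: 1080p at 30 fps, AV1 target = 0.75 * GBR = 0.75 * 2250 Kb/s.
for argv in two_pass_commands("in.mkv", "out-av1.mkv", 0.75 * 2250, 30, "av1_pass"):
    print(shlex.join(argv))
```

Building argument lists instead of one long string avoids shell quoting problems if the lists are later handed to something like `subprocess.run`.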
Hi, I'm having a problem with SVT-AV1 1.4.1. I'm doing some experiments on a YUV420P8 source and trying to convert it to 10 bits. I tried --input-depth 10, but even with this parameter, the output remains YUV420P8. Am I missing something obvious? I had no problems converting 8-bit sources to 10-bit ones with AOMEnc, so it's probably something simple.
Another newb SVT question is the --color-range param, which switches between Studio and Full. I'm unfamiliar with the "Studio" color range. Is it more or less limited compared to Full?
EDIT: appreciate the answers, thanks.
I understand that you can tune more settings and that it should allow for better quality than default aom, but a lot of the improvements mostly go over my head. So how does it improve upon AV1 encoding, and by how much? Thanks.
Is it normal that my Ryzen 7 1700X is approximately 2.5x slower than my Intel i7-4790K when encoding with SVT-AV1 (preset 2)? Both machines are running up-to-date Arch Linux and encode the same source file with the same arguments. Both use AVX2. The Ryzen machine has an ordinary HDD, while the i7 runs from an M.2 SSD. Both run with the CPU set to performance, and neither is thermal throttling.
Surely this can't be right.
I want to use libaom-av1 to create some "virtually lossless" video encodes with ffmpeg. I opted for a CRF 23 encode with a 10-bit color profile, a GOP of 60, and a keyframe placed every 60 frames to help improve seeking in the video. Based on the ffmpeg documentation and input from people in a previous thread I made about SVT-AV1 here, this is what I used for my test encode:
ffmpeg -i test.mkv -c:v libaom-av1 -crf 23 -pix_fmt yuv420p10le -g 60 -keyint_min 60 -cpu-used 0 -pass 1 -an -f webm NUL && ^
ffmpeg -i test.mkv -c:v libaom-av1 -crf 23 -pix_fmt yuv420p10le -g 60 -keyint_min 60 -cpu-used 3 -pass 2 -c:a libopus -b:a 192k test.webm
The test encode seems to have turned out pretty well, but is there anything I can add to these commands to further improve the quality of my encoding? Would changing the Tune to SSIM help? Manually setting Tiles for the video? Turning on "row-mt"? Should I be setting "denoise-noise-level" by default as if I was using SVT-AV1? I have quite a few family videos that I need to transfer from DVD to AV1, and I want to make sure that I encode them as best as possible the first time - I don't want to have to come back and re-encode them because of an oversight or an error on my part.
I'd like to superscale my old videos, some of which are in 360p or 480p, to 1080p or even 2160p using DaVinci Resolve. For this, I need to buy a new computer whose CPU supports AV1 encoding.
May I ask which CPU (and also GPU) I should buy for this purpose? (Or at least a keyword that I can google further.)
I have read the ffmpeg documentation page for AV1 and the page "Using SVT-AV1 Within FFmpeg" and I have done some test encodes. However, I have some questions that appear to be beyond the scope of this documentation, and I was hoping the experts here on Reddit might be able to clarify some things for me.
Many thanks in advance for your assistance in answering these questions!
I'm encoding an episode of around 43 minutes using both av1an and Shutter Encoder. In Shutter Encoder I get around 3-4 fps when encoding, while in av1an I only get 1-1.5 fps. Why is Shutter Encoder a little faster? I'm not using film grain, as Shutter Encoder does not support it yet.
Can anyone tell me what command line parameters I should use when encoding with av1an for the best CPU utilization? (I know my CPU is very old, but you have to live with what you have.) Any help will be much appreciated.
For av1an setup I've used this https://www.reddit.com/r/AV1/comments/10l93ir/trying_to_use_av1an_without_success/.
My pc specs -:
av1an command -:
av1an -i input -e svt-av1 -v "--crf 30 --preset 6 --keyint 240" --pix-format yuv420p10le -a "-c:a copy" -o output
Shutter Encoder (SVT-AV1)-:
CQ 30, copy audio, 10Bit, preset 6 , GOP 240, preserve subtitles. It is a GUI and uses ffmpeg at backend.
After experiencing a few cases of unexpectedly low quality in my encodes, I decided to investigate and see if I could address them. I figured that the keyframe interval had something to do with it, so I decided to use a long generic video (Big Buck Bunny) as a control for various settings.
It seems like the quality suffers until the duration of --keyint has passed, especially when scene changes occur. If I compare encodes of various keyframe intervals, the videos start out looking the exact same. Whenever the first keyframe interval has been reached, the videos start to diverge, with bad scene transitions being the most obvious sign of differentiation. If the streams later place a keyframe at the exact same interval, the videos converge and look exactly alike until the point of yet another keyframe insertion, but the parity remains relatively high.
This to me suggests that there is something wrong with the rate control during the first segment of a video. This can be very problematic if scene transitions occur early, which is the case in this specific example. Why could this possibly be so?
Snippet of SVT-AV1 1.4.1 command used for replicability:
SvtAv1EncApp --crf 63 --keyint 720 --passes 2 --preset 8 -i big_buck_bunny_360p24.y4m
Are there any good sample video files that are good for comparing 10-bit and 12-bit encodes, perhaps even a more complete set of comparison videos for different profiles and levels?
I don't have high-quality source video on hand but if I track some down I could try some comparisons myself. I'm hoping someone has already done this!
I've been unable to get VMAF working in av1an (although av1an itself works). If I run a command similar to this, I get an error:
av1an -i "C:\Users\Ch0nG\Videos\s.mkv" --encoder rav1e --vmaf-path="vmaf.json" --target-quality 90 --probes 10 -c mkvmerge -o "C:\Users\Ch0nG\Videos\s-av1an-rav1e-tq90-vmaf.mkv"
The vmaf json is in the same directory as av1an.
The error is:
libvmaf ERROR could not read model from path: "//?/C:/Users/Ch0nG/Downloads/av1an/vmaf.json"
[Parsed_libvmaf_4 @ 000001b1e196a080] problem loading model file: //?/C:/Users/Ch0nG/Downloads/av1an/vmaf.json
[AVFilterGraph @ 000001b1e13ab680] Error initializing filter 'libvmaf' with args 'log_fmt=json:eof_action=endall:log_path=//?/C\:/Users/Ch0nG/Downloads/av1an/.4303600/split/0.json:model_path=//?/C\:/Users/Ch0nG/Downloads/av1an/vmaf.json:n_threads=0'
Error initializing complex filters.
Invalid argument
Based on some searching, I thought that maybe vmaf wasn't compiled into ffmpeg but that's not the case as the flag shows when running the ffmpeg command:
ffmpeg version 2023-01-25-git-2c3107c3e9-essentials_build-www.gyan.dev Copyright (c) 2000-2023 the FFmpeg developers
built with gcc 12.2.0 (Rev10, Built by MSYS2 project)
configuration: --enable-gpl --enable-version3 --enable-static --disable-w32threads --disable-autodetect --enable-fontconfig --enable-iconv --enable-gnutls --enable-libxml2 --enable-gmp --enable-bzlib --enable-lzma --enable-zlib --enable-libsrt --enable-libssh --enable-libzmq --enable-avisynth --enable-sdl2 --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxvid --enable-libaom --enable-libopenjpeg --enable-libvpx --enable-mediafoundation --enable-libass --enable-libfreetype --enable-libfribidi --enable-libvidstab --enable-libvmaf --enable-libzimg --enable-amf --enable-cuda-llvm --enable-cuvid --enable-ffnvcodec --enable-nvdec --enable-nvenc --enable-d3d11va --enable-dxva2 --enable-libvpl --enable-libgme --enable-libopenmpt --enable-libopencore-amrwb --enable-libmp3lame --enable-libtheora --enable-libvo-amrwbenc --enable-libgsm --enable-libopencore-amrnb --enable-libopus --enable-libspeex --enable-libvorbis --enable-librubberband
libavutil 57. 44.100 / 57. 44.100
libavcodec 59. 57.100 / 59. 57.100
libavformat 59. 36.100 / 59. 36.100
libavdevice 59. 8.101 / 59. 8.101
libavfilter 8. 54.100 / 8. 54.100
libswscale 6. 8.112 / 6. 8.112
libswresample 4. 9.100 / 4. 9.100
libpostproc 56. 7.100 / 56. 7.100
I've been trying to encode the Blade Runner (1982) Final Cut remux to AV1 with Handbrake in the native resolution, and when I try to use the film-grain=x parameter, the results are much larger than I expect. The denoising that should be happening looks smudgy and ineffective, and increasing the number doesn't help (I've gone as high as 16). Yet when I adjust down to 1920x800 with the same settings (and I assume, scaled down/smaller grain), the size is much smaller, and the results look better. So how can I adjust the denoising and possibly grain synth that's happening to allow for good results in the native resolution? Is this possible using Handbrake?
I'm using framerate same as source, preset 3, no tune, auto profile, auto encoder level, RF 20 to 26, with these advanced options:
Hi, wondering if anyone has data on Nvidia vs AMD for hardware-accelerated encoding in AV1. I can't find a useful comparison, and this is a factor that will influence my GPU choice, so it's important to me. I understand that both the 40 series and the 7000 series are capable of AV1 encoding (and decoding), but if there's a noticeable quality/time difference, I'd like to know. I plan to record 1440p gameplay at 120fps, edit in Davinci Resolve, and upload to YouTube. Thank you for any help.
Edit 3: Got it working thanks to some help via a Github issue. To get it working, I downloaded and placed the following in the same folder:
Python 3.10.9 embedded (latest Python 3.10 as of 1/27/23)
VapourSynth64-portable-R61 (latest as of 1/27/23)
Encoder of your choice
Av1an should now work correctly.
Edit 2: The command below did not work.
EDIT: Did some additional reading/searching. The encoder has to be specified as part of the command. I'll try that a bit later and report my findings.
As an example, the command to run rav1e is something like this:
av1an.exe -i "foo.mp4" -enc rav1e -o "fooav1.mp4"
I'm trying to use av1an on Windows 11. After adding files from VapourSynth and FFmpeg to av1an's directory, the program doesn't do anything. It doesn't show an error.
I tried to run "av1an.exe -i s.mkv" and there is no output. It just goes back to a prompt.
Thanks for any help you are willing to provide.
I tried some suggestions from different forums but most of them seem to be outdated or didn't work for me. I hope I can get some help here :)