Yes, the cache does let you skip the decoding and preprocessing step, but the time for that step is linear, i.e. it is spent only once per file.
Here is an example. Let's pretend we have a fairly realistic, very fast computer that can prepare 10 caches per second and perform 100K fingerprint comparisons per second.
Files (N) | Comparison time [(N+1)*N/2 / 100000 / 3600] | Cache preparation time [N / 10 / 3600] | Preparation share of total time |
10000 files | 0.14 hours | 0.28 hours | 66.66 % |
100000 files | 13.89 hours | 2.78 hours | 16.67 % |
300000 files | 125.00 hours | 8.33 hours | 6.25 % |
1000000 files | 1388.89 hours | 27.78 hours | 1.96 % |
As you can see, the more files there are, the less caching matters: preparation time grows linearly with N, while comparison time grows quadratically.
This calculation is idealized: it assumes the duration skip mechanism is disabled.
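For anyone who wants to try other numbers, here is a rough Python sketch of the same estimate. The two rates are just the pretend figures from this post (10 caches per second, 100K comparisons per second), and the function name `estimate_hours` is made up for illustration. It is not taken from any real tool.

```python
# Rough estimate only; both rates are the assumed figures from the post.
PREP_RATE = 10          # caches prepared per second (assumption)
COMPARE_RATE = 100_000  # fingerprint comparisons per second (assumption)


def estimate_hours(n_files):
    """Return (comparison_hours, preparation_hours, preparation_share_percent)."""
    # Number of comparisons, using the same (N+1)*N/2 formula as the table.
    comparisons = (n_files + 1) * n_files / 2
    compare_h = comparisons / COMPARE_RATE / 3600
    # Preparation is done once per file, so it is linear in N.
    prep_h = n_files / PREP_RATE / 3600
    share = 100 * prep_h / (prep_h + compare_h)
    return compare_h, prep_h, share


for n in (10_000, 100_000, 300_000, 1_000_000):
    c, p, s = estimate_hours(n)
    print(f"{n:>8} files | {c:8.2f} hours | {p:6.2f} hours | {s:5.2f} % |")
```

Changing `PREP_RATE` or `COMPARE_RATE` shows how quickly the preparation share drops on your own hardware as N grows.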