

Messages - TBacker

1
Bugs / Re: false 100% similarity
« on: October 31, 2010, 19:56:20 »
c) Normalizing to the highest peak of a sample/excerpt would not be a good idea. The standard approach for comparison in the frequency domain is to normalize to the overall average (or in other words: the component at frequency bin 0).

I guess my point wasn't clear in this respect. I basically meant that the amplitudes of the two samples should be made to match (in the compare procedure) before frequency analysis to ensure the best accuracy.
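
To make the contrast concrete, here is a minimal sketch (Python/NumPy, assumed purely for illustration, not Similarity's actual code) of the two normalization strategies: scaling each excerpt to its highest peak before the FFT, versus dividing each magnitude spectrum by its own overall average as described in the quoted reply. `a` and `b` are hypothetical equal-length mono excerpts at the same sample rate; a smaller return value means a closer spectral match.

```python
import numpy as np

def spectrum(x):
    # magnitude spectrum of one excerpt
    return np.abs(np.fft.rfft(x))

def compare_peak_normalized(a, b):
    # scale each excerpt so its highest peak is 1.0 (the suggestion above),
    # then measure the mean absolute difference between the two spectra
    a = a / np.max(np.abs(a))
    b = b / np.max(np.abs(b))
    return np.mean(np.abs(spectrum(a) - spectrum(b)))

def compare_average_normalized(a, b):
    # divide each magnitude spectrum by its own overall average
    # (the approach described in the quoted reply), then compare
    sa, sb = spectrum(a), spectrum(b)
    return np.mean(np.abs(sa / np.mean(sa) - sb / np.mean(sb)))
```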

2
Bugs / Re: false 100% similarity
« on: October 30, 2010, 05:09:44 »
Similarity is designed for scanning music compositions, and yes, it scans only 1 minute of a song. We are thinking about how to solve the problem with long durations.

I'm a new user, but I am a radio broadcast engineer with a bit of experience writing some audio apps for my job and personal use (VB6/VB.Net).

How about taking three or four short (30-second) samples across the length of a file? Say, a 30-second sample at 0%, 25%, 50%, and 75% of the length of the valid audio data (ignoring the metadata headers/tails!). You would have to seek past any silence at the head for the first sample, since the silence can vary even when the cut is the same.

In theory this would produce a "fingerprint" representative of most of the audio without having to scan the whole thing, and it would be more accurate than judging the whole file by one sample.

If this data is compared to a duplicate, and the duplicate is the same audio and length, the data from each of the four samples should match up waveform-wise. If the length of the duplicate is different, say because of an extra interlude on a remix, the last two or three samples will not match the original.

This would also detect a file that is corrupted halfway through: samples 1 and 2 might match, but 3 and 4 would be random noise on the bad cut.
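
As a rough illustration of this scheme (a sketch only: the helper names, the simple silence threshold, and the spectrum comparison are my assumptions, not how Similarity actually works), the code below takes a 30-second window at 0%, 25%, 50%, and 75% of the audio data, skips leading silence before the first window, and compares two files window by window, so an extended remix or a cut that goes bad partway through shows up as later windows that stop matching:

```python
import numpy as np

def skip_leading_silence(audio, threshold=0.01):
    # index of the first sample louder than the (assumed) silence threshold
    loud = np.nonzero(np.abs(audio) > threshold)[0]
    return int(loud[0]) if loud.size else 0

def fingerprint(audio, sample_rate, window_sec=30, positions=(0.0, 0.25, 0.5, 0.75)):
    # one magnitude spectrum per 30-second window taken at 0%, 25%, 50%,
    # and 75% of the audio data, after skipping leading silence
    audio = audio[skip_leading_silence(audio):]
    window = int(window_sec * sample_rate)
    prints = []
    for p in positions:
        chunk = audio[int(p * len(audio)):int(p * len(audio)) + window]
        # zero-pad short chunks so every spectrum has the same length
        prints.append(np.abs(np.fft.rfft(chunk, n=window)))
    return prints

def matching_windows(prints_a, prints_b, tolerance=0.1):
    # compare window by window; later windows that stop matching point to
    # a longer remix or a file that went bad partway through
    return [np.mean(np.abs(a - b)) / (np.mean(a) + 1e-12) < tolerance
            for a, b in zip(prints_a, prints_b)]
```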

One last caveat: I don't know how your comparison code works, but if the levels differ between the original and the dupe, you would need to compensate (make the highest peaks of the samples match, i.e. normalize the quieter sample up to the hotter one) before comparing the waveforms.
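
For that caveat, a tiny sketch of the idea (again just an assumed illustration, not Similarity's routine): scale the quieter excerpt up so both peaks match before the waveform comparison.

```python
import numpy as np

def match_levels(a, b):
    # scale the quieter excerpt up so both share the same peak amplitude
    peak_a, peak_b = np.max(np.abs(a)), np.max(np.abs(b))
    if peak_a == 0 or peak_b == 0:
        return a, b  # one side is pure silence; nothing sensible to match
    target = max(peak_a, peak_b)
    return a * (target / peak_a), b * (target / peak_b)
```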

Sorry for the long post  :-[


3
Wishlist / Show comments, add average match %
« on: October 29, 2010, 16:07:11 »
First off, THANK YOU! This is an excellent app and a huge timesaver!

My only requests so far (just started using Similarity) are:

1. Show the comments tag field in the results grids. My library is all FLAC for uniformity, but many cuts were converted from other, lossier formats. I use the comments field to note the original format and quality, information that would be critical when selecting dupes to pull from the library.

2. Add a "% Total" column that averages the Content, Tags, and Precise percentages. It would provide another excellent way to sort and review the results.

3. Sync the file analysis and duplicate search results. Key needs here: checking whether dupes have poor quality so we can prune with more confidence, and having items marked in either view mark the same file in the other view. Possibly duplicate the analysis rating column on the duplicate scan grid.

Thanks!
