Messages - hsei

46
Wishlist / Re: Use ID3v2 in tags
« on: December 13, 2010, 10:39:59 »
Similarity writes ID3v2 tags, but it doesn't always read them.
When I go to the edit window (Ctrl-E) and click on a file with both ID3v1 and ID3v2 tags, only the ID3v1 information (30 chars) is shown. When I add the missing information manually (obtained e.g. from Winamp file info), the long information is properly stored as ID3v2 after clicking "Apply".
It is very probably a bug in Microsoft libraries: when I look at these files in Windows Explorer, no audio information like title, album or even length is shown in the respective columns, although both ID3 tags are present. I have to say that Microsoft programs are not a reliable source for audio information (at least for MP3).
By the way: what does it mean when I get a third file entry with all "n/a"s in the edit window, although only two files were in the similarity group?

47
Wishlist / Use ID3v2 in tags
« on: December 11, 2010, 09:20:06 »
I propose to additionally use ID3v2 tags (if present) in tag editing.
ID3v1 is restricted to entries of at most 30 characters, which sometimes cuts off information.
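
To illustrate the 30-character limit, here is a minimal Python sketch of the fixed 128-byte ID3v1 block that sits at the end of an MP3 file. The function names `build_id3v1` and `parse_id3v1` are made up for this example and have nothing to do with how Similarity handles tags internally.

```python
import struct

# ID3v1 body after the 3-byte "TAG" marker:
# title(30) artist(30) album(30) year(4) comment(30) genre(1) = 125 bytes
ID3V1_LAYOUT = "=30s30s30s4s30sB"

def build_id3v1(title, artist="", album="", year="", comment="", genre=255):
    """Build a 128-byte ID3v1 block (hypothetical helper for this demo).

    struct.pack truncates or zero-pads each 's' field to its fixed width,
    which is exactly where long titles get cut off.
    """
    fields = [s.encode("latin-1") for s in (title, artist, album, year, comment)]
    return b"TAG" + struct.pack(ID3V1_LAYOUT, *fields, genre)

def parse_id3v1(block):
    """Parse a 128-byte ID3v1 block; return None if no tag marker is present."""
    if len(block) != 128 or block[:3] != b"TAG":
        return None
    title, artist, album, year, comment, genre = struct.unpack(ID3V1_LAYOUT, block[3:])

    def text(raw):
        # Fields are padded with NUL bytes (some writers use spaces instead).
        return raw.split(b"\x00", 1)[0].decode("latin-1").rstrip()

    return {
        "title": text(title),
        "artist": text(artist),
        "album": text(album),
        "year": text(year),
        "comment": text(comment),
        "genre": genre,
    }

if __name__ == "__main__":
    long_title = "A Very Long Title That Does Not Fit Into ID3v1"
    tag = parse_id3v1(build_id3v1(long_title, artist="Somebody"))
    print(len(long_title), "->", len(tag["title"]), repr(tag["title"]))
```

With a real file you would read the last 128 bytes and pass them to `parse_id3v1`; anything beyond 30 characters can only come from the ID3v2 tag at the start of the file.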

48
Wishlist / Re: Choose a folder in automatic marking for deletion
« on: December 05, 2010, 11:30:05 »
If you just want to know which titles of folder 2 (candidates) are already in your base tree, and you want to delete/disregard them and just copy/move the rest, grouping (available in the premium version) does that job. Automatic tagging in the free version is almost unusable for larger collections.
You can rearrange groups by bitrate, length, duration etc., but these are not quality criteria in a strict sense and are thus arbitrary from that point of view. I agree that these criteria may be sufficient in many cases, but a 256 kbit/s MP3 encoding obtained by re-encoding a 46 kbit/s file is worse than a one-step 128 kbit/s encoding. A file with longer duration is on average less prone to truncation at the beginning or the end, but a file with one second missing in the middle and an additional 5 sec of silence at the end is definitely not the better one, although it is longer.
Adding quality to the rearranging criteria is promised for a later version, but to be honest: for critical (e.g. historical) samples I would look into the quality comparison (clipping, spectrum ...) or even listen to both files before I would rely on any automatic criteria to delete one of them.

49
Wishlist / Re: Choose a folder in automatic marking for deletion
« on: December 03, 2010, 10:00:23 »
I partly agree with that proposal.
Having a master folder (better: master tree) and automatically deleting all duplicates in a candidate folder is not an optimal idea for me, since the candidate file might be of better quality. There is an option in the full version where the frequency content is evaluated. It's not perfect but gives reasonable hints in most cases. Having the option to keep the better one and possibly replace the file in the master tree would be very helpful (just swapping files to have the better one in the master would practically do the same). The full version has the option to form groups of directory trees, which is one step in that direction. But at the moment it seems to me that the candidates for deletion are chosen more or less arbitrarily, so I have to go through the duplicate lists manually.
Moving (merging) the residual (non-similar) files from a candidate directory to the master directory is a single drag and drop in Explorer. I doubt that it is worth the effort to add that to the Similarity program.

50
Wishlist / Re: Switching files in place
« on: November 15, 2010, 14:02:37 »
I strongly agree - it is nasty to be forced to switch to explorer every time you have a better candidate in the to-check group, especially if you have the source group in a big directory tree.
But I don't like the idea of automatic tag exchange, since source tags tend to be better, at least in a well maintained source (I'd call it target or base) group. The existing tag exchange mechanism is sufficiently easy for me (Ctrl-e + change all + apply).

51
Bugs / Re: Unclear behaviour ...
« on: November 15, 2010, 13:42:15 »
I would add that "short" can mean a few minutes if you are working with 100k+ files. That's the time needed to save the cache of about 1 GB from memory to disk. When going towards 200k files the system slows down because of excessive swapping (at 2 GB RAM), and finally the system becomes almost unusable because the GUI handler is swapped in and out. (It's a nostalgic feeling: it reminds me of Windows 2.0 on an IBM AT with 512 KB RAM.)

52
Bugs / Re: false 100% similarity
« on: October 31, 2010, 19:53:18 »
@admin: The last posts would fit better in the wishlist.

53
Bugs / Re: false 100% similarity
« on: October 31, 2010, 19:49:35 »
a) TBacker's proposal does not necessarily mean more effort: taking e.g. three 20 sec excerpts at the beginning, middle and end results in approximately the same decoding time as a single 60 sec probe, but gives higher reliability. There is a little more trouble at the borders of the excerpts, but dropped samples in one file and drastically different fade-outs (which are missed in the current version) would then most likely show up. This is probably worth the small loss in speed.

b) You can only be sure to detect all corrupted frames if you decode the whole file. That's clearly a matter of balancing speed vs. quality.

c) Normalizing to the highest peak of a sample/excerpt would not be a good idea. The standard approach for comparison in the frequency domain is to normalize to the overall average (or in other words: the component at frequency bin 0).
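
To make point a) concrete, here is a small Python sketch of how three 20 sec excerpts could be placed at the beginning, middle and end of a track. It only computes time windows; `excerpt_windows` and its parameters are my own illustration, not what Similarity actually does.

```python
def excerpt_windows(duration_s, excerpt_s=20.0, count=3):
    """Return (start, end) times in seconds for `count` evenly spread excerpts.

    The first excerpt starts at 0, the last one ends at the end of the track,
    and the others are spaced evenly in between. The total decoded time is
    count * excerpt_s, i.e. the same as one single probe of that length.
    """
    if count < 1 or excerpt_s <= 0:
        raise ValueError("count must be >= 1 and excerpt_s must be > 0")
    if duration_s <= excerpt_s * count:
        # Track is shorter than the total probe time: just decode all of it.
        return [(0.0, float(duration_s))]
    if count == 1:
        return [(0.0, float(excerpt_s))]
    step = (duration_s - excerpt_s) / (count - 1)
    return [(i * step, i * step + excerpt_s) for i in range(count)]

if __name__ == "__main__":
    # A 4-minute track: excerpts at the start, in the middle and at the end.
    for start, end in excerpt_windows(240):
        print(f"{start:6.1f} s .. {end:6.1f} s")
```

Because the last window ends exactly at the end of the track, different fade-outs or a truncated ending fall inside the compared material, which a single probe from the start would miss.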

54
Bugs / Re: false 100% similarity
« on: October 24, 2010, 09:29:43 »
The newly introduced duration check helps to get rid of most false positives, but a few 100% "precise" pairs with equal length still remain. They can easily be identified by their tag score below 10% and standard score below 50%, but the implication is: you still can't rely on a totally automatic removal of duplicates, you have to look at the lists.
A hint: all false positive pairs I found had durations below 1 min. So it is not a severe bug, but it is one.
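
As a rough sketch of that manual check in Python: the dictionary keys (`precise`, `tags`, `content`, `duration_s`) are invented for this example, since I don't know how the results list is represented internally. The thresholds are the ones from my observations above.

```python
def is_suspicious(pair):
    """Flag a 100% 'precise' match whose other scores contradict it.

    `pair` is a dict with hypothetical keys: 'precise', 'tags' and 'content'
    are scores in percent, 'duration_s' is the track length in seconds.
    """
    return (
        pair["precise"] >= 100
        and pair["tags"] < 10
        and pair["content"] < 50
    )

def review_list(pairs):
    """Return the pairs that should be checked by hand, shortest tracks first."""
    suspicious = [p for p in pairs if is_suspicious(p)]
    return sorted(suspicious, key=lambda p: p["duration_s"])

if __name__ == "__main__":
    results = [
        {"name": "real duplicate", "precise": 100, "tags": 95, "content": 98, "duration_s": 215},
        {"name": "short jingle",   "precise": 100, "tags": 4,  "content": 31, "duration_s": 42},
    ]
    for p in review_list(results):
        print("check manually:", p["name"])
```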

55
Wishlist / Filenames editable
« on: October 23, 2010, 20:03:07 »
I would like to have not only ID3 tags editable in the edit window, but also filenames.
A quick "browse this file" (Ctrl-b) is not always useful, especially if the directory contains many files.

56
Wishlist / Shortcut for Spectrum Analysis
« on: October 23, 2010, 19:57:48 »
I would greatly appreciate a keyboard shortcut for spectrum analysis. If that shortcut applied to all selected files in the results list (like e.g. delete), it would be almost perfect for me.

57
News / Re: Version 1.5.1 released
« on: October 16, 2010, 17:30:17 »
Oh, this is exactly what I asked you for. Thank you :)
http://www.music-similarity.com/forum/index.php?topic=172.0
It's a pity I have no way to pay $20 for registration :(

In time there will be a version for the former USSR with more acceptable prices, but we cannot give exact dates yet.
Does this mean that prices in other countries are unreasonable?

58
Bugs / Re: New analysis mode
« on: October 03, 2010, 12:04:51 »
OK, besides the VBR problem there is another issue:
At the moment the quality analysis is done for all files in the folder list. I would find it more appropriate to do that only for files in the results list (candidates for similarity). That would reduce processing effort drastically in most cases.
If the results list is empty, you may still perform an analysis on all files. That's a nice feature on its own - but not within the original scope of removing almost identical tracks (keeping those with higher quality).

59
Bugs / Re: New analysis mode
« on: September 24, 2010, 19:20:37 »
Regarding frequency bins:
I have found at least one other frequency indication of 22028, which means you have bins of not more than 22 Hz range (1024 FFT). What matters is not whether there is any contribution in that bin, but to what extent. In analog-to-digital conversion, a proper low-pass filter prior to sampling at 44.1 kHz should start at about 20 kHz and suppress frequencies at the Nyquist limit by at least 30 - 50 dB. An indication of a cutoff frequency above which all contributions have fallen below a given limit of some tens of dB is much more meaningful than the maximum bin with some isolated spurious (and negligible) content.
Compression to low bitrates usually uses filters with lower cutoff frequencies, but may show minor artificial high-frequency content after decompression due to incomplete restoration of the original signal. By just looking at the maximum bin you would rate these files too high.
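
The difference between the two indications can be shown with a toy spectrum in Python. This is only a sketch of the idea: the function names, the -40 dB limit and the 1000 Hz bin width are assumptions chosen for the example, and levels are taken relative to the strongest bin.

```python
import math

def levels_db(magnitudes):
    """Convert linear magnitudes to dB relative to the strongest bin."""
    peak = max(magnitudes)
    if peak <= 0:
        raise ValueError("spectrum contains no signal")
    return [20 * math.log10(m / peak) if m > 0 else float("-inf")
            for m in magnitudes]

def highest_nonzero_bin_hz(magnitudes, bin_hz):
    """Frequency of the highest bin with any content at all (the weak criterion)."""
    idx = max(i for i, m in enumerate(magnitudes) if m > 0)
    return idx * bin_hz

def cutoff_hz(magnitudes, bin_hz, limit_db=-40.0):
    """Frequency of the highest bin still at or above `limit_db`.

    Every bin above the returned frequency is weaker than the limit,
    so isolated spurious content near Nyquist is ignored.
    """
    levels = levels_db(magnitudes)
    for i in range(len(levels) - 1, -1, -1):
        if levels[i] >= limit_db:
            return i * bin_hz
    raise ValueError("no bin reaches the limit")  # unreachable: the peak is 0 dB

if __name__ == "__main__":
    # Toy spectrum: real content up to bin 3, tiny artifacts above it.
    spectrum = [1.0, 0.8, 0.5, 0.3, 0.001, 0.0005, 0.0, 0.002]
    print("highest bin with content:", highest_nonzero_bin_hz(spectrum, 1000), "Hz")
    print("cutoff at -40 dB:        ", cutoff_hz(spectrum, 1000), "Hz")
```

In the toy data the artifact in the top bin is about 54 dB below the peak, so the maximum-bin criterion reports the full bandwidth while the threshold criterion reports the real cutoff.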

60
Bugs / Re: New analysis mode
« on: September 24, 2010, 17:28:49 »
OK, I see your problem. Similar behaviour appears in the values shown by Windows Explorer. They are sometimes completely wrong for bitrates (not only for VBR). You cannot trust these data for sensitive operations like deleting the "worse" duplicate.
