Messages - hsei

Pages: 1 [2] 3 4 5
Wishlist / Re: Saving the scanned file list
« on: March 07, 2012, 19:34:03 »
Good idea!
Most of the work is already done by the cache files. In principle, the task can be achieved by keeping different clones of Similarity, each started with /portable and with its own cache files, and by moving cache files outside of the program. But a "save as / load from" feature within Similarity would be a real timesaver.

General / Re: opencl options greyed out
« on: October 08, 2011, 12:41:02 »
Current version is 11.8, but that may change within weeks.

I suppose that will be a useful feature. Brute-force comparison (everything with everything) is not feasible with large numbers of files. Reducing the search space by thresholds such as length was a first improvement, and the group vs. group feature brought processing of large collections down to acceptable times.
The latest version, 1.6.2, adds the option to restrict fingerprint comparison to files whose tags coincide sufficiently; this brought processing times down for me from hours to minutes. Of course you lose duplicates that are totally mistagged, but this can often be tolerated. A more exhaustive search can still be done later on.
I prefer software that can be tailored to one's needs by configuration.
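To illustrate why such pre-filters save so much time, here is a minimal Python sketch. It is not how Similarity works internally; the word-overlap measure, the function names and the default thresholds are all assumptions made up for this example. It only shows that cheap checks on length and tags can discard most pairs before any expensive fingerprint comparison.

```python
from itertools import combinations


def tag_overlap(tags_a, tags_b):
    """Hypothetical tag measure: shared words divided by all distinct words."""
    words_a = set(tags_a.lower().split())
    words_b = set(tags_b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def candidate_pairs(tracks, max_length_diff=5.0, min_tag_overlap=0.5):
    """Yield only the pairs that pass the cheap length and tag checks.

    tracks: list of (name, length_in_seconds, tag_string) tuples.
    Both thresholds are illustrative defaults, not Similarity settings.
    """
    for (name1, len1, tags1), (name2, len2, tags2) in combinations(tracks, 2):
        if abs(len1 - len2) > max_length_diff:
            continue  # lengths too different, skip the fingerprint comparison
        if tag_overlap(tags1, tags2) < min_tag_overlap:
            continue  # tags do not coincide enough
        yield name1, name2


tracks = [
    ("a.mp3", 200.0, "Beatles Yesterday"),
    ("b.mp3", 201.5, "The Beatles Yesterday"),
    ("c.mp3", 355.0, "Queen Bohemian Rhapsody"),
    ("d.mp3", 200.5, "Unknown Track 01"),
]
pairs = list(candidate_pairs(tracks))
print(pairs)
```

Of the six possible pairs only one is left for fingerprinting. Note that d.mp3 has a matching length but is dropped by the tag check, which is exactly the "totally mistagged" case that gets lost.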

I would not recommend working on such a large number of files in one pass. Even with 8 GB of RAM on a 64-bit system, Similarity tends to become unstable somewhere between 50k and 100k files. You can work on part of your data, move it to a master directory tree, and add more chunks of files in additional trees on later passes by using the group separation feature. Keep your cache between passes; this speeds up the process drastically. If you haven't lost your cache file, Similarity should run much faster after a restart even though it seems to start from 0 again. On a 32-bit system your cache file might have exhausted the 2 GB RAM limit.
If you have enough RAM and CPU power, you can run several instances of Similarity in parallel with the /portable switch. That way you can use more than one cache (each located in its program directory). To make full use of this approach, use parallel copies of the Similarity program, each with a copy of the cache from previous runs. If one instance crashes and cannot be restarted, you don't lose all of your work.
It's the general idea of modularity: break the problem down into smaller ones which are easier to handle (divide and conquer).
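The chunked approach can be planned with a few lines of Python. This is only a sketch: the function names and the chunk size of 50,000 are assumptions for illustration (the size is simply taken from the lower end of the range mentioned above), and the master count ignores any duplicates you delete between passes.

```python
def split_into_chunks(paths, chunk_size):
    """Split a list of file paths into consecutive chunks of at most chunk_size."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [paths[i:i + chunk_size] for i in range(0, len(paths), chunk_size)]


def plan_passes(chunks):
    """Describe each pass: how many new files meet how many files already in the master tree.

    The master count is an upper bound, since removed duplicates are not subtracted.
    """
    plan = []
    master_files = 0
    for number, chunk in enumerate(chunks, start=1):
        plan.append({"pass": number, "new_files": len(chunk), "master_files": master_files})
        master_files += len(chunk)
    return plan


# Hypothetical collection of 120,000 files, processed 50,000 at a time.
files = [f"song{i:06d}.mp3" for i in range(120_000)]
chunks = split_into_chunks(files, 50_000)
for step in plan_passes(chunks):
    print(step)
```

Each pass after the first would compare the new chunk against the master tree as group vs. group, so no single run has to hold the whole collection.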

Wishlist / Re: Tag Merging
« on: September 07, 2011, 15:45:15 »
I agree that fully automatic tag merging could be a dangerous feature, but a little more automation would be helpful. Overwriting empty fields, for example, would be a reasonable choice. Another option could be to preselect the direction of replacement between groups. When you have a large, carefully edited collection and you just want to replace some songs with others of better quality, that would be helpful. A very desirable option in such cases would be the possibility to switch songs between two groups (directories) after merging/replacing tags. At the moment this is a rather tedious procedure (edit - select - copy all - apply - move). In particular, the "move to directory" feature could be enriched by a switch or replace option.
The topic here is very similar to the thread "Automatically complete id3 tags".
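The "overwrite empty fields only" rule suggested above is easy to state precisely. The following Python sketch is just an illustration of that rule on plain dictionaries; the function names and the field names are made up, and it does not touch real files or use any tag library.

```python
def is_empty(value):
    """Treat None and blank strings as empty tag fields."""
    return value is None or str(value).strip() == ""


def fill_empty_fields(target, source):
    """Return a copy of target where only empty or missing fields are taken from source.

    Fields that already have a value in target are never overwritten.
    """
    merged = dict(target)
    for field, value in source.items():
        if is_empty(merged.get(field)) and not is_empty(value):
            merged[field] = value
    return merged


# Carefully edited file (target) and a better-quality copy with other tags (source).
target = {"artist": "Queen", "title": "", "year": None}
source = {"artist": "QUEEN (live)", "title": "Bohemian Rhapsody", "year": "1975", "genre": "Rock"}
merged = fill_empty_fields(target, source)
print(merged)
```

The carefully edited artist field survives, while title, year and genre are completed from the other file. Preselecting the direction of replacement between groups would simply fix which file plays the role of target.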

Bugs / No sorting in some columns of result window
« on: June 01, 2011, 21:20:51 »
Sorting by clicking a column header works for the file, content and tags columns, but not for the precise column and all columns after it.
(1.6.0 build 1200)

Bugs / Re: OpenCL on ATI HD 4870X2
« on: May 22, 2011, 00:47:16 »
Just follow the download link in "news - beta version 1.6.0". It always points to the latest build; a new build is sometimes mentioned in an addendum in that thread, but often not.

Bugs / Re: OpenCL on ATI HD 4870X2
« on: May 15, 2011, 20:34:53 »
Build 1141 no longer shows "calculation error" in the benchmark. But calculation time and CPU load are roughly the same with and without OpenCL enabled.

Bugs / Re: Apply in edit window loses changes
« on: May 01, 2011, 23:01:52 »
Where is build 1134 available? The download is still build 1133.

Bugs / Apply in edit window loses changes
« on: April 28, 2011, 11:33:05 »
"Copy all" does not copy file name, which could be tolerable, but when you change anything in the file name field all previous changes in the tab region are lost after pressing "Apply".

News / Re: Beta version 1.6.0
« on: April 28, 2011, 11:18:20 »
To put it in different terms:
Offline activation is for those cases where the computer on which the premium version of Similarity is to be run has no internet connection (or not always). As a side effect, the subscription key is no longer checked online at every start of Similarity, which seemed unnecessarily over-anxious to me.

Wishlist / Re: Good default settings
« on: April 22, 2011, 17:41:12 »
To make my recommendations clearer:
If I use the precise algorithm alone, I get less than 1 % false pairs at a threshold of 70 %.
If I additionally use the standard algorithm ("content") at a threshold of about 90 %, I get a few percent more pairs (with a "precise" rating close to 0), but with a higher probability of false pairs.
I don't trust auto-marking; I just use it as a "click-saver".

When you have enabled OpenCL in the corresponding options tab, you had better start a benchmark run first. I got a "calculation error" (AMD/ATI 5570), and most comparison results gave a false 100% indication.
The only way out in this case: switch off OpenCL (CUDA).

In %APPDATA%\Similarity\logs or in \logs subfolder of Similarity folder (with exe) when started with /portable switch.
The problem may also happen if Similarity is inadvertently started twice. If you have a large cache, it takes some time until the program shows up. In the meantime you may have impatiently tried to start it again.
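If you want to locate the logs from a script, the two locations named above can be resolved as in the Python sketch below. Only the two paths come from this post; the function name and its parameters are hypothetical, and the example directories are invented.

```python
import os
from pathlib import PureWindowsPath


def similarity_log_dir(portable, program_dir=None, appdata=None):
    """Return the log folder described above.

    portable=True  -> <program_dir>\\logs (Similarity started with /portable)
    portable=False -> %APPDATA%\\Similarity\\logs
    program_dir and appdata are hypothetical parameters of this sketch.
    """
    if portable:
        if program_dir is None:
            raise ValueError("program_dir is required in portable mode")
        return PureWindowsPath(program_dir) / "logs"
    appdata = appdata or os.environ.get("APPDATA")
    if not appdata:
        raise RuntimeError("APPDATA is not set")
    return PureWindowsPath(appdata) / "Similarity" / "logs"


# Example directories are invented for illustration.
print(similarity_log_dir(False, appdata=r"C:\Users\me\AppData\Roaming"))
print(similarity_log_dir(True, program_dir=r"D:\Tools\Similarity"))
```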

Wishlist / Re: Good default settings
« on: April 07, 2011, 21:16:56 »
I agree partially:
A "precise threshold" of 85-90% is reasonable. But I additionally use a standard threshold of 80% (the two are combined by OR), since the precise algorithm sometimes misses similar pairs completely. Of course these "standard" candidates have to be checked by listening, since the standard algorithm is much less reliable, but rather often I found "true" similar pairs among them.
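The OR combination described above can be written down as a small Python sketch. The function names are made up, scores are given as fractions between 0 and 1, and the defaults are the values from this post (85% precise, 80% standard); this is not Similarity's own code.

```python
def is_candidate(precise, content, precise_threshold=0.85, content_threshold=0.80):
    """A pair is reported if EITHER algorithm reaches its threshold (OR combination)."""
    return precise >= precise_threshold or content >= content_threshold


def needs_listening(precise, content, precise_threshold=0.85, content_threshold=0.80):
    """True for pairs found only by the less reliable standard ("content") algorithm."""
    return precise < precise_threshold and content >= content_threshold


# (precise, content) scores of three hypothetical pairs.
examples = [(0.92, 0.40), (0.02, 0.88), (0.50, 0.50)]
for precise, content in examples:
    print(precise, content, is_candidate(precise, content), needs_listening(precise, content))
```

The second pair is the interesting case: the precise algorithm misses it almost completely, the standard algorithm catches it, and it is flagged for checking by ear.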
