Show Posts

This section allows you to view all posts made by this member. Note that you can only see posts made in areas you currently have access to.


Messages - Springdream

Pages: [1] 2 3
1
No reply so far?
Is anybody using radio recordings, e.g. from Streamripper?
Radio music files are a good source of free and legal music.
To improve their quality, ads and wrong start times in particular need to be detected.
=> Would it be possible to output the offset (in s) of the pairs? And also to allow automarking based on it?
That would make it possible to detect that the music starts later (not at 0 s).

Thank you,
Fred
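If the comparison results did include a per-pair offset, automarking on it could be as simple as the following Python sketch. Note this is purely illustrative: the pair tuple layout and field names are invented, and the current Similarity API does not expose such an offset.

```python
# Hypothetical sketch: automark duplicates based on a per-pair start offset.
# Assumed input: (file_a, file_b, similarity, offset_s) tuples, where a
# positive offset_s means file_b's audio starts later than file_a's.

def automark_late_starts(pairs, max_offset_s=0.5):
    """Mark the file of each pair whose audio starts later (likely a
    recording that caught the tail of the previous song or an ad)."""
    marked = set()
    for file_a, file_b, similarity, offset_s in pairs:
        if abs(offset_s) <= max_offset_s:
            continue  # both start at (almost) the same point: keep both
        # positive offset: file_b starts later, so mark it for deletion
        marked.add(file_b if offset_s > 0 else file_a)
    return marked

pairs = [
    ("song1_take1.mp3", "song1_take2.mp3", 0.95, 18.0),  # take2 starts 18 s late
    ("song2_a.mp3", "song2_b.mp3", 0.91, 0.2),           # effectively aligned
]
print(automark_late_starts(pairs))  # {'song1_take2.mp3'}
```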

2
General / Re: Change cache location?
« on: January 04, 2019, 09:19:35 »
Hi,
OK, the cache location can be changed now, but the use case for doing so was not mentioned.
Anyway, as written in another thread, I am wondering whether an option to switch the cache off would improve performance on collections of more than 100,000 files.
Best,
Fred

3
General / Re: Similarity seems not optimized for 300,000+ mp3 files
« on: January 04, 2019, 09:16:10 »
Happy New Year!

I found some time to work with Similarity and I am still suffering from a progressive slowdown. It gets so slow that it is almost impossible to use.
Use case: 100,000 files (radio recordings); the same effect occurs for precise comparison (with or without global optimization) and for tag comparison.

Before I go into more detail, the good news: clearing the cache helps a lot. I had never cleared the cache in 5 years, with more than 1,000,000 items, and clearing it restores the initial (high) speed. This holds even when the cache is cleared repeatedly, or at the end of a comparison.

The CPU load goes down from 100% to 60% but recovers within 30 s... or some minutes. At low CPU load the bottleneck is the access speed of the data disk (its queue is at maximum).
Then the CPU slowly becomes the bottleneck again.

To me it looks like the cache not only adds less benefit for larger numbers of files, but is actually counterproductive.

=> I'd suggest adding an option to keep it off, or (if you can identify the root cause of the issue) implementing it differently.

Thank you,
Fred
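For scale, a quick back-of-the-envelope check of how exhaustive pairwise comparison grows (assuming no grouping or other pruning cuts the pair count):

```python
def pair_count(n):
    """Unordered comparisons needed if every file is checked against every other."""
    return n * (n - 1) // 2

print(f"{pair_count(100_000):,}")  # 4,999,950,000 pairs for 100,000 files
print(pair_count(100_000) // pair_count(10_000))  # 10x the files, ~100x the work
```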

4
Hello,
meanwhile, with V2.4 (I don't know since exactly which version), the behavior is different again: all redundant information is removed from the super groups.

I am wondering how to get all files of a "Super Group" (smart group) from the script API.

=> Use case for radio files: I like to keep the file whose duration is the median of the durations of all the similar files. For that I need to know the individual durations, but the API provides only pairs, without the information about which group # a pair belongs to.

Best,
Fred
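If the API did expose a whole super group together with per-file durations, the median-duration rule could look like this sketch. The (filename, duration) tuples are invented for illustration; as noted above, the real script API currently only returns pairs without a group id.

```python
# Hypothetical sketch: given one "super group" of similar files with their
# durations in seconds, keep the file whose duration is the median.

def pick_median_duration(group):
    """group: list of (filename, duration_s); return the filename to keep."""
    ordered = sorted(group, key=lambda item: item[1])
    return ordered[len(ordered) // 2][0]  # upper median for even-sized groups

group = [("take1.mp3", 185.0), ("take2.mp3", 201.5), ("take3.mp3", 190.2)]
print(pick_median_duration(group))  # take3.mp3 (190.2 s is the median duration)
```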

5
Bugs / Re: Slow scanning even with cache
« on: September 19, 2018, 17:37:37 »
I guess you simply have too little RAM.
Consider 8 GB+ for a collection of that size...

6
General / Re: Similarity seems not optimized for 300,000+ mp3 files
« on: September 18, 2018, 08:50:34 »
I also often use grouping, with a use case similar to the one described at the beginning:

As soon as you use grouping, not every item needs to be compared with all others; instead, each item in group 1 only needs to be compared with group 2.
=> It should be linear then, shouldn't it?

A further improvement might be: as soon as one match is found (which often means one song is already duplicated), further comparisons for that item could be stopped, as it is not necessary to know that there is more than one duplicate...

Also, I notice that the count goes up to the number of items in groups 1+2. Since group 2 is the one whose automarked files will be deleted, shouldn't the count only go up to the number of items in group 2?!

Best, Fred
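The early-exit idea can be sketched as follows (a toy Python illustration with a made-up matcher, not Similarity's actual comparison code):

```python
# Hypothetical sketch of group-vs-group comparison with early exit:
# each item of group 1 only needs ONE confirmed match in group 2 before
# further comparisons for that item can stop.

def find_duplicates(group1, group2, is_match):
    """Return items of group1 that have at least one match in group2,
    stopping at the first match per item."""
    duplicated = []
    for item in group1:
        for candidate in group2:
            if is_match(item, candidate):
                duplicated.append(item)
                break  # early exit: one duplicate is enough to mark it
    return duplicated

# toy matcher: "match" means equal title, ignoring case
g1 = ["Song A", "Song B", "Song C"]
g2 = ["song a", "song c"]
print(find_duplicates(g1, g2, lambda x, y: x.lower() == y.lower()))
# ['Song A', 'Song C']
```

Note the worst case is still |group 1| x |group 2| comparisons, so it is not strictly linear, but the early exit saves a lot of work when duplicates are common.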

7
Hello,
currently I get all my songs from the radio. I use Streamwriter to continuously write, e.g., 40 channels (Italian, French, Top 100...) to disk. Within a week you get 100,000 new songs or more.

Now, the main challenge when it comes to deleting duplicates is to delete the files that have somebody talking at the beginning, or where another song has not yet finished.

=> A new criterion indicating the correct beginning of a song would be helpful, but I am not sure whether it is possible to get from, e.g., MusicBrainz the information that a song starts later than at 0 seconds (because of the song before it). Often the song is identified properly even if it begins, e.g., 20 s late.

=> An option without internet information could be (it's not done elsewhere, so it could be something new): list the delay of every found match. The file with the smallest delay is likely the original (without speech at the beginning) or the file with the least disturbance.

In the end, that would make it possible to retrieve high-quality songs even from radio station recordings.

Best,
Fred
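The "delay of the match" could in principle be computed without any internet lookup, e.g. by cross-correlating coarse loudness envelopes of the two matched recordings. Below is a rough, self-contained Python sketch of that idea; it is not Similarity's algorithm, and it uses toy sample lists instead of real MP3 decoding.

```python
# Hypothetical sketch: estimate how many seconds one recording of a song
# starts after another by cross-correlating coarse loudness envelopes.

def envelope(samples, frame):
    """Mean absolute level per frame, with the overall mean removed."""
    env = [sum(abs(s) for s in samples[i:i + frame]) / frame
           for i in range(0, len(samples) - frame + 1, frame)]
    mean = sum(env) / len(env)
    return [e - mean for e in env]

def estimate_offset(sig_a, sig_b, rate, frame=1000):
    """Seconds by which sig_a's content starts later than sig_b's."""
    ea, eb = envelope(sig_a, frame), envelope(sig_b, frame)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-len(eb) + 1, len(ea)):
        lo, hi = max(0, -lag), min(len(eb), len(ea) - lag)
        score = sum(ea[n + lag] * eb[n] for n in range(lo, hi))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag * frame / rate

# toy data: a pseudo-song whose loudness varies second by second, and the
# same song recorded again but starting 2 s late (after the previous track)
import random
random.seed(0)
rate = 8000
song = [random.uniform(-1, 1) * (1 + (i // rate) % 3) for i in range(rate * 6)]
late = [0.0] * (2 * rate) + song
print(estimate_offset(late, song, rate))  # 2.0
```

With real files the envelopes would come from decoded audio, but the alignment principle is the same; the file with the larger positive offset is the one that caught extra material at the start.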

8
maybe you should spend a bit more time on it... as it is the only tool on the market that EASILY does what is required for removing duplicates...
use the precise algorithm with a threshold > 83%
then analyze all files to detect quality, which is a mixture of clipping, bitrate, maximum frequency, etc.
finally, set a rule to automark all
and then delete the marked files...

9
Bugs / Re: Slow scanning even with cache
« on: October 15, 2014, 18:56:24 »
The slowdown also applies to me. The first 1000 or so are already quite slow; then, at let's say 10,000, it is down to 1 song per second.
I guess Similarity has to compare each file against every other one, meaning the time effort grows quadratically with n (n·(n-1)/2 pairs)?

I also have the feeling that the cache does not speed things up much.

ALSO, some further questions on the cache:
1) Does it slow down if there are too many items in the cache that no longer exist? AND
2) If so, maybe a function like "delete cache items that no longer exist" would be nice.
3) I have the feeling that the cache refers to the absolute paths of the files. Every time a folder name changes, all files below that folder are lost from the cache. Maybe the cache could refer to other file properties instead, like a hash, checksum, etc., or name, size, and date?
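A cache key built from stable file properties would survive folder renames. A minimal Python sketch of that suggestion (Similarity's real cache format is unknown; the key layout here is made up):

```python
# Hypothetical sketch: derive a cache key from size, mtime, and a hash of
# the file's head, instead of from the absolute path.
import hashlib
import os
import tempfile

def cache_key(path, sample_bytes=65536):
    st = os.stat(path)
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        digest.update(f.read(sample_bytes))  # hash only the head: cheap to compute
    return f"{st.st_size}:{int(st.st_mtime)}:{digest.hexdigest()}"

# demo: the key survives a rename, so cached results would not be lost
folder = tempfile.mkdtemp()
old_path = os.path.join(folder, "song.mp3")
with open(old_path, "wb") as f:
    f.write(b"\xff\xfb" + b"\x00" * 4096)  # fake mp3 payload
key_before = cache_key(old_path)
new_path = os.path.join(folder, "renamed.mp3")
os.rename(old_path, new_path)
print(cache_key(new_path) == key_before)  # True
```

Hashing only the first 64 KB keeps the key fast even for large collections; the size and mtime fields catch most collisions a truncated hash could miss.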

10
Bugs / Re: Slow scanning, crashes, etc
« on: October 15, 2014, 18:45:28 »
Could it be that the algorithm comparing the songs is what slows down the progress, and not the calculation of the fingerprints? Also, with more than 100,000 songs in the cache, the cache seems to become ineffective?!

11
News / Re: Beta version 1.9.2
« on: October 15, 2014, 18:40:56 »
I would also like to try the new method you mentioned. When do you think you will make it available to the public?

12
News / Re: Beta version 1.9.2
« on: October 15, 2014, 18:39:50 »
Hello,
one suggestion regarding the time estimation (which I like): for big selections, the files and folders take quite a long time to be scanned completely. Until then, the estimate makes no sense. Maybe the scan could be finished completely at the beginning..?!
Regards...

13
General / Re: Open CL and NVIDIA CUDA - GeForce 8600GT
« on: October 12, 2014, 16:35:14 »
I could resolve it by using a quite old driver for my N460GTX.

14
Bugs / Re: Slow scanning, crashes, etc
« on: October 12, 2014, 13:14:27 »
The initial slowdown effect also applies to me, using the 1.9.1 and 1.9.2 betas...
Any solution?

15
General / Re: Open CL and NVIDIA CUDA - GeForce 8600GT
« on: October 12, 2014, 13:02:36 »
Hello,
I have the same issues on my GTX460.
Although the log says:

2014-10-12 11:43:06   OpenCL: NVIDIA CUDA - GeForce GTX 460
2014-10-12 11:43:06   OpenCL: GeForce GTX 460 version 340.52 (OpenCL 1.1 CUDA)
2014-10-12 11:43:06   OpenCL: Work group size 64
2014-10-12 11:43:06   OpenCL ready...

When I do the benchmark I get
Calculation error or device busy

The log repeatedly says

2014-10-12 11:44:05   COpenCL::Process:416 failed [false]
2014-10-12 11:44:07   COpenCL::Process:416 failed [false]
2014-10-12 11:44:15   COpenCL::Process:416 failed [false]
2014-10-12 11:44:25   COpenCL::Process:416 failed [false]
2014-10-12 11:45:43   System error [997]
2014-10-12 11:45:43   COpenCL::Process:416 failed [false]
2014-10-12 11:45:43   System error [15100]
2014-10-12 11:45:43   CAudioComparer::BlockCL:1035 failed [false]
2014-10-12 11:45:43   System error [15105]
2014-10-12 11:45:43   CAudioComparer::ProcessCL:1124 failed [false]
2014-10-12 11:45:43   OpenCL use failed
2014-10-12 11:45:43   System error [997]
2014-10-12 11:45:43   COpenCL::Process:416 failed [false]

I have no idea whether Similarity now actually uses OpenCL or not...

Any solution?

I am using Win7 Prof 64 bit


