Show Posts

Messages - Springdream

Bugs / Re: Activation not working
« on: March 18, 2021, 21:21:33 »
It looks like we have the same behavior again. I always get Bad Gateway, also for offline activation.

Sorry to say, I can't imagine it is possible to comment on script how-tos and to teach JS programming...

General / Attached new script: scan-samename with speed improvement
« on: February 07, 2021, 17:11:40 »
I have two folders and compare only files with the same name (in this case, name and extension).
The original version takes much longer because it puts all files in the queue and only compares the names in the next step.

Here, the name comparison is done in the script itself, using an array and a manual name check. Only matching pairs are added to the queue.
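
To illustrate the idea outside the tool, here is a minimal sketch in Python (not the attached script, and not the Similarity script API). The function name `same_name_pairs` is hypothetical, names are matched case-insensitively as an assumption, and a dictionary lookup is swapped in for the array plus manual comparison described above:

```python
import os

def same_name_pairs(files_a, files_b):
    """Return (path_a, path_b) pairs whose file name (name + extension) matches."""
    # Index the second folder by lower-cased file name, so every lookup
    # is a dictionary access instead of a scan over all files.
    by_name = {}
    for path in files_b:
        by_name.setdefault(os.path.basename(path).lower(), []).append(path)

    # Only files that have a namesake in the other folder produce a pair.
    pairs = []
    for path in files_a:
        for match in by_name.get(os.path.basename(path).lower(), []):
            pairs.append((path, match))
    return pairs
```

Only the returned pairs would then be handed to the comparison queue, so files without a namesake are never compared at all.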

@Admin: maybe it is worth opening a new forum group to share sample scripts...

General / Attached new script: Copy GOOD>BAD/new subfolder
« on: February 07, 2021, 17:07:56 »
Use this if you want to compare and then copy the better files to a new subfolder of the other folder.
This way the new (only the better) files end up in separate folders.
The results are created as batch files that you can launch separately. The script also creates a second batch file that deletes the worse files.

I am thinking of a simpler approach: the song is identified correctly even if it starts e.g. 20 s after the beginning (because of ads or speech at the beginning).
=> a song that has a delay of 0 s (starts right away) is much better than a song that starts later, because otherwise you have to listen to the rubbish at the beginning

Bugs / Re: Clear Cache sometimes freezes
« on: February 02, 2020, 09:09:26 »
Typically >60,000 images in my case, although I think it is the same with audio. But when I PAUSE and then clear the cache, it always works.

At this point I'd really hope you could address the primary reason why I delete the cache: the comparison slows down more and more, and speeds up again once the cache is deleted.
=> as with the global optimization, I'd recommend disabling the cache for picture comparison too. Or introduce an option to disable (or limit) it in general.

Bugs / Clear Cache sometimes freezes
« on: October 13, 2019, 19:08:31 »
First, when comparing a very large number of pictures, the cache slows things down more than it helps.
=> I suggest disabling the cache for picture comparison too, like the setting for very large collections does.
=> From time to time I press the clear cache button during operation. That works in 1/3 of the cases, but in the rest the program stops responding, although after a restart I see that the cache has been deleted.
I hope you are able to fix/improve this. Please let me know if you need more information.

No reply so far?
Is anybody using radio recordings, e.g. from Streamripper?
Radio music files are a good source of free and legal music.
To improve their quality, ads and wrong starting times in particular need to be detected.
=> Would it be possible to output the offset (in s) of the pairs? And also to allow automarking based on it?
That would make it possible to detect that the music starts later (not at 0 s).

Thank you,

General / Re: Change cache location?
« on: January 04, 2019, 09:19:35 »
OK, the cache location can be changed now, but the use case for doing so was not mentioned.
Anyway, as written in another thread, I am wondering whether an option to switch the cache off would improve performance on collections of >100,000 files.

General / Re: Similarity seems not optimized for 300,000+ mp3 files
« on: January 04, 2019, 09:16:10 »
Happy New Year!

I found some time to work with Similarity and I am still suffering from a progressive slowdown. It gets so slow that it is almost impossible to use.
Use case: 100,000 files (radio recordings); same effect for precise comparison (with or without global optimization) and tag comparison.

Before I get into more detail, the good news: clearing the cache helps a lot. I had not cleared the cache in 5 years and it held >1,000,000 items; after clearing it, the comparison speeds up to its initial (high) speed. This also holds when the cache is cleared more often, and at the end of the comparison.

The CPU load goes down from 100% to 60% but recovers within 30 s... or a few minutes. At low CPU load the bottleneck is the access speed of the data disk (the queue is at maximum).
Then the CPU slowly becomes the bottleneck again.

To me it looks like the cache not only adds less benefit with a larger number of files, but is actually counterproductive.

=> I'd suggest adding an option to keep it off. Or (if you can imagine the root cause of the issue) implementing it differently.

Thank you,

Meanwhile, with V2.4 (I don't know since when exactly), the behavior is different again. All redundant information is removed from the super groups.

I am wondering how to get all files in a "Super Group" (smart group) from the script API?

=> use case, for radio files: I'd like to keep the file whose duration is the median of the durations of all the files that are similar. For that I need to know the individual durations, but the API provides only pairs, without the information which group number each pair belongs to.
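
As a rough sketch of a workaround, the groups can be rebuilt from the pairs alone by treating every pair as a link and collecting everything that is linked together. The Python below is only an illustration, not the Similarity script API: `groups_from_pairs`, `file_with_median_duration` and the `durations` mapping (file to seconds) are hypothetical names, and the durations are assumed to be available from somewhere else.

```python
from statistics import median_low

def groups_from_pairs(pairs):
    """Merge (file_a, file_b) pairs into groups of mutually linked files."""
    parent = {}

    def find(item):
        # Follow parent links to the group representative (union-find).
        parent.setdefault(item, item)
        while parent[item] != item:
            parent[item] = parent[parent[item]]
            item = parent[item]
        return item

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for item in list(parent):
        groups.setdefault(find(item), []).append(item)
    return list(groups.values())

def file_with_median_duration(group, durations):
    """Pick the file whose duration is the (lower) median within the group."""
    target = median_low(durations[f] for f in group)
    return next(f for f in group if durations[f] == target)
```

`median_low` is used so that the result is always the duration of a real file, also for groups with an even number of members.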


Bugs / Re: Slow scanning even with cache
« on: September 19, 2018, 17:37:37 »
I guess you simply have too little RAM.
Consider 8GB+ for your collection...

General / Re: Similarity seems not optimized for 300,000+ mp3 files
« on: September 18, 2018, 08:50:34 »
I also often use grouping, with a use case similar to the one described at the beginning:

As soon as you use grouping, not every item needs to be compared with all others; each item in e.g. group 1 only needs to be compared with the items in group 2.
=> it should be linear, shouldn't it?

A further improvement might be: as soon as one match is found (often that means one copy of the song already exists), further comparing could be stopped for that item, as it is not necessary to know that there is more than one duplicate...
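
A minimal sketch of the group-against-group comparison with this early stop, in Python. It does not reflect how Similarity works internally: `find_duplicates` is a hypothetical name and `is_similar` stands in for whatever comparison the tool performs. In the worst case (no duplicates at all) this still needs one comparison per item of group 1 for every item of group 2; the early stop only saves work for items that do have a match.

```python
def find_duplicates(group1, group2, is_similar):
    """Compare each item of group2 against group1, stopping at the first match."""
    duplicates = {}   # group2 item -> first matching group1 item
    comparisons = 0   # number of is_similar calls actually made
    for candidate in group2:
        for reference in group1:
            comparisons += 1
            if is_similar(candidate, reference):
                duplicates[candidate] = reference
                break  # one match is enough, skip the rest of group1
    return duplicates, comparisons
```

Counting the `is_similar` calls makes it easy to check how much the early stop saves on a given collection.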

Also, I notice that the count goes up to the number of items in groups 1 and 2 combined. Since group 2 is the one whose automarked files can be deleted, shouldn't the count only go up to the number of items in group 2?!

Best, Fred

Currently I get all my songs from radio. I use streamWriter to continuously write e.g. 40 channels (Italian, French, Top 100...) to disk. Within a week you get 100,000 new songs or more.

Now, the main challenge when deleting duplicates is to remove the files that have somebody talking at the beginning, or where the previous song has not yet finished.

=> a new criterion that indicates the correct beginning of a song would be helpful, but I am not sure whether it is possible to get the information from e.g. MusicBrainz that a song starts later than at 0 seconds (because of the song before). Often the song is identified properly even if it begins e.g. 20 s later.

=> an option that needs no internet information (it's not done elsewhere, so it could be something new): list the delay of all found matches. The file with the smallest delay could be the original (without speech at the beginning), or at least the file with the least disturbance.
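
If such delays were available, the automark rule could be as simple as the following Python sketch. It is purely hypothetical: the tool does not output offsets today, and `automark_by_delay` as well as the `delays` mapping (file to start delay in seconds) are made-up names for illustration.

```python
def automark_by_delay(delays):
    """Keep the file with the smallest start delay, mark all others for deletion."""
    # Ties are broken by file name so the result is deterministic.
    keep = min(delays, key=lambda name: (delays[name], name))
    marked = sorted(name for name in delays if name != keep)
    return keep, marked
```

For a group where one recording starts at 0 s and the others start 12 s and 20 s late, this keeps the 0 s recording and marks the other two.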

In the end, that would make it possible to retrieve high-quality songs even from radio station recordings.


Maybe you should spend a bit more time on it... it is the only tool on the market that EASILY does what is required for removing duplicates:
use the precise algorithm with >83%,
then analyze all files to detect quality, which is a mixture of clipping, bitrate, max frequency, etc.,
finally set a rule to automark all,
and then delete the marked files...
