Show Posts

This section allows you to view all posts made by this member. Note that you can only see posts made in areas you currently have access to.


Messages - muse2u

Pages: [1]
1
Many use the TT Dynamic Range Meter as a tool to distinguish between audio tracks with a high dynamic range and those with "loudness" or "brickwall" issues. My understanding is that the DR Meter algorithm determines the difference (the range) between the highest and lowest volume. The higher the DR value (e.g. in the 13-14 range), the greater the dynamic range and the less "brickwalled" an audio track is. For many, therefore, a higher DR value translates to a higher quality audio track.
As far as I can tell, Similarity's Max (abs) is the analysis result closest to a dynamic range evaluation. My understanding is that the closer this value is to 1.0, the higher the quality of the audio track. But more often than not I find that an audio track with a high TT DR value does not necessarily have a high Max (abs). I assume the algorithms are different. As far as I can tell, the TT DR Meter measures the actual dynamic range, whereas the Max (abs) algorithm measures the dynamic range relative to the maximum potential range for the characteristics of the track (bit rate, sample rate, etc.).
For example, an audio track could have a high dynamic range (i.e. a high TT DR) but, due to the characteristics of the track, not make use of the full range potential, and therefore get a lower Max (abs).
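To illustrate the distinction I mean, here is a rough toy sketch in Python. This is my own illustration, not the actual TT DR or Similarity algorithm: it uses the largest absolute sample as a stand-in for Max (abs), and the crest factor (peak over RMS, in dB) as a crude dynamic-range proxy. A quiet but dynamic signal scores low on peak level yet high on the DR proxy; a loud, hard-clipped signal does the opposite.

```python
# Toy illustration (NOT the real TT DR or Similarity algorithms): how a track
# can be dynamic yet have a low peak level, and vice versa.
import math

def max_abs(samples):
    """Peak level: largest absolute sample value (1.0 = digital full scale)."""
    return max(abs(s) for s in samples)

def dr_proxy(samples):
    """Crest factor in dB (peak over RMS) as a crude dynamic-range proxy."""
    peak = max_abs(samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(peak / rms)

n = 48000  # one second at 44.1k-ish rates; exact rate is irrelevant here

# Quiet but dynamic: a 440 Hz tone that swells from silence, peaking at 0.5.
dynamic_quiet = [0.5 * (i / n) * math.sin(2 * math.pi * 440 * i / n)
                 for i in range(1, n + 1)]

# Loud but brickwalled: the same tone overdriven and hard-clipped at full scale.
brickwalled = [max(-1.0, min(1.0, 3.0 * math.sin(2 * math.pi * 440 * i / n)))
               for i in range(1, n + 1)]

print(max_abs(dynamic_quiet), dr_proxy(dynamic_quiet))   # low peak, high DR
print(max_abs(brickwalled), dr_proxy(brickwalled))       # full peak, low DR
```

The swelling tone never reaches full scale (so a peak-based figure like Max (abs) stays around 0.5), yet its peak sits well above its average level, while the clipped tone pegs the peak at 1.0 with almost no peak-to-average spread.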
Am I interpreting the Similarity algorithm correctly? I am a little nervous about deleting audio tracks with high TT DR values when Similarity's "quality" rating penalizes them for lower Max (abs) values.

2
General / Re: Similarity efficiency (speed) and recovery/restart.
« on: October 25, 2012, 21:45:27 »
1. So that means the total files counter includes images, which explains the large number. I didn't think I had that many audio files.

2. Good suggestion to back up the cache file. The only scan that added entries to the cache counter was a single short analysis. All other long analysis runs either failed during execution (not necessarily caused by Similarity) or added nothing when the program was shut down after a long analysis (including the last one I mentioned in the post above).

3. I don't quite understand the use of groups but will investigate. Are you suggesting that if the folders in a Similarity analysis run are not assigned to a group, no file comparisons take place? That is what I am trying to accomplish: I do not want an automated duplicate check to take place during the analysis. I don't have a problem with Similarity taking a fingerprint of a few seconds of each song during the analysis and storing the results in the database, but I don't want the added overhead of the actual duplicate-check processing during analysis. Those are things I would want the option of turning off. If I want a duplicate check at a later date, I want Similarity to use the initial analysis results to do it.

4. I will look forward to any improvement in stability.

You did not answer two very important questions.

During the analysis (in this case with the files/folders not assigned to a group), are the results stored in memory to perform immediate duplicate checks, or are they periodically written to disk to make recovery easier, use less memory, and perform faster? The latter is what I would prefer, and I would want options to set those preferences.

The other: since the music database is constantly being maintained (e.g. audio tags changed, file and folder names changed, one track replaced by another, tracks deleted), does Similarity recognize these changes (at startup? on request?) and reanalyze only those files that have changed, or does it have to reanalyze the whole database, which, as I have indicated, could take a significant amount of time?
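I don't know how Similarity tracks changes internally, but the kind of incremental rescan I'm asking for could be sketched like this (my own hypothetical example, using file size and modification time as the change signal):

```python
# Sketch of an incremental rescan (NOT Similarity's actual cache format):
# compare each file's size and modification time against stored values and
# re-analyze only files that are new or have changed since the last run.
import os
import tempfile

def files_to_reanalyze(root, cache):
    """cache maps path -> (size, mtime) from the previous run.
    Returns (changed_or_new, deleted) paths relative to that cache."""
    seen, changed = set(), []
    for dirpath, _, names in os.walk(root):
        for name in sorted(names):
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            seen.add(path)
            if cache.get(path) != (st.st_size, int(st.st_mtime)):
                changed.append(path)
    deleted = sorted(p for p in cache if p not in seen)
    return changed, deleted

# Demo: a tiny "collection" of two files, one of which changes between runs.
root = tempfile.mkdtemp()
a, b = os.path.join(root, "a.flac"), os.path.join(root, "b.flac")
for p in (a, b):
    with open(p, "w") as f:
        f.write("audio data")

# First run: nothing is cached yet, so everything needs analysis.
cache = {}
changed, deleted = files_to_reanalyze(root, cache)
for p in changed:
    st = os.stat(p)
    cache[p] = (st.st_size, int(st.st_mtime))  # "analyze" and record

# Second run after editing one file: only that file is flagged for re-analysis.
with open(a, "w") as f:
    f.write("retagged audio data!")  # different size -> detected as changed
changed2, deleted2 = files_to_reanalyze(root, cache)
```

On the second pass only the edited file comes back as changed, which is exactly the behavior I'd want from a rescan of a large, slowly evolving collection.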

3
General / Similarity efficiency (speed) and recovery/restart.
« on: October 12, 2012, 21:24:24 »
I bought Similarity Premium about a year ago. I am currently running version 1.7.1.

Over the years I have been digitizing my entire LP and CD collection. In some cases I have not only the original vinyl LP, but the first CD pressing and subsequent remasters. I am at the point where I would like to review similar albums, select the best audio version, and make only that one available to a jukebox-type application. Listening to each album and making a subjective analysis and selection is not possible. I purchased Similarity as a tool to make a technical analysis of each track, provide an overall evaluation, and give me some guidance. In the end, before I remove the technically inferior versions, I listen to each track to confirm that the technical evaluation is in line with my subjective evaluation. Occasionally, a technically inferior version (e.g. excessive clipping, reduced high-frequency range) may in fact sound better.

For my purposes, looking for duplicate tracks or comparing individual tracks to cherry-pick the best is not a requirement. Cobbling together a single album by selecting the technically best tracks from several versions results in an uneven and confusing listening experience. Neither is processing or comparing pictures a requirement.

My music collection is organized as Artist/ReleaseYear-Album Title-ReissueYear/Track. So I simply want Similarity to process each track sequentially, analyze it, and log the result in alphabetic order (the same order it was processed in). This would place each similar album near its predecessor in the log and make visual comparison easy. A subsequent ability to sort columns to identify particularly bad tracks, or some sort of search and filter function, would be useful.

Up to now I have been using Similarity to analyze the occasional album, and I have been trying to learn how to interpret the results. I have only recently realized the relationship between the dynamic range analysis provided by a Foobar add-on and Similarity's Max (abs) field. I have reviewed the documentation for the Mean and Abs fields several times but still can't relate them to the listening experience.

I now want to use Similarity to process the complete collection and identify the best album versions. I am running the program on a backup computer in my home. It is an 'older' computer (older than my main computer, but still fast enough to provide 24/7 server functionality) that I have highly optimized for Similarity processing, yet it still takes about two days to process 30,000 files. The time is not a big issue: I turn Similarity on, come back in two days, and all would be good, except when an instability occurs in the program or computer and I have to restart Similarity. I don't want to spend another two days going through the whole analysis.

I have reviewed the forum for any similar experiences and came across a discussion here
http://www.similarityapp.com/forum/index.php?topic=841.0
If I understand the Admin's response, Similarity analyzes each file, puts the results in memory, and performs comparisons on the file. This naturally leads to memory limitations, slows the process down as more and more files are added, and, if the results are not written to disk, loses everything if the program fails. The point was that Similarity was not designed to process large collections.
The Admin also suggests that not all is lost, because when Similarity restarts it does not reanalyze files already in the cache.

I apologize for the long prologue to my questions, but I want to make sure you understand where I am coming from.

1. While Similarity is analyzing, the status bar shows a total file count and a currently-processed count. Is this a count of audio files only, or does it also count images being processed?

2. The initial Similarity scan failed after processing about 15,000 files (computer blue screen of death; Similarity may not have been the cause). When I restarted Similarity, I saw no update of the cache counter. The second run, which completed successfully, appeared to be a little faster, but that was probably because I raised Similarity's processing priority, not because of any previously stored analysis results. I now have 30,000 files analyzed, but I am afraid to turn Similarity off because I don't know how to get the files into the cache. I don't want to rerun the complete analysis every time I start Similarity. So how do I get the analysis stored for future viewing?

3. I selected 30,000 files as an initial run, realizing the processing-time constraints. I will now go in and remove bad albums, fix some file names and tags, and do other file editing. I then want to add a new batch of albums and do a similar run against them. I want Similarity to review the already-scanned files for changes, remove deleted files from its database, re-analyze any that have changed, ignore those that have not changed but leave them in the analysis result lists, then proceed to analyze the newly added files, and add all changes and additions to the cache for subsequent runs. After several runs, I expect to have a visible analysis log of all files in the collection. It is not clear from the response to the above forum thread that Similarity can do this. Can it?

4. The Admin's response to the forum question suggested that Similarity's processing limitations come from all analysis results having to be maintained in memory, slowing the program down and making it unstable. If that is the case, as a Similarity customer I would prefer being able to choose which files to include in the analysis (audio, images, or both) and whether or not a comparison should take place in real time. In my case I would turn off image processing and real-time comparison. This would leave only a few analysis results in memory at a time (just enough to optimize disk writes) and give faster processing, because all the program has to do is read the file list sequentially, determine changes, deletions, and additions, process as required, and write the results to disk. There is no need to do comparative analysis in this run. If the results are written to disk shortly after each analysis, and the database is checkpointed properly, there would be no risk of losing the results of a long run. Am I misinterpreting the Admin's response? The suggestion seems to be that I can't tell Similarity to avoid keeping analysis results in memory and to skip the comparison during analysis.
If turning off image processing and real-time comparison saves time, I would offset the saving with an option to do a spectrum and sonogram analysis during the analysis scan and add that to the collection database. The time and space to do this may not be acceptable to all Similarity users, so these should be selectable options.
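To make concrete what I mean by "written to disk shortly after analysis" and "checkpointed properly", here is a sketch of the behavior I'd like (purely my own illustration of the idea, not how Similarity actually stores results): buffer a handful of results and commit each small batch in one transaction, so a crash loses at most the current unwritten batch.

```python
# Sketch of batched, checkpointed result storage (NOT Similarity's actual
# storage): buffer a few results in memory, then commit each batch in one
# transaction, so a crash loses at most the current unwritten batch.
import sqlite3

BATCH = 100  # flush to disk every 100 analyzed files

conn = sqlite3.connect(":memory:")  # a real tool would use a file on disk
conn.execute("CREATE TABLE results (path TEXT PRIMARY KEY, max_abs REAL)")

pending = []

def record(path, max_abs):
    """Queue one analysis result; flush automatically when the batch fills."""
    pending.append((path, max_abs))
    if len(pending) >= BATCH:
        flush()

def flush():
    """Commit the queued batch in a single transaction (a durable checkpoint)."""
    if pending:
        with conn:  # context manager commits (or rolls back) the transaction
            conn.executemany(
                "INSERT OR REPLACE INTO results VALUES (?, ?)", pending)
        pending.clear()

# Simulate analyzing 250 tracks; all committed batches survive a crash.
for i in range(250):
    record(f"/music/track{i:03}.flac", 0.8)
flush()  # final partial batch

count = conn.execute("SELECT COUNT(*) FROM results").fetchone()[0]
```

With a setup like this, restarting after a failure would cost at most the last hundred files, not a two-day run.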
