Guest 98 Guy Posted October 25, 2008 Posted October 25, 2008 Is there a hard drive file organizer, optimizer, "cleaner", that will a) identify multiple copies of the same file based on - same file name (but perhaps binary identical, perhaps not) - different name, but binary identical and either follow your rules (eg smart rename) or allow you to easily move them around or delete the duplicates b) identify, catalog or list the version numbers of multiple copies of the same system files (.exe, dll, etc) The goal is to use such software (instead of manual exploration and file cutting and pasting) to organize and aggregate the contents of possibly up to a few dozen hard drives. Most hard drive file organizing software seems to focus on cataloging multi-media files rather than perform the desired optimizations, file-moves, and file-reductions.
Guest philo Posted October 25, 2008 Posted October 25, 2008 Re: Is there a hard drive file organizer that will ...

"98 Guy" <98@Guy.com> wrote in message news:49029404.B4BBB9F5@Guy.com...
> Is there a hard drive file organizer, optimizer, "cleaner", that will
>
> a) identify multiple copies of the same file based on
> - same file name (but perhaps binary identical, perhaps not)
> - different name, but binary identical
> and either follow your rules (eg smart rename) or allow you
> to easily move them around or delete the duplicates
>
> b) identify, catalog or list the version numbers of multiple
> copies of the same system files (.exe, dll, etc)
>
> The goal is to use such software (instead of manual exploration and file
> cutting and pasting) to organize and aggregate the contents of possibly
> up to a few dozen hard drives.
>
> Most hard drive file organizing software seems to focus on cataloging
> multi-media files rather than perform the desired optimizations,
> file-moves, and file-reductions.

Actually, you already have the tool. Just use the Windows search function and set it to scan all hard drives. Next, set the file size to search for all files larger than a specified amount.

When I am trying to clean up my drives I could generally care less if I have a few too many 1k txt files... but if I have (for example) three copies of SP1 for XP on my drive... I'd want to know about it (and delete them all), so I typically set the search size to 100,000 K or more.
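For anyone who would rather script philo's "search everything above a size threshold" step than click through the Windows search dialog, here is a minimal Python sketch. The function name `find_large_files` and the example root path are assumptions made for illustration; nothing here comes from the Windows search tool itself.

```python
import os


def find_large_files(root, min_bytes):
    """Return (size, path) pairs for files of at least min_bytes, largest first."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # unreadable or vanished file: skip it
            if size >= min_bytes:
                hits.append((size, path))
    hits.sort(reverse=True)  # biggest files at the top of the list
    return hits


if __name__ == "__main__":
    # 100,000 K, the threshold mentioned in the post above.
    for size, path in find_large_files("C:\\", 100_000 * 1024):
        print(f"{size:>14,d}  {path}")
```

This only lists candidates; deciding which copies to delete is still left to the person at the keyboard.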
Guest J. P. Gilliver (John) Posted October 25, 2008 Posted October 25, 2008 Re: Is there a hard drive file organizer that will ... In message <49029404.B4BBB9F5@Guy.com>, 98 Guy <98@Guy.com> writes >Is there a hard drive file organizer, optimizer, "cleaner", that will > >a) identify multiple copies of the same file based on > - same file name (but perhaps binary identical, perhaps not) > - different name, but binary identical [] Not sure if it will do anything like all you want, but I find David Taylor's FindDup - http://www.david-taylor.pwp.blueyonder.co.uk/software/disk.html#FindDuplicates - reasonably useful. It certainly lists (see the screenshot at http://www.david-taylor.pwp.blueyonder.co.uk/images/FindDupl.gif , which also shows that it can look over several drives) size, name, version number, and so on of the files it has found. I'm not sure it will find same name but not identical, but as another has said, the ordinary Windows find function will do that, if you tell it to sort by name (click on the Name column header) when it's finished searching. -- J. P. Gilliver. UMRA: 1960/<1985 MB++G.5AL(+++)IS-P--Ch+(p)Ar+T[?]H+Sh0!:`)DNAf Lada for sale - see http://www.autotrader.co.uk "Don't worry about people stealing your ideas. If your ideas are any good, you'll have to ram them down people's throats." - Howard Aiken
Guest 98 Guy Posted October 25, 2008 Posted October 25, 2008 Re: Is there a hard drive file organizer that will ...

"J. P. Gilliver (John)" wrote:
> Not sure if it will do anything like all you want, but I find David
> Taylor's FindDup reasonably useful. It certainly lists

It requires that you enter a file-spec to perform the search, and it appears to not perform a binary compare to tell you that the files are the same or different. It probably won't find copies of the same file (binary identical) that have different names.

> .. find same name but not identical, but as another has said,
> the ordinary Windows find function will do that,

I don't want to have to perform manual searches. I'm wondering if there is a program that can generate an inventory of all files (and probably compute a checksum for each file) and then compare every result against every other result and allow me to easily identify and manipulate (move, rename, copy) the duplicates.
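The inventory-plus-checksum idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: MD5 is used because the thread mentions it later, and the names `file_digest`, `build_inventory` and `duplicates` are hypothetical, not part of any tool named in this thread. Grouping by digest avoids comparing every file against every other file.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_inventory(root):
    """Map each digest to the list of paths under root that have that content."""
    inventory = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                inventory[file_digest(path)].append(path)
            except OSError:
                continue  # locked or unreadable file: leave it out
    return inventory


def duplicates(inventory):
    """Keep only the digests shared by two or more files."""
    return {d: sorted(paths) for d, paths in inventory.items() if len(paths) > 1}
```

Files that share a digest are binary-identical candidates regardless of their names; moving, renaming or deleting them would be a separate step driven by the returned groups.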
Guest philo Posted October 25, 2008 Posted October 25, 2008 Re: Is there a hard drive file organizer that will ... "J. P. Gilliver (John)" <G6JPG@soft255.demon.co.uk> wrote in message news:Dzq5e6bsZwAJFw8S@soft255.demon.co.uk... > In message <49029404.B4BBB9F5@Guy.com>, 98 Guy <98@Guy.com> writes > >Is there a hard drive file organizer, optimizer, "cleaner", that will > > > >a) identify multiple copies of the same file based on > > - same file name (but perhaps binary identical, perhaps not) > > - different name, but binary identical > [] > Not sure if it will do anything like all you want, but I find David > Taylor's FindDup - > http://www.david-taylor.pwp.blueyonder.co.uk/software/disk.html#FindDuplicates > - reasonably useful. It certainly lists (see the screenshot at > http://www.david-taylor.pwp.blueyonder.co.uk/images/FindDupl.gif > , which also shows that it can look over several drives) size, name, > version number, and so on of the files it has found. I'm not sure it > will find same name but not identical, but as another has said, the > ordinary Windows find function will do that, if you tell it to sort by > name (click on the Name column header) when it's finished searching. > -- > J. P. Gilliver. UMRA: 1960/<1985 MB++G.5AL(+++)IS-P--Ch+(p)Ar+T[?]H+Sh0!:`)DNAf > Lada for sale - see http://www.autotrader.co.uk > > "Don't worry about people stealing your ideas. If your ideas are any good, > you'll have to ram them down people's throats." - Howard Aiken Sheesh, your advice is too sensible... as your sig says (in so many words) no one will follow it <G>
Guest thanatoid Posted October 25, 2008 Posted October 25, 2008 Re: Is there a hard drive file organizer that will ... 98 Guy <98@Guy.com> wrote in news:49029404.B4BBB9F5@Guy.com: > Is there a hard drive file organizer, optimizer, "cleaner", > that will > > a) identify multiple copies of the same file based on > - same file name (but perhaps binary identical, perhaps > not) - different name, but binary identical > and either follow your rules (eg smart rename) or allow > you to easily move them around or delete the duplicates > > b) identify, catalog or list the version numbers of > multiple > copies of the same system files (.exe, dll, etc) > > The goal is to use such software (instead of manual > exploration and file cutting and pasting) to organize and > aggregate the contents of possibly up to a few dozen hard > drives. > > Most hard drive file organizing software seems to focus on > cataloging multi-media files rather than perform the > desired optimizations, file-moves, and file-reductions. A decent file manager (Total Commander or possibly even the slightly stripped-down Free Commander) will do most of these functions. (TC works forever in demo mode.) Put two drives, directories, subdirectories, or whatever, sorted by whatever in the two panes and a hit of ONE key will instantly show you duplicates, etc. Then you can work further with the results. You can show ALL the files on an entire drive/directory with 100's of subdirs/etc.) in ONE window (branching) and work with that. "Unique Filer" can search for image or non-binary dupes by content, name, size, etc. I'm not aware of a dupe search utility that will go as far as MD5'ing (although one may well exist), but TC will compare any 2 files BY content, byte by byte. Anyway, how many identically- sized files of the same name are you gonna have that you expect to be different? -- Those who cast the votes decide nothing. Those who count the votes decide everything. 
- Josef Stalin NB: Not only is my KF over 4 KB and growing, I am also filtering everything from discussions.microsoft and google groups, so no offense if you don't get a reply/comment unless I see you quoted in another post.
Guest philo Posted October 25, 2008 Posted October 25, 2008 Re: Is there a hard drive file organizer that will ... "thanatoid" <waiting@the.exit.invalid> wrote in message news:Xns9B427DB5FC8F8thanexit@209.197.15.184... > 98 Guy <98@Guy.com> wrote in news:49029404.B4BBB9F5@Guy.com: > > > Is there a hard drive file organizer, optimizer, "cleaner", > > that will > > > > a) identify multiple copies of the same file based on > > - same file name (but perhaps binary identical, perhaps > > not) - different name, but binary identical > > and either follow your rules (eg smart rename) or allow > > you to easily move them around or delete the duplicates > > > > b) identify, catalog or list the version numbers of > > multiple > > copies of the same system files (.exe, dll, etc) > > > > The goal is to use such software (instead of manual > > exploration and file cutting and pasting) to organize and > > aggregate the contents of possibly up to a few dozen hard > > drives. > > > > Most hard drive file organizing software seems to focus on > > cataloging multi-media files rather than perform the > > desired optimizations, file-moves, and file-reductions. > > A decent file manager (Total Commander or possibly even the > slightly stripped-down Free Commander) will do most of these > functions. (TC works forever in demo mode.) > > Put two drives, directories, subdirectories, or whatever, sorted > by whatever in the two panes and a hit of ONE key will instantly > show you duplicates, etc. Then you can work further with the > results. You can show ALL the files on an entire drive/directory > with 100's of subdirs/etc.) in ONE window (branching) and work > with that. > > "Unique Filer" can search for image or non-binary dupes by > content, name, size, etc. > > I'm not aware of a dupe search utility that will go as far as > MD5'ing (although one may well exist), but TC will compare any 2 > files BY content, byte by byte. 
Anyway, how many identically- > sized files of the same name are you gonna have that you expect > to be different? > > It would be easy enough to look for such utilities on Google but by the time you'd find it install it and run it... you could have done the same with Windows===> find then sorting and deleting as needed a five minute job at most
Guest 98 Guy Posted October 25, 2008 Posted October 25, 2008 Re: Is there a hard drive file organizer that will ...

thanatoid wrote:
> A decent file manager (Total Commander or possibly even the
> slightly stripped-down Free Commander) will do most of these
> functions. (TC works forever in demo mode.)

I want to distill, aggregate or combine the contents of many hard drives, with many different types of files located in directories with names that don't necessarily match other drives.

Because I can't connect more than 1 or 2 extra drives to a system at the same time, the first step is to perform such an aggregation on each drive individually.

Then copy the contents of each drive onto a single massive drive (each into it's own directory probably) and then aggregate or combine all their contents such that there are no duplicate files.

Files with the same name that are not identical (say, .PST files) would either be auto-renamed (so they can co-exist in the same destination directory) or maybe I can set a rule so that, say, if the files are Word .DOC files, then keep only the file with the most recent creation or modification date. Perhaps I wanted all .DOC files to be aggregated into a specific destination directory, and all .XLS into another.

Entire directories that have identical contents can be easily identified and the duplicates removed, leaving only 1 copy (ie MS office clipart directories).

Files that are identical (exact binary match) but with different names would be deleted (perhaps with manual guidance).

To go one step further - it would be great to identify file-sets that have been unpacked from larger archive files and allow me to decide if I want to keep the file-sets (and delete the archive file) or vice-versa.
Perhaps this task is not so useful for someone who only has a single computer in his/her possession, but for a SOHO situation where you have perhaps 5 to 10 years worth of computer use by an office with 2, 5, 10 or 20 people, you tend to build up a collection of hard drives that one day you want to organize and retrieve the contents of to make them available to others, and to wipe the original drives before donating or discarding.

As well, my own working drives are littered with driver, application, utility and system files (and archive files of each type) that I've downloaded from the net over the years that I know I have multiple copies of (and multiple versions), and sorting manually through that mess would be time consuming.
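The merge rules in the post above (skip identical files, keep only the newest copy for some extensions, auto-rename everything else) could be expressed roughly as follows. This is a sketch under assumptions: the function name `merge_file`, the `keep_newest_exts` parameter and the " (1)" renaming scheme are all invented for illustration, and modification time stands in for "most recent creation or modification date".

```python
import filecmp
import os
import shutil


def merge_file(src, dest_dir, keep_newest_exts=(".doc",)):
    """Copy src into dest_dir, resolving name clashes; return the final path."""
    os.makedirs(dest_dir, exist_ok=True)
    name = os.path.basename(src)
    target = os.path.join(dest_dir, name)
    if not os.path.exists(target):
        shutil.copy2(src, target)  # copy2 also preserves the modification time
        return target
    if filecmp.cmp(src, target, shallow=False):
        return target  # binary identical: nothing new to keep
    stem, ext = os.path.splitext(name)
    if ext.lower() in keep_newest_exts:
        # Rule: for these types keep only the most recently modified copy.
        if os.path.getmtime(src) > os.path.getmtime(target):
            shutil.copy2(src, target)
        return target
    # Rule: otherwise let both versions co-exist under a numbered name.
    n = 1
    while True:
        candidate = os.path.join(dest_dir, f"{stem} ({n}){ext}")
        if not os.path.exists(candidate):
            shutil.copy2(src, candidate)
            return candidate
        if filecmp.cmp(src, candidate, shallow=False):
            return candidate  # this version was already merged earlier
        n += 1
```

Routing all .DOC files to one directory and all .XLS files to another would then be a matter of choosing `dest_dir` from the extension before calling the function.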
Guest Franc Zabkar Posted October 25, 2008 Posted October 25, 2008 Re: Is there a hard drive file organizer that will ...

On Sat, 25 Oct 2008 10:44:25 -0400, 98 Guy <98@Guy.com> put finger to keyboard and composed:

>"J. P. Gilliver (John)" wrote:
>
>> Not sure if it will do anything like all you want, but I find David
>> Taylor's FindDup reasonably useful. It certainly lists
>
>It requires that you enter a file-spec to perform the search, and it
>appears to not perform a binary compare to tell you that the files are
>the same or different. It probably won't find copies of the same file
>(binary identical) that have different names.
>
>> .. find same name but not identical, but as another has said,
>> the ordinary Windows find function will do that,
>
>I don't want to have to perform manual searches. I'm wondering if there
>is a progam that can take generate an inventory of all files (and
>probably compute a checksum for each file) and then compare every result
>against every other result and allow me to easily identify and
>manipulate (move, rename, copy) the duplicates.

It's not exactly what you want, but I've just tried this utility:
http://www.fastsum.com/download.php

It has a nice DOS CLI (or you can pay $15 for the GUI version):
http://www.fastsum.com/download/fsum.zip

I use the DOS Sort command to sort the output, after which I extract the duplicate checksums with a QBasic program (see below).

Here is an excerpt:

012C1034B8E1612EC527CB21D1B8EBCE *Program Files\Google Earth\RES\PAL5\ICON34L.PNG
012C1034B8E1612EC527CB21D1B8EBCE *Program Files\Google Earth\RES\PAL5\ICON42L.PNG
014332C7F61513329BFF2EB110EB2485 *WIN98SE\SYSBCKUP\COMPOBJ.DLL
014332C7F61513329BFF2EB110EB2485 *WIN98SE\SYSTEM\COMPOBJ.DLL
014CEC45FA7B59CE0B4639AE613A9FD3 *WIN98SE\SYSBCKUP\MSANALOG.VXD
014CEC45FA7B59CE0B4639AE613A9FD3 *WIN98SE\SYSTEM\MSANALOG.VXD

I'm probably a very bad housekeeper, but on my C: drive there are 4500 duplicated files. <yikes>

AIUI, FastSum can include other information about the files in its report, eg file size. You can also restrict your analysis to files of a certain type, eg *.doc.

This is my quick and dirty QBasic program:

==============================================
REM This program extracts duplicate files
OPEN "c_md5.srt" FOR INPUT AS #1
OPEN "c_md5.lst" FOR OUTPUT AS #2
LINE INPUT #1, lin1$
WHILE NOT EOF(1)
LINE INPUT #1, lin2$
len1$ = LEFT$(lin1$, 32)
len2$ = LEFT$(lin2$, 32)
IF len2$ <> len1$ THEN match = 0: GOTO 100
IF match = 0 THEN PRINT #2, lin1$
PRINT #2, lin2$
match = 1
100 lin1$ = lin2$
WEND
CLOSE
==============================================

- Franc Zabkar
--
Please remove one 'i' from my address when replying by email.
Guest 98 Guy Posted October 25, 2008 Posted October 25, 2008 Re: Is there a hard drive file organizer that will ... Franc Zabkar wrote: > It's not exactly what you want, but I've just tried this utility: > http://www.fastsum.com/download.php > > I use the DOS Sort command to sort the output, after which I > extract the duplicate checksums with a QBasic program (see below). Thanks Franc. I'll try it.
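For readers without QBasic to hand, Franc's sort-then-scan step can be approximated in Python. The sketch assumes the report format shown in his excerpt (a 32-character MD5 digest at the start of each line, followed by ` *` and the path); the function name `duplicate_entries` is hypothetical and the sorting is done in the function itself instead of with the DOS Sort command.

```python
from itertools import groupby


def duplicate_entries(lines, digest_len=32):
    """Return the report lines whose leading digest occurs more than once."""
    rows = sorted(line.rstrip("\n") for line in lines if line.strip())
    result = []
    # After sorting, lines with the same digest sit next to each other.
    for _digest, group in groupby(rows, key=lambda row: row[:digest_len]):
        members = list(group)
        if len(members) > 1:
            result.extend(members)
    return result


if __name__ == "__main__":
    # "c_md5.txt" is a placeholder name for an unsorted checksum report.
    with open("c_md5.txt") as report:
        for entry in duplicate_entries(report):
            print(entry)
```

Like the QBasic version, this keeps every member of each duplicate group and drops the lines whose checksum appears only once.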
Guest J. P. Gilliver (John) Posted October 26, 2008 Posted October 26, 2008 Re: Is there a hard drive file organizer that will ... In message <490330C9.4C5C5E5F@Guy.com>, 98 Guy <98@Guy.com> writes >"J. P. Gilliver (John)" wrote: > >> Not sure if it will do anything like all you want, but I find David >> Taylor's FindDup reasonably useful. It certainly lists > http://www.david-taylor.pwp.blueyonder.co.uk/software/disk.html#FindDuplicates >It requires that you enter a file-spec to perform the search, and it The filespec can be *.* (IIRR, it defaults to that). >appears to not perform a binary compare to tell you that the files are >the same or different. It probably won't find copies of the same file >(binary identical) that have different names. Oh, it does. Try it; David Taylor's utilities don't make any registry changes etc., so can be tried quite safely and simply. [] -- J. P. Gilliver. UMRA: 1960/<1985 MB++G.5AL(+++)IS-P--Ch+(p)Ar+T[?]H+Sh0!:`)DNAf Lada for sale - see http://www.autotrader.co.uk This trip should be called "Driving Miss Crazy" - Emma Wilson, on crossing the southern United States with her mother, Ann Robinson, 2003 or 2004
Guest thanatoid Posted October 26, 2008 Posted October 26, 2008 Re: Is there a hard drive file organizer that will ... "philo" <philo@privacy.net> wrote in news:#BuFxjsNJHA.4428@TK2MSFTNGP04.phx.gbl: > It would be easy enough to look for such utilities on > Google I thought I'd spare 98 Guy that statement ;-) > but by the time you'd find it > install it > and run it... > you could have done the same with Windows===> find > > then sorting and deleting as needed > > a five minute job at most Yes, but you'd have to get your fingers filthy using the Windows "tools"! Also, if you are not familiar with the TC (et al) "search" options, you really should have a look. It's like comparing a machine gun to a slingshot. And five minutes is debatable depending on how many files would be involved. -- Those who cast the votes decide nothing. Those who count the votes decide everything. - Josef Stalin NB: Not only is my KF over 4 KB and growing, I am also filtering everything from discussions.microsoft and google groups, so no offense if you don't get a reply/comment unless I see you quoted in another post.
Guest thanatoid Posted October 26, 2008 Posted October 26, 2008 Re: Is there a hard drive file organizer that will ... 98 Guy <98@Guy.com> wrote in news:49036945.F214A003@Guy.com: > thanatoid wrote: > >> A decent file manager (Total Commander or possibly even >> the slightly stripped-down Free Commander) will do most of >> these functions. (TC works forever in demo mode.) > > I want to distill, aggregate or combine the contents of > many hard drives, with many different types of files > located in directories with names that don't necessarily > match other drives. TC (or something equally good and fast at search and compare, but you'd have to look for a DOS program I'm afraid) is ideal for that. > Because I can't connect more than 1 or 2 extra drives to a > system at the same time, the first step is to perform such > an aggregation on each drive individually. Your logic is impeccable. > Then copy the contents of each drive onto a single massive > drive (each into it's its > own directory probably) and then > aggregate or combine all their contents such that there are > no duplicate files. Yup. > Files with the same name that are not identical (say, .PST > files) would either be auto-renamed (so they can co-exist > in the same destination directory) or maybe I can set a > rule so that, say, if the files are word .DOC files, then > keep only the file with the most recent creation or > modification date. Perhaps I wanted all .DOC files to be > aggregated into a specific destination directory, and all > .XLS into another. All that can be set to your preferences in TC. > Entire directories that have identical contents can be > easily identified and the duplicates removed, leaving only > 1 copy (ie MS office clipart directories). Yup. > Files that are identical (exact binary match) but with > different names would be deleted (perhaps with manual > guidance). Unique Filer will auto-delete (or /almost/ auto, confirmation req'd or something) //identical// files. 
I use it for images mainly, but the few times I worked with non-binary files it did a very good job. Of course, to be ABSOLUTELY sure you have to do a byte by byte check - which you can do in TC. Unfortunately, manual intervention/supervision is unavoidable - the exact same file, if just opened and looked at, may sometimes save with a difference of 1 byte depending on whether there is a space or a carriage return at the end, or depending on how Windows feels like that particular day. > To go one step further - it would be great to identify > file-sets that have been unpacked from larger archive files > and allow me to decide if I want to keep the file-sets (and > delete the archive file) or vice-versa. Yup. > Perhaps this task is not so useful for someone who only has > a single computer in his/per possession, but for a SOHO > situation where you have perhaps 5 to 10 years worth of > computer use by an office with 2, 5, 10 or 20 people, you > tend to build up a collection of hard drives that one day > you want to organize and retrieve the contents of to make > them available to others, and to wipe the original drives > before donating or discarding. Ever heard of a network? Even I managed to put one together for about a dozen computers, with a server with daily DAT backup. If I can do it, anyone can. (OK, someone else did the wiring ;-) > As well, my own working drives are littered with driver, > application, utility and system files (and archive files of > each type) that I've downloaded from the net over the years > that I know I have multiple copies of (and multiple > versions) and sorting manually through that mess would be > time consuming. Well, perhaps you should have devised a well-organized storage/duplicate-avoidance plan first then? I have a separate partition for DL'd programs and with one key hit TC shows me if there are any duplicates after some time has gone by and I can't remember what's on there anymore. 
(Needless to say, it's all properly categorized etc.) So would Unique Filer (which, BTW, I have mentioned because I have tried about 5 dupe finders - including very recent ones - and UF is by far the best). Why is this thread dragging on for so long when it was all clear from your first post? Are you looking for someone to do the job for you? -- Those who cast the votes decide nothing. Those who count the votes decide everything. - Josef Stalin NB: Not only is my KF over 4 KB and growing, I am also filtering everything from discussions.microsoft and google groups, so no offense if you don't get a reply/comment unless I see you quoted in another post.
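The byte-by-byte check mentioned above, and the "differs only by a trailing space or carriage return" case, can both be illustrated with the Python standard library. This is a sketch, not a description of how Total Commander or Unique Filer work internally; the helper names are hypothetical, and the second helper reads whole files into memory, so it is only suitable for small text files.

```python
import filecmp


def same_bytes(path_a, path_b):
    """True only if the two files have identical content, byte for byte."""
    # shallow=False forces a content comparison instead of trusting size/mtime.
    return filecmp.cmp(path_a, path_b, shallow=False)


def same_apart_from_trailing_whitespace(path_a, path_b):
    """True if the files match once trailing spaces/CR/LF are ignored."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        return fa.read().rstrip() == fb.read().rstrip()
```

A pair that fails the first test but passes the second is a "near duplicate" that still needs a human decision.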
Guest Franc Zabkar Posted October 26, 2008 Posted October 26, 2008 Re: Is there a hard drive file organizer that will ... On Sat, 25 Oct 2008 12:43:40 +0100, "J. P. Gilliver (John)" <G6JPG@soft255.demon.co.uk> put finger to keyboard and composed: >In message <49029404.B4BBB9F5@Guy.com>, 98 Guy <98@Guy.com> writes >>Is there a hard drive file organizer, optimizer, "cleaner", that will >> >>a) identify multiple copies of the same file based on >> - same file name (but perhaps binary identical, perhaps not) >> - different name, but binary identical >[] >Not sure if it will do anything like all you want, but I find David >Taylor's FindDup - >http://www.david-taylor.pwp.blueyonder.co.uk/software/disk.html#FindDuplicates >- reasonably useful. It certainly lists (see the screenshot at >http://www.david-taylor.pwp.blueyonder.co.uk/images/FindDupl.gif >, which also shows that it can look over several drives) size, name, >version number, and so on of the files it has found. I'm not sure it >will find same name but not identical, but as another has said, the >ordinary Windows find function will do that, if you tell it to sort by >name (click on the Name column header) when it's finished searching. That looks like a nice program but I've been running it all day and it's still only a fraction of the way through the comparisons. However that is probably a reflection on my poor housekeeping. In any case it seems to me that the author would benefit greatly by using 98Guy's approach, ie calculating and comparing MD5 checksums. IIRC, FastSum took less than 30 minutes on my 450MHz box. - Franc Zabkar -- Please remove one 'i' from my address when replying by email.
Guest philo Posted October 26, 2008 Posted October 26, 2008 Re: Is there a hard drive file organizer that will ... "thanatoid" <waiting@the.exit.invalid> wrote in message news:Xns9B43146A65C9Athanexit@209.197.15.184... > "philo" <philo@privacy.net> wrote in > news:#BuFxjsNJHA.4428@TK2MSFTNGP04.phx.gbl: > > > > It would be easy enough to look for such utilities on > > Google > > I thought I'd spare 98 Guy that statement ;-) > > > but by the time you'd find it > > install it > > and run it... > > you could have done the same with Windows===> find > > > > then sorting and deleting as needed > > > > a five minute job at most > > Yes, but you'd have to get your fingers filthy using the Windows > "tools"! > > Also, if you are not familiar with the TC (et al) "search" > options, you really should have a look. It's like comparing a > machine gun to a slingshot. > > And five minutes is debatable depending on how many files would > be involved. > > > Yes, I suppose if one wanted to clean up all the small files that I never bother with. I only care about the huge files that I've forgotten about such as old service packs or entire Linux distros that I've downloaded years ago. It only takes a few minutes to clean up all the big stuff
Guest 98 Guy Posted October 26, 2008 Posted October 26, 2008 Re: Is there a hard drive file organizer that will ... thanatoid wrote: > > It would be easy enough to look for such utilities on > > Google > > I thought I'd spare 98 Guy that statement ;-) What you get are tons and tons of programs that will "catalog" your files (especially multi-media files - music, movies, etc). > > you could have done the same with Windows===> find > > > > then sorting and deleting as needed Given perhaps several hundred thousand files on any given drive, multiply that by 1 to 2 dozen drives, and you're going to spend hours rounding up, sorting, comparing, and aggregating all the files you want from all of them. Perhaps you still don't understand what I'm trying to do.
Guest 98 Guy Posted October 26, 2008 Posted October 26, 2008 Re: Is there a hard drive file organizer that will ...

thanatoid wrote:
> > but for a SOHO situation where you have perhaps 5 to 10
> > years worth of computer use by an office with 2, 5, 10
> > or 20 people, you tend to build up a collection of hard
> > drives that one day you want to organize and retrieve
> > the contents of to make them available to others, and
> > to wipe the original drives before donating or discarding.
>
> Ever heard of a network?

What's that got to do with the paragraph above?

If I have 10 copies of the same .xls or .doc file spread across 10 hard drives, putting all those drives on a network isn't going to change the fact that there are 10 copies of the same file accessible to everyone on the network, instead of just one copy.
Guest J. P. Gilliver (John) Posted October 26, 2008 Posted October 26, 2008 Re: Is there a hard drive file organizer that will ... In message <sa78g4hh81gdhr2k51vndb81dm429id32l@4ax.com>, Franc Zabkar <fzabkar@iinternode.on.net> writes [] >>http://www.david-taylor.pwp.blueyonder.co.uk/software/disk.html#FindDuplicates [] >That looks like a nice program but I've been running it all day and >it's still only a fraction of the way through the comparisons. However That is the problem I've found with it; it also slows down the PC a bit when it's been running a while. The solution is just to hit the stop button; it checks files in descending order of size, so by the time it has slowed to a crawl, it will have compared the large files. (I've seen it spend ages comparing a 44 byte file!) When you hit the stop button, you _don't_ lose what it has found so far. Once you've dealt with those, you can set it going again, and (assuming you've not _left_ big duplicates in place), it will start with the remaining duplicates, back at its higher starting speed. EasyCleaner, from http://personal.inet.fi/business/toniarts/ecleane.htm (which is a free set of utilities I think anyone should have anyway) includes a duplicate finder which I think uses the same engine as FindDup, but starts with the littlest files. (I have a feeling it might not have the slowdown, either.) >that is probably a reflection on my poor housekeeping. In any case it >seems to me that the author would benefit greatly by using 98Guy's >approach, ie calculating and comparing MD5 checksums. IIRC, FastSum >took less than 30 minutes on my 450MHz box. [] For finding "what's eating my disc", I haven't come across anything to beat Steffen Gerlach's Scanner, from http://www.steffengerlach.de/freeware/ ; this is what I can only describe as a hierarchical piecharter, and you should try it. Of course, it must be rubbish, as it's only a 164K download ... 
There's also a piecharter in David Taylor's area (same page as FindDup IIRR), and as part of EasyCleaner (again, I think uses David Taylor's code), and you can go up and down the levels in those, but I'm not aware of anything that has a hierarchical display like Scanner. -- J. P. Gilliver. UMRA: 1960/<1985 MB++G.5AL(+++)IS-P--Ch+(p)Ar+T[?]H+Sh0!:`)DNAf Lada for sale - see http://www.autotrader.co.uk This trip should be called "Driving Miss Crazy" - Emma Wilson, on crossing the southern United States with her mother, Ann Robinson, 2003 or 2004
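The "largest files first" behaviour described above combines naturally with the checksum approach: only files that share a size can be identical, so only those need hashing. The sketch below is an assumption about one way to do this, not a description of FindDup's or EasyCleaner's internals; the function name is hypothetical.

```python
import hashlib
import os
from collections import defaultdict


def duplicate_groups_largest_first(root):
    """Yield (size, paths) for each group of identical files, biggest first."""
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                by_size[os.path.getsize(path)].append(path)
            except OSError:
                continue  # unreadable file: ignore it
    for size in sorted(by_size, reverse=True):
        paths = by_size[size]
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        by_digest = defaultdict(list)
        for path in paths:
            digest = hashlib.md5()
            with open(path, "rb") as fh:
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    digest.update(chunk)
            by_digest[digest.hexdigest()].append(path)
        for group in by_digest.values():
            if len(group) > 1:
                yield size, sorted(group)
```

Because it is a generator, the run can be stopped once the big space-wasters have been reported, much like hitting the stop button described above.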
Guest J. P. Gilliver (John) Posted October 26, 2008 Posted October 26, 2008 Re: Is there a hard drive file organizer that will ... In message <Xns9B4316F259D3Dthanexit@209.197.15.184>, thanatoid <waiting@the.exit.invalid> writes [] >> Then copy the contents of each drive onto a single massive >> drive (each into it's > >its [] See you in the APIHNA newsgroup ... (-: -- J. P. Gilliver. UMRA: 1960/<1985 MB++G.5AL(+++)IS-P--Ch+(p)Ar+T[?]H+Sh0!:`)DNAf Lada for sale - see http://www.autotrader.co.uk This trip should be called "Driving Miss Crazy" - Emma Wilson, on crossing the southern United States with her mother, Ann Robinson, 2003 or 2004
Guest MEB Posted October 26, 2008 Posted October 26, 2008 Re: Is there a hard drive file organizer that will ... "98 Guy" <98@Guy.com> wrote in message news:49047427.19372C62@Guy.com... | thanatoid wrote: | | > > It would be easy enough to look for such utilities on | > > Google | > | > I thought I'd spare 98 Guy that statement ;-) | | What you get are tons and tons of programs that will "catalog" your | files (especially multi-media files - music, movies, etc). | | > > you could have done the same with Windows===> find | > > | > > then sorting and deleting as needed | | Given perhaps several hundred thousand files on any given drive, | multiply that by 1 to 2 dozen drives, and you're going to spend hours | rounding up, sorting, comparing, and aggregating all the files you want | from all of them. | | Perhaps you still don't understand what I'm trying to do. I think I get what you're thinking about, but when you bring in corporate structure and potential issues {as you did previously} then that should be handled by the server/network setup, group policies, synchronization aspects, and other network possibilities.. Your question IS viable for a user who has failed to apply sensible usage or even a small network without a centralized server or more, but fails to address and understand how large networks ensure these things do NOT occur. If your company or another has these types of difficulties then you/they need to re-think your/their networking setup PARTICULARLY when using Microsoft servers and OSs. ANY user who fails to emplace some form of directory/file type/specific activity/temp folders-master files/synchronization policies/etc. system will ALWAYS end of with tons of JUNK. So if you actually think about it, what you're asking is for ANOTHER third party program to use to correct your own failure to properly setup your own usage. 
IF you're referring to yourself and your issues with dozens of drives, then I would question why you haven't been applying your own method of removing old files and multiple duplicates on a regular basis. Even 98 had synchronization abilities.... moreover, I would question why you don't use CD-ROMs and/or DVDs and multiple burner drives rather than HDs; it seems like a tremendous waste of money and a poorly thought out usage of those resources. If you're swapping Hard Drives, then why haven't you labeled them for SPECIFIC usage? -- MEB http://peoplescounsel.org a Peoples' counsel _ _ ~~
Guest thanatoid Posted October 26, 2008 Posted October 26, 2008 Re: Is there a hard drive file organizer that will ... 98 Guy <98@Guy.com> wrote in news:4904757E.5C19DCA6@Guy.com: > thanatoid wrote: > >> > but for a SOHO situation where you have perhaps 5 to 10 >> > years worth of computer use by an office with 2, 5, 10 >> > or 20 people, you tend to build up a collection of hard >> > drives that one day you want to organize and retrieve >> > the contents of to make them available to others, and >> > to wipe the original drives before donating or >> > discarding. >> >> Ever heard of a network? > > What's that got to do with the paragraph above? > > If I have 10 copies of the same .xls or .doc file spread > across 10 hard drives, putting all those drives on a > network isn't going to change the fact that there are 10 > copies of the same file accessible to everyone on the > network, instead of just one copy. Your understanding of networks is not nearly as impeccable as your logic of finding duplicates on one drive on which I commented in my previous reply. In any case, discussing the problem endlessly is not going to make it go away, so either get to work, or quit your job. You have been given all the info short of the phone number of someone who will do it for you for free. It's a nasty job, I admit. I suggest better planning next time (it may not have been YOU that set things up this way, but it appears to be in your lap now) and there is obviously little control over what individuals do in that place. There are various ways of dealing with such problems, and a network server (which, BTW would contain ONE copy of each file which people can access and work on as they need to) is one of the starting points. Careful supervision of what people actually DO on their individual workstations and their qualifications is the second. 
If you want this to actually be a success you may have to tell everybody to go on vacation for a while so they don't instantly mess up every little bit you've managed to do. Reminds me of when I worked for a lunatic who wouldn't ever use a pen, he ONLY used pencils - for when he made his endless mistakes/corrections. -- Those who cast the votes decide nothing. Those who count the votes decide everything. - Josef Stalin NB: Not only is my KF over 4 KB and growing, I am also filtering everything from discussions.microsoft and google groups, so no offense if you don't get a reply/comment unless I see you quoted in another post.
Guest Franc Zabkar Posted October 26, 2008 Posted October 26, 2008 Re: Is there a hard drive file organizer that will ... On Sun, 26 Oct 2008 13:56:34 +0000, "J. P. Gilliver (John)" <G6JPG@soft255.demon.co.uk> put finger to keyboard and composed: >In message <sa78g4hh81gdhr2k51vndb81dm429id32l@4ax.com>, Franc Zabkar ><fzabkar@iinternode.on.net> writes >[] >>>http://www.david-taylor.pwp.blueyonder.co.uk/software/disk.html#FindDuplicates >[] >>That looks like a nice program but I've been running it all day and >>it's still only a fraction of the way through the comparisons. However > >That is the problem I've found with it; it also slows down the PC a bit >when it's been running a while. The solution is just to hit the stop >button; it checks files in descending order of size, so by the time it >has slowed to a crawl, it will have compared the large files. (I've seen >it spend ages comparing a 44 byte file!) When you hit the stop button, >you _don't_ lose what it has found so far. Once you've dealt with those, >you can set it going again, and (assuming you've not _left_ big >duplicates in place), it will start with the remaining duplicates, back >at its higher starting speed. > >EasyCleaner, from http://personal.inet.fi/business/toniarts/ecleane.htm >(which is a free set of utilities I think anyone should have anyway) >includes a duplicate finder which I think uses the same engine as >FindDup, but starts with the littlest files. (I have a feeling it might >not have the slowdown, either.) AFAICS, a fundamental flaw in duplicate finder software is that it relies on direct binary comparisons. With programs like FindDup, if we have 3 files of equal size, then we would need to compare file1 with file2, file1 with file3, and file2 with file3. This requires 6 reads. For n equally sized files, the number of reads is n(n-1). Alternatively, if we relied on MD5 checksums, then each file would only need to be read once. >>that is probably a reflection on my poor housekeeping. 
In any case it >>seems to me that the author would benefit greatly by using 98Guy's >>approach, ie calculating and comparing MD5 checksums. IIRC, FastSum >>took less than 30 minutes on my 450MHz box. >[] >For finding "what's eating my disc", I haven't come across anything to >beat Steffen Gerlach's Scanner, from >http://www.steffengerlach.de/freeware/ >; this is what I can only describe as a hierarchical piecharter, and you >should try it. Of course, it must be rubbish, as it's only a 164K >download ... There's also a piecharter in David Taylor's area (same page >as FindDup IIRR), and as part of EasyCleaner (again, I think uses David >Taylor's code), and you can go up and down the levels in those, but I'm >not aware of anything that has a hierarchical display like Scanner. I *love* small utility software. At the moment I'm playing with Windows CE in a small GPS device. It reminds me what can be done with a small amount of resources, eg a 16KB calculator, 23KB task manager, 6.5KB screen capture utility. - Franc Zabkar -- Please remove one 'i' from my address when replying by email.
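The read-count argument in the post above can be sketched in a few lines of Python. This is only an illustration of the checksum approach being discussed, not how FindDup, EasyCleaner or FastSum actually work; the function names (`md5_of_file`, `candidate_duplicates`, `pairwise_reads`) are made up for the example. Files are first grouped by size, and only files that share a size are hashed, so each such file is read once.

```python
import hashlib
import os
from collections import defaultdict


def md5_of_file(path, chunk_size=1 << 20):
    """Read the file once, in chunks, and return its MD5 hex digest."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def candidate_duplicates(root):
    """Return lists of paths under root that share both size and MD5."""
    by_size = defaultdict(list)
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        by_digest = defaultdict(list)
        for path in paths:
            by_digest[md5_of_file(path)].append(path)
        groups.extend(sorted(g) for g in by_digest.values() if len(g) > 1)
    return groups


def pairwise_reads(n):
    """File reads needed to compare n equal-sized files pair by pair."""
    # n(n-1)/2 pairs, and each comparison reads two files
    return n * (n - 1)
```

For three equal-sized files `pairwise_reads(3)` gives the 6 reads mentioned above, against 3 reads when each file is hashed once. The groups returned are candidates only, since equal checksums do not strictly prove equal contents.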
Guest FromTheRafters Posted October 26, 2008 Posted October 26, 2008 Re: Is there a hard drive file organizer that will ... > AFAICS, a fundamental flaw in duplicate finder software is that it > relies on direct binary comparisons. With programs like FindDup, if we > have 3 files of equal size, then we would need to compare file1 with > file2, file1 with file3, and file2 with file3. This requires 6 reads. > For n equally sized files, the number of reads is n(n-1). > > Alternatively, if we relied on MD5 checksums, then each file would > only need to be read once. So...once it is found to be the same checksum, what should the program do next? How important are these files? A fundamental flaw would be to trust MD5 checksums as an indication that the files are indeed duplicates. You can mostly trust MD5 checksums to indicate two files are different, but the other way around?
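One way to answer "what should the program do next?" is to treat a checksum match as a shortlist and confirm it with a direct comparison before anything is deleted. The sketch below assumes a list of paths that already share a checksum; `confirmed_sets` is a hypothetical helper name, and `filecmp.cmp` is the standard-library comparison function.

```python
import filecmp


def confirmed_sets(candidates):
    """Split paths that share a checksum into sets that are byte-for-byte equal."""
    sets = []
    for path in candidates:
        for group in sets:
            # shallow=False forces a content comparison, not just os.stat data
            if filecmp.cmp(group[0], path, shallow=False):
                group.append(path)
                break
        else:
            sets.append([path])
    # only sets with two or more members are real duplicates
    return [group for group in sets if len(group) > 1]
```

Each candidate is compared against one representative per set, so the expensive binary comparison is only spent on files that the checksum has already flagged, rather than on every pair of equal-sized files.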
Guest Bill in Co. Posted October 26, 2008 Posted October 26, 2008 Re: Is there a hard drive file organizer that will ... FromTheRafters wrote: >> AFAICS, a fundamental flaw in duplicate finder software is that it >> relies on direct binary comparisons. With programs like FindDup, if we >> have 3 files of equal size, then we would need to compare file1 with >> file2, file1 with file3, and file2 with file3. This requires 6 reads. >> For n equally sized files, the number of reads is n(n-1). >> >> Alternatively, if we relied on MD5 checksums, then each file would >> only need to be read once. > > So...once it is found to be the same checksum, what should the > program do next? How important are these files? > A fundamental > flaw would be to trust MD5 checksums as an indication that the > files are indeed duplicates. Since when? What is the statistical likelihood of that being true? > You can mostly trust MD5 checksums > to indicate two files are different, but the other way around?
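A rough figure for that likelihood can be had from the usual birthday bound. The estimate below assumes MD5 behaves like a random 128-bit function on ordinary files, and takes N = 10^7 distinct files as a round number for "several hundred thousand files" on each of one to two dozen drives; both are assumptions made for illustration.

```latex
% Probability that at least two of N distinct files share a b-bit checksum
% (union bound over all pairs; b = 128 for MD5)
P(\text{collision}) \;\le\; \binom{N}{2}\, 2^{-b} \;=\; \frac{N(N-1)}{2 \cdot 2^{b}}

% Worked figure for N = 10^{7}, b = 128:
P \;\lesssim\; \frac{(10^{7})^{2}}{2 \cdot 2^{128}}
  \;\approx\; \frac{5 \times 10^{13}}{3.4 \times 10^{38}}
  \;\approx\; 1.5 \times 10^{-25}
```

That covers accidental matches only. MD5 is known to allow deliberately constructed collisions, so the estimate says nothing about files that someone has crafted to collide.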