
Interest in dupmerge?



nobody0
28-11-2004, 23:14
I have reworked and extended the program dupmerge ( http://freshmeat.net/projects/dupmerge/?branch_id=2096&release_id=6532 ) so that it not only hard-links identical files (saving one file's worth of disk space for each duplicate) but can also expand hard links again (see below).
I e-mailed the author but got no reply, so I am thinking about putting dupmerge on SourceForge so that it can be developed further there; the todo list is still long.

Is there any interest in this?

With the program I saved a good 10 % of disk space in the directories of the FTP server and in the archive with assorted downloads and CDs/DVDs copied to disk.
That is not much, but it still means on average a good 10 % more usable disk space from a free program.
I also use dupmerge to shrink and expand backups of home directories; this makes the backup a good 20 % smaller, in addition to the compression with zip/bzip.
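
Where the saving comes from: a hard link is only an additional directory entry for the same inode, so the data blocks exist once no matter how many names point to them. A minimal sketch, assuming a POSIX system; the helper names same_inode, link_count and add_hard_link are invented for this illustration and are not part of dupmerge.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: true if both paths exist and name the same inode
   on the same device, i.e. they are hard links to one file. */
bool
same_inode (const char *path_a, const char *path_b)
{
  struct stat sa, sb;

  assert (path_a != NULL && path_b != NULL);
  if (lstat (path_a, &sa) != 0 || lstat (path_b, &sb) != 0)
    return false;
  return (sa.st_dev == sb.st_dev) && (sa.st_ino == sb.st_ino);
}

/* Hypothetical helper: number of names (hard links) the file has,
   or -1 if it cannot be examined. */
long
link_count (const char *path)
{
  struct stat st;

  assert (path != NULL);
  if (lstat (path, &st) != 0)
    return -1;
  return (long) st.st_nlink;
}

/* Hypothetical helper: give the file `original` the additional name
   `alias`. No data is copied; only a directory entry is added.
   Returns 0 on success and -1 on failure. */
int
add_hard_link (const char *original, const char *alias)
{
  assert (original != NULL && alias != NULL);
  if (link (original, alias) != 0)
    {
      perror ("link");
      return -1;
    }
  return 0;
}
```

After a successful add_hard_link ("a", "b"), same_inode ("a", "b") is true and link_count ("a") is one higher than before; removing either name leaves the data reachable through the other.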



/* Dupmerge - Reclaim disk space by linking identical files together

This is a utility that scans a UNIX directory tree looking for pairs of
distinct files with identical content. When it finds such files, it
deletes one file to reclaim its disk space and then recreates its path
name as a link to the other copy.
My first version of this program circa 1993 worked by computing MD5
hashes of every file, sorting the hashes and then looking for duplicates.
This worked, but it was unnecessarily slow. The comparison function I use
now stops comparing two files as soon as it determines their lengths are
different, which is a win when you have many large files with unique lengths.

* This program reads from standard input a list of files (such
* as that generated by "find . -print") and discovers which files are
* identical. Dupmerge unlinks one file of each identical pair and
* recreates its path name as a link to the other.
*
* Non-plain files in the input (directories, pipes, devices, etc)
* are ignored. Identical files must be on the same file system to be linked.
*
* Dupmerge prefers to keep the older of two identical files, as the older
* timestamp is more likely to be the correct one given that many
* copy utilities (e.g., 'cp') do not by default preserve modification
* times.
*
* Dupmerge works by quicksorting a list of path names, with the
* actual unlinking and relinking steps performed as side effects of
* the comparison function. The results of the sort are discarded.
*
* Command line arguments:
* -n Suppress the actual unlinking and relinking
* -q Operate in quiet mode (otherwise, relinks are displayed on stdout)
*
* 12 February 1998 Phil Karn, karn@ka9q.ampr.org
* Copyright Phil Karn. May be used under the terms of the GNU Public License.


--------------------------------------------------------------------------------


For files bigger than 2 GiB, > 2^31 Byte, you should use the compiler option
"-D_FILE_OFFSET_BITS=64".

Example for compilation (and stripping + prelinking) on i686 ("PC" with Pentium III or higher or Athlon)
with shell function c (in ~/.bashrc):
function c {
gcc -Wall -I. -O3 -D_GNU_SOURCE -D__SMP__ -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_REENTRANT \
-pthread -mcpu=i686 -fexpensive-optimizations -DCPU=686 -ffast-math -m486 -lm -lz -o $1 $1.c && strip $1 \
&& prelink -fmRv ./$1
}

Example usage of this shell function (in ~/.bashrc): c dupmerge


Added swap macro, inverse mode, no replacing of zero-length files, void casts for
semantic checkers, deleted unnecessary second run after qsort, ...
Switched to C99.
Tested with SuSE 9.2 and Debian (both Kernel 2.6), tested with 4 and 7 Gigabyte files,
Rolf Freitag, 2004.


The inverse mode expands the hard links to simple files if used without the option -n.
With option -n hard links are only reported.
The inverse mode can be used e.g. for expanding backups which have been shrunk with
the default mode.

Caution: If there is not enough space for expanding the hard links (in inverse mode) files (links)
will be lost!


example for usage:
find ./ -type f -print | dupmerge 2>&1 | tee /tmp/user0/dupmerge.out


Todo: - better inverse mode which works with whitespace in filenames
- ANSI-C compatible error handling of the comparison functions
- for option n: normal mode: show the number of blocks which can be saved,
inverse mode: show the number of blocks which must be spent; the current values are only a
much too high approximation
- option a, h and -help for advice with example usage and all allowed options
- error message +advice when wrong parameters are used
- option f for using a soft link as second try when a hard link could not be made and the reverse for
inverse mode (expand hard and soft links)
- option s for only soft linking (instead of the hard linking) and the reverse for inverse mode
- option e for deleting (erasing, not linking) all duplicate files
- more user friendly output with e. g. freed block + saved space in kiB,
extra explanation in output when -n is used,
- tests with NTFS partitions for MS Windows users

*/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h> // Necessary for unlink, getopt, link. Rolf Freitag.
#include <iso646.h> // for and, bitand, or, bitor ...
#include <stdbool.h>

// Swap two items a, b of size i_size, use (static) swap variable c for performance.
#define mc_SWAP_ITEMS(a, b, c, i_size) { memcpy (&(c), &(a), (i_size));\
memcpy (&(a), &(b), (i_size));\
memcpy (&(b), &(c), (i_size)); }

// easter egg
#define mc_ADVICE if (argc > 1 && 0 == strncmp (argv[1], "-advice", 10) ) \
{ \
(void)printf ("Don't Panic!\n"); \
exit (42); \
} /* 42: The meaning of life, the universe, and everything. */


// functions for qsort
int fcmp (const void *, const void *);
int fcmp1 (const void *, const void *);


struct stat s_swap; // swap variable
int Nodo = 0;
int Quiet = 0;
_Bool b_inv = false; // inverse mode indicator
int Files_deleted = 0;
int Blocks_reclaimed = 0;
int i_files_expanded = 0;
int i_blocks_declaimed = 0;


int
main (int argc, char *argv[])
{
char **names, buf[BUFSIZ], *cp; // BUFSIZ is the default buffer size from stdio.h, often 8192
int nfiles, i;
FILE *tmp;

mc_ADVICE;
// read parameters for quiet mode and suppressing the actual unlinking and relinking. Rolf Freitag
while ((i = getopt (argc, argv, "inq")) != -1) // getopt returns -1 when all options are parsed
{
switch (i)
{
case 'n':
Nodo = 1;
break;
case 'q':
Quiet = 1;
break;
case 'i':
b_inv = true;
break;
default:
break;
}
}

// Read list of file names into temp file and count
tmp = tmpfile ();
if (NULL == tmp)
{
(void) fprintf (stderr, "could not open temporary file, exiting\n");
exit (-1);
}
nfiles = 0;
while (fgets (buf, sizeof (buf), stdin) != NULL) // also keeps a last line without trailing newline and stops on read errors
{
nfiles++;
if (EOF == fputs (buf, tmp))
{
(void) fprintf (stderr, "could not write to temporary file, exiting\n");
exit (-1);
}
}
// Now that we know how many there are, allocate space and re-read
rewind (tmp);
if ((names = (char **) calloc (nfiles, sizeof (char *))) == NULL) // malloc changed to safer calloc.
{
(void) fprintf (stderr, "%s: Out of memory\n", argv[0]);
exit (1);
}
for (i = 0; i < nfiles; i++)
{
(void) fgets (buf, sizeof (buf), tmp);
if ((cp = strchr (buf, '\n')) != NULL)
*cp = '\0'; // Chomp newlines
if ((names[i] = strndup (buf, BUFSIZ)) == NULL) // replaced strdup by safer strndup
{
(void) fprintf (stderr, "%s: Out of memory\n", argv[0]);
exit (1);
}
}
(void) fclose (tmp);
if (not b_inv)
{
qsort (names, nfiles, sizeof (char *), fcmp);
// if (!Quiet)
// (void) printf ("Scanning for more dups...\n");
// for (i = 0; i < nfiles - 2; i++)
// (void) fcmp (names + i, names + i + 1);
if (!Quiet)
(void) printf ("Files deleted: %d, Disk blocks reclaimed: %d\n", Files_deleted, Blocks_reclaimed);
}
else // inverse mode
{
qsort (names, nfiles, sizeof (char *), fcmp1);
// if (!Quiet)
// (void) printf ("Scanning for more links...\n");
// for (i = 0; i < nfiles - 2; i++)
// (void) fcmp1 (names + i, names + i + 1);
if (!Quiet)
(void) printf ("Links expanded: %d, Disk blocks declaimed: %d\n", i_files_expanded, i_blocks_declaimed);
}
exit (0);
} // main



// This is the comparison function called by qsort, where the real work
// is done as a side effect. Strictly speaking this is not correct C:
// if the same objects are passed more than once to the comparison
// function, the results must be consistent with one another (C99, 7.20.5).
// If errors such as a failed lstat occur, this is not yet fulfilled.
int
fcmp (const void *a, const void *b)
{
struct stat sa, sb;
FILE *fa, *fb;
int c1, c2;
int rval = 0;
const char *filea, *fileb;

// Nonexistent or non-plain files are less than any other file
if (NULL == a)
return -1;
filea = *(const char **) a;
if ((-1 == lstat (filea, &sa)) or ! S_ISREG (sa.st_mode))
{
fprintf (stderr, "lstat(%s) failed\n", filea);
return -1;
}

if (NULL == b)
return 1;
fileb = *(const char **) b;
if ((-1 == lstat (fileb, &sb)) or ! S_ISREG (sb.st_mode))
{
fprintf (stderr, "lstat(%s) failed\n", fileb);
return 1;
}
// Smaller files are "less"
if (sa.st_size < sb.st_size)
return -1;
if (sa.st_size > sb.st_size)
return 1;
if ((sa.st_dev == sb.st_dev) and (sa.st_ino == sb.st_ino))
return 0; // Files are hard linked

// We now know both files exist, are plain files, are the same size,
// and are not already linked, so compare their contents
if (NULL == (fa = fopen (filea, "r")))
return -1; // Unreadable files are "less than"
if (NULL == (fb = fopen (fileb, "r")))
{
fclose (fa);
return 1;
}
while (((c1 = fgetc (fa)) != EOF) and ((c2 = fgetc (fb)) != EOF)) // compare the two files
{
if (c1 < c2)
{
rval = -1;
break;
}
else
{
if (c1 > c2)
{
rval = 1;
break;
}
}
}
(void) fclose (fa);
(void) fclose (fb);

if ((0 == rval) and (sa.st_dev == sb.st_dev) and (sa.st_size > 0))
{
// Files are identical, have size > 0 and are on the same device, so link them.
// We prefer to keep the older copy, or if they're the
// same date, the one with more (hard) links
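if ((sb.st_mtime > sa.st_mtime) or (sb.st_nlink < sa.st_nlink))
{
const char *name_swap = filea; // the names must be swapped together with the stat data
mc_SWAP_ITEMS (sa, sb, s_swap, sizeof (struct stat)); // swap items to keep original sb
filea = fileb;
fileb = name_swap;
}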
if (1 == sa.st_nlink)
{
Files_deleted++;
Blocks_reclaimed += sa.st_blocks;
}
if (!Nodo and (unlink (filea)))
{
(void) fprintf (stderr, "unlink(%s) failed\n", filea);
perror ("unlink");
exit (1);
}
if (!Quiet)
(void) printf ("ln %s %s: %ld->%ld, %ld->%ld\n", fileb, filea, (long) sb.st_nlink, (long) sb.st_nlink + 1, (long) sa.st_nlink, (long) sa.st_nlink - 1); // nlink_t is not int on every platform
if (!Nodo and (-1 == link (fileb, filea)))
{
(void) fprintf (stderr, "link(%s,%s) failed\n", fileb, filea);
perror ("link");
exit (1);
}
}
return rval;
} // fcmp



// This is the second comparison function, used for inverse mode.
// If both files have the same inode on the same device:
// delete the younger one and copy the older one to the place of the
// deleted one, under the name of the deleted one.
int
fcmp1 (const void *a, const void *b)
{
struct stat sa, sb;
// FILE *fa, *fb;
// int c1, c2;
int rval = 0;
const char *filea, *fileb;
char cline[0xfff] = { '\0' }; // command line

// Nonexistent or non-plain files are less than any other file
if (NULL == a)
return -1;
filea = *(const char **) a;
if ((-1 == lstat (filea, &sa)) or ! S_ISREG (sa.st_mode))
{
fprintf (stderr, "lstat(%s) failed\n", filea);
return -1;
}
if (NULL == b)
return 1;
fileb = *(const char **) b;
if ((-1 == lstat (fileb, &sb)) or ! S_ISREG (sb.st_mode))
{
fprintf (stderr, "lstat(%s) failed\n", fileb);
return 1;
}
// Smaller files are "less"
if (sa.st_size < sb.st_size)
return -1;
if (sa.st_size > sb.st_size)
return 1;
if ((sa.st_dev == sb.st_dev) and (sa.st_ino == sb.st_ino))
{ // Files are hard linked
// We prefer to keep the older copy
if (sb.st_mtime > sa.st_mtime) // swap items to keep original sb
mc_SWAP_ITEMS (sa, sb, s_swap, sizeof (struct stat));
i_files_expanded++;
i_blocks_declaimed += sa.st_blocks;
if (!Nodo) // expand the hard link
{
if (remove (filea)) // remove failed
{
(void) fprintf (stderr, "remove(%s) failed\n", filea);
perror ("remove");
exit (1);
}
snprintf (cline, sizeof (cline) - 1, "cp -a %s %s", fileb, filea);
if (system (cline))
{
(void) fprintf (stderr, "expanding the hard link(%s) failed; lost %s\n", filea, filea);
perror ("expand");
exit (1);
}
if (!Quiet)
(void) printf ("expand %s %s: %ld->%ld\n", fileb, filea, (long) sb.st_nlink, (long) sb.st_nlink - 1);
}
}
return rval;
} // fcmp1
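
A note on the unlink-then-link sequence in fcmp: between the two calls the path name does not exist, and if link fails the duplicate is gone. One possible alternative, shown here only as a sketch under the assumption of a POSIX file system, creates the new link under a temporary name and moves it over the duplicate with rename, which replaces the target atomically. The function name relink_atomically and the ".dupmerge-tmp" suffix are hypothetical and not part of the program above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper, not part of dupmerge: make `duplicate` a hard link
   to `keep` without a moment in which the name `duplicate` does not exist.
   Returns 0 on success and -1 on failure (duplicate is then unchanged). */
int
relink_atomically (const char *keep, const char *duplicate)
{
  struct stat sk, sd;
  char tmp_name[4096];
  int len, rc;

  assert (keep != NULL && duplicate != NULL);
  if (lstat (keep, &sk) != 0 || lstat (duplicate, &sd) != 0)
    return -1;
  if (sk.st_dev == sd.st_dev && sk.st_ino == sd.st_ino)
    return 0;                   /* already the same file, nothing to do */

  /* Temporary name next to the duplicate (assumed to be unused). */
  len = snprintf (tmp_name, sizeof tmp_name, "%s.dupmerge-tmp", duplicate);
  if (len < 0 || (size_t) len >= sizeof tmp_name)
    return -1;                  /* resulting path would be too long */

  if (link (keep, tmp_name) != 0)
    return -1;                  /* e.g. other file system or name in use */

  rc = rename (tmp_name, duplicate);    /* replaces duplicate atomically */
  if (rc != 0)
    (void) unlink (tmp_name);   /* clean up; duplicate is still the old file */
  return (rc == 0) ? 0 : -1;
}
```

The content comparison still has to happen before such a call; the sketch only covers the relinking step.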

nobody0
29-11-2004, 10:52
Here is the second version, which has no known bugs and also has no problems with spaces in file names:



/* Dupmerge - Reclaim disk space by linking identical files together

This is a utility that scans a UNIX directory tree looking for pairs of
distinct files with identical content. When it finds such files, it
deletes one file to reclaim its disk space and then recreates its path
name as a link to the other copy.
My first version of this program circa 1993 worked by computing MD5
hashes of every file, sorting the hashes and then looking for duplicates.
This worked, but it was unnecessarily slow. The comparison function I use
now stops comparing two files as soon as it determines their lengths are
different, which is a win when you have many large files with unique lengths.

* This program reads from standard input a list of files (such
* as that generated by "find . -print") and discovers which files are
* identical. Dupmerge unlinks one file of each identical pair and
* recreates its path name as a link to the other.
*
* Non-plain files in the input (directories, pipes, devices, etc)
* are ignored. Identical files must be on the same file system to be linked.
*
* Dupmerge prefers to keep the older of two identical files, as the older
* timestamp is more likely to be the correct one given that many
* copy utilities (e.g., 'cp') do not by default preserve modification
* times.
*
* Dupmerge works by quicksorting a list of path names, with the
* actual unlinking and relinking steps performed as side effects of
* the comparison function. The results of the sort are discarded.
*
* Command line arguments:
* -n Suppress the actual unlinking and relinking
* -q Operate in quiet mode (otherwise, relinks are displayed on stdout)
*
* 12 February 1998 Phil Karn, karn@ka9q.ampr.org
* Copyright Phil Karn. May be used under the terms of the GNU Public License.


--------------------------------------------------------------------------------


For files bigger than 2 GiB, > 2^31 Byte, you should use the compiler option
"-D_FILE_OFFSET_BITS=64".

Example for compilation (and stripping + prelinking) on i686 ("PC" with Pentium III or higher or Athlon)
with shell function c (in ~/.bashrc):
function c {
gcc -Wall -I. -O3 -D_GNU_SOURCE -D__SMP__ -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_REENTRANT \
-pthread -mcpu=i686 -fexpensive-optimizations -DCPU=686 -ffast-math -m486 -lm -lz -o $1 $1.c && strip $1 \
&& prelink -fmRv ./$1
}

Example usage of this shell function (in ~/.bashrc): c dupmerge


Added swap macro, inverse mode, no replacing of zero-length files, void casts for
semantic checkers, deleted unnecessary second run after qsort, ...
Switched to C99.
Tested with SuSE 9.2 and Debian (both Kernel 2.6), tested with 4 and 7 Gigabyte files,
Rolf Freitag, 2004.


The inverse mode expands the hard links to simple files if used without the option -n.
With option -n hard links are only reported.
The inverse mode can be used e.g. for expanding backups which have been shrunk with
the default mode.

Caution: If there is not enough space for expanding the hard links (in inverse mode) files (links)
will be lost!


example for usage:
find ./ -type f -print | dupmerge 2>&1 | tee /tmp/user0/dupmerge.out


Todo: - inverse mode: expand with the same file attributes (chmod)
- ANSI-C compatible error handling of the comparison functions
- for option n: normal mode: show the number of blocks which can be saved,
inverse mode: show the number of blocks which must be spent; the current values are only
preview values that are too high
- option a, h and -help for advice with example usage and all allowed options
- error message +advice when wrong parameters are used
- option f for using a soft link as second try when a hard link could not be made and the reverse for
inverse mode (expand hard and soft links)
- option s for only soft linking (instead of the hard linking) and the reverse for inverse mode
- option e for deleting (erasing, not linking) all duplicate files
- more user friendly output with e. g. freed block + saved space in kiB,
extra explanation in output when -n is used,
- tests with NTFS partitions for MS Windows users

*/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h> // Necessary for unlink, getopt, link. Rolf Freitag.
#include <iso646.h> // for and, bitand, or, bitor ...
#include <stdbool.h>

// Swap two items a, b of size i_size, use (static) swap variable c for performance.
#define mc_SWAP_ITEMS(a, b, c, i_size) { memcpy (&(c), &(a), (i_size));\
memcpy (&(a), &(b), (i_size));\
memcpy (&(b), &(c), (i_size)); }

// easter egg
#define mc_ADVICE if (argc > 1 && 0 == strncmp (argv[1], "-advice", 10) ) \
{ \
(void)printf ("Don't Panic!\n"); \
exit (42); \
} /* 42: The meaning of life, the universe, and everything. */


// functions for qsort
int fcmp (const void *, const void *);
int fcmp1 (const void *, const void *);


struct stat s_swap; // swap variable
int Nodo = 0;
int Quiet = 0;
_Bool b_inv = false; // inverse mode indicator
int Files_deleted = 0;
int Blocks_reclaimed = 0;
int i_files_expanded = 0;
int i_blocks_declaimed = 0;


int
main (int argc, char *argv[])
{
char **names, buf[BUFSIZ], *cp; // BUFSIZ is the default buffer size from stdio.h, often 8192
int nfiles, i;
FILE *tmp;

mc_ADVICE;
// read parameters for quiet mode and suppressing the actual unlinking and relinking. Rolf Freitag
while ((i = getopt (argc, argv, "inq")) != -1) // getopt returns -1 when all options are parsed
{
switch (i)
{
case 'n':
Nodo = 1;
break;
case 'q':
Quiet = 1;
break;
case 'i':
b_inv = true;
break;
default:
break;
}
}

// Read list of file names into temp file and count
tmp = tmpfile ();
if (NULL == tmp)
{
(void) fprintf (stderr, "could not open temporary file, exiting\n");
exit (-1);
}
nfiles = 0;
while (fgets (buf, sizeof (buf), stdin) != NULL) // also keeps a last line without trailing newline and stops on read errors
{
nfiles++;
if (EOF == fputs (buf, tmp))
{
(void) fprintf (stderr, "could not write to temporary file, exiting\n");
exit (-1);
}
}
// Now that we know how many there are, allocate space and re-read
rewind (tmp);
if ((names = (char **) calloc (nfiles, sizeof (char *))) == NULL) // malloc changed to safer calloc.
{
(void) fprintf (stderr, "%s: Out of memory\n", argv[0]);
exit (1);
}
for (i = 0; i < nfiles; i++)
{
(void) fgets (buf, sizeof (buf), tmp);
if ((cp = strchr (buf, '\n')) != NULL)
*cp = '\0'; // Chomp newlines
if ((names[i] = strndup (buf, BUFSIZ)) == NULL) // replaced strdup by safer strndup
{
(void) fprintf (stderr, "%s: Out of memory\n", argv[0]);
exit (1);
}
}
(void) fclose (tmp);
if (not b_inv)
{
qsort (names, nfiles, sizeof (char *), fcmp);
if (!Quiet)
(void) printf ("Files deleted: %d, Disk blocks reclaimed: %d\n", Files_deleted, Blocks_reclaimed);
}
else // inverse mode
{
qsort (names, nfiles, sizeof (char *), fcmp1);
if (!Quiet)
(void) printf ("Links expanded: %d, Disk blocks declaimed: %d\n", i_files_expanded, i_blocks_declaimed);
}
exit (0);
} // main



// This is the comparison function called by qsort, where the real work
// is done as a side effect. Strictly speaking this is not correct C:
// if the same objects are passed more than once to the comparison
// function, the results must be consistent with one another (C99, 7.20.5).
// If errors such as a failed lstat occur, this is not yet fulfilled.
int
fcmp (const void *a, const void *b)
{
struct stat sa, sb;
FILE *fa, *fb;
int c1, c2;
int rval = 0;
const char *filea, *fileb;

// Nonexistent or non-plain files are less than any other file
if (NULL == a)
return -1;
filea = *(const char **) a;
if ((-1 == lstat (filea, &sa)) or ! S_ISREG (sa.st_mode))
{
fprintf (stderr, "lstat(%s) failed\n", filea);
return -1;
}

if (NULL == b)
return 1;
fileb = *(const char **) b;
if ((-1 == lstat (fileb, &sb)) or ! S_ISREG (sb.st_mode))
{
fprintf (stderr, "lstat(%s) failed\n", fileb);
return 1;
}
// Smaller files are "less"
if (sa.st_size < sb.st_size)
return -1;
if (sa.st_size > sb.st_size)
return 1;
if ((sa.st_dev == sb.st_dev) and (sa.st_ino == sb.st_ino))
return 0; // Files are hard linked

// We now know both files exist, are plain files, are the same size,
// and are not already linked, so compare their contents
if (NULL == (fa = fopen (filea, "r")))
return -1; // Unreadable files are "less than"
if (NULL == (fb = fopen (fileb, "r")))
{
fclose (fa);
return 1;
}
while (((c1 = fgetc (fa)) != EOF) and ((c2 = fgetc (fb)) != EOF)) // compare the two files
{
if (c1 < c2)
{
rval = -1;
break;
}
else
{
if (c1 > c2)
{
rval = 1;
break;
}
}
}
(void) fclose (fa);
(void) fclose (fb);

if ((0 == rval) and (sa.st_dev == sb.st_dev) and (sa.st_size > 0))
{
// Files are identical, have size > 0 and are on the same device, so link them.
// We prefer to keep the older copy, or if they're the
// same date, the one with more (hard) links
if ((sb.st_mtime > sa.st_mtime) or (sb.st_nlink < sa.st_nlink))
{
mc_SWAP_ITEMS (sa, sb, s_swap, sizeof (struct stat)); // swap items to keep original sb
mc_SWAP_ITEMS (filea, fileb, s_swap, sizeof (char *)); // swap file name pointers
}
if (1 == sa.st_nlink) // file a is no (hard) link
{
Files_deleted++;
Blocks_reclaimed += sa.st_blocks;
}
if (!Nodo and (unlink (filea)))
{
(void) fprintf (stderr, "unlink(%s) failed\n", filea);
perror ("unlink");
exit (1);
}
if (!Quiet)
(void) printf ("ln %s %s: %ld->%ld, %ld->%ld\n", fileb, filea, (long) sb.st_nlink, (long) sb.st_nlink + 1, (long) sa.st_nlink, (long) sa.st_nlink - 1); // nlink_t is not int on every platform
if (!Nodo and (-1 == link (fileb, filea)))
{
(void) fprintf (stderr, "link(%s,%s) failed\n", fileb, filea);
perror ("link");
exit (1);
}
}
return rval;
} // fcmp



// This is the second comparison function for inverse mode
// compare: if both files have the same inode on same device:
// delete the first file and copy the second to the first
int
fcmp1 (const void *a, const void *b)
{
struct stat sa, sb;
FILE *fa, *fb;
int i_c;
int rval = 0;
const char *filea, *fileb;
// char cline[0xfff] = { '\0' }; // command line

// Nonexistent or non-plain files are less than any other file
if (NULL == a)
return -1;
filea = *(const char **) a;
if ((-1 == lstat (filea, &sa)) or ! S_ISREG (sa.st_mode))
{
fprintf (stderr, "lstat(%s) failed\n", filea);
return -1;
}
if (NULL == b)
return 1;
fileb = *(const char **) b;
if ((-1 == lstat (fileb, &sb)) or ! S_ISREG (sb.st_mode))
{
fprintf (stderr, "lstat(%s) failed\n", fileb);
return 1;
}
// Smaller files are "less"
if (sa.st_size < sb.st_size)
return -1;
if (sa.st_size > sb.st_size)
return 1;
if ((sa.st_dev == sb.st_dev) and (sa.st_ino == sb.st_ino))
{ // Files are hard linked
i_files_expanded++;
i_blocks_declaimed += sb.st_blocks;
if (!Nodo) // expand the hard link
{
if (remove (filea)) // remove failed
{
(void) fprintf (stderr, "remove(%s) failed\n", filea);
perror ("remove");
exit (1);
}
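// copy fb into fa: recreate filea as an independent copy of fileb
if (NULL == (fa = fopen (filea, "w")))
{
(void) fprintf (stderr, "expanding the hard link(%s) failed; lost %s\n", fileb, filea);
perror ("expand");
return 1;
}
if (NULL == (fb = fopen (fileb, "r")))
{
(void) fprintf (stderr, "expanding the hard link(%s) failed; lost %s\n", fileb, filea);
perror ("expand");
(void) fclose (fa); // do not leak the already opened output file
return 1;
}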
while ((i_c = fgetc (fb)) != EOF) // copy fb to fa
fputc (i_c, fa);
(void) fclose (fa);
(void) fclose (fb);
/* old version with cp; works not e. g. on SuSE 9.2 after prelink -afmRv
snprintf (cline, sizeof (cline) - 1, "cp -a \"%s \" \"%s \"", fileb, filea);
if (system (cline))
{
(void) fprintf (stderr, "expanding the hard link(%s) failed; lost %s\n", fileb, filea);
perror ("expand");
exit (1);
}
*/
if (!Quiet)
(void) printf ("expand %s %s: %ld->%ld\n", fileb, filea, (long) sb.st_nlink, (long) sb.st_nlink - 1);
}
}
else // files with same size but different inode: files with lower device or inode number are less
{
if (sa.st_dev != sb.st_dev)
rval = (sa.st_dev < sb.st_dev) ? -1 : 1;
else
rval = (sa.st_ino < sb.st_ino) ? -1 : 1;
}
return rval;
} // fcmp1
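
The caution about lost files in inverse mode comes from removing the link before the copy exists. A possible variant, sketched under the assumption of a POSIX system, writes the copy to a temporary file first and only then renames it over the link, so a full disk leaves the original name untouched; it also carries over the permission bits, which is one item on the todo list. The names expand_link and ".expand-tmp" are made up for this sketch; ownership and timestamps are not preserved.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper, not part of dupmerge: turn the hard link `path`
   into an independent copy. Returns 1 if it was expanded, 0 if there was
   nothing to expand, and -1 on error (path is then left as it was). */
int
expand_link (const char *path)
{
  struct stat st;
  char tmp_name[4096];
  char buf[BUFSIZ];
  FILE *in, *out;
  size_t n;
  int len, fd, ok = 1;

  assert (path != NULL);
  if (lstat (path, &st) != 0 || !S_ISREG (st.st_mode))
    return -1;
  if (st.st_nlink < 2)
    return 0;                   /* only one name: not a hard link */

  len = snprintf (tmp_name, sizeof tmp_name, "%s.expand-tmp", path);
  if (len < 0 || (size_t) len >= sizeof tmp_name)
    return -1;                  /* resulting path would be too long */

  if ((in = fopen (path, "rb")) == NULL)
    return -1;
  /* O_EXCL: refuse to overwrite a file that already has the temporary name. */
  fd = open (tmp_name, O_WRONLY | O_CREAT | O_EXCL, 0600);
  if (fd < 0 || (out = fdopen (fd, "wb")) == NULL)
    {
      if (fd >= 0)
        {
          (void) close (fd);
          (void) unlink (tmp_name);
        }
      (void) fclose (in);
      return -1;
    }

  while ((n = fread (buf, 1, sizeof buf, in)) > 0)
    if (fwrite (buf, 1, n, out) != n)
      {
        ok = 0;                 /* e.g. disk full */
        break;
      }
  if (ferror (in))
    ok = 0;
  (void) fclose (in);
  if (fclose (out) != 0)        /* flushing can also fail on a full disk */
    ok = 0;

  if (ok && chmod (tmp_name, st.st_mode & 07777) != 0)
    ok = 0;
  if (ok && rename (tmp_name, path) != 0)       /* replaces the link atomically */
    ok = 0;
  if (!ok)
    {
      (void) unlink (tmp_name); /* the original name is still intact */
      return -1;
    }
  return 1;
}
```

Because the old name is replaced only after the copy is complete, an interrupted run leaves at worst a stray temporary file instead of a missing one.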