Tuesday, March 15, 2011

How to get the size of a gunzipped file in vim

When viewing (or editing) a .gz file, vim knows to locate gunzip and display the file properly.
In such cases, getfsize(expand("%")) would be the size of the gzipped file.

Is there a way to get the size of the expanded file?

[EDIT]
Another way to solve this might be getting the size of current buffer, but there seems to be no such function in vim. Am I missing something?

From stackoverflow
  • If you're on Unix/Linux, try

    :%!wc -c
    

    This filters the buffer through wc -c, replacing its contents with the byte count. (It also works on Windows if you have e.g. Cygwin installed.) Then hit u to get your content back.

    HTH

    Zsolt Botykai : Someone should explain why it was downvoted, as it works?
  • There's no easy way to get the uncompressed size of a gzipped file, short of uncompressing it and using the getfsize() function. That might not be what you want. I took a look at RFC 1952 - GZIP File Format Specification, and the only thing that might be useful is the ISIZE field, which contains "...the size of the original (uncompressed) input data modulo 2^32".

    EDIT:

    I don't know if this helps, but here's some proof-of-concept C code I threw together that retrieves the value of the ISIZE field in a gzip'd file. It works for me using Linux and gcc, but your mileage may vary. If you compile the code and pass in a gzip'd filename as a parameter, it prints the ISIZE value, which is the uncompressed size of the original file as long as that file was smaller than 4 GiB.

    #include <stdio.h>
    #include <stdlib.h>
    #include <errno.h>
    #include <string.h>
    
    int main(int argc, char *argv[])
    {
        FILE *fp = NULL;
        int  i=0;
    
        if ( argc != 2 ) {
            fprintf(stderr, "Must specify file to process.\n" );
            return -1;
        }
    
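        // Open the file for reading in binary mode, so the bytes are not
        // translated on platforms that distinguish text from binary files
        if (( fp = fopen( argv[1], "rb" )) == NULL ) {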
            fprintf( stderr, "Unable to open %s for reading:  %s\n", argv[1], strerror(errno));
            return -1;
        }
    
        // Look at the first two bytes and make sure it's a gzip file
        int c1 = fgetc(fp);
        int c2 = fgetc(fp);
        if ( c1 != 0x1f || c2 != 0x8b ) {
            fprintf( stderr, "File is not a gzipped file.\n" );
            return -1;
        }
    
    
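        // Seek to four bytes from the end of the file
        if ( fseek(fp, -4L, SEEK_END) != 0 ) {
            fprintf( stderr, "Unable to seek in %s:  %s\n", argv[1], strerror(errno));
            fclose(fp);
            return -1;
        }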
    
        // Array containing the last four bytes
        unsigned char read[4];
    
        for (i=0; i<4; ++i ) {
            int charRead = 0;
            if ((charRead = fgetc(fp)) == EOF ) {
                // This shouldn't happen
                fprintf( stderr, "Unexpected end-of-file\n" );
                exit(1);
            }
            else
                read[i] = (unsigned char)charRead;
        }
    
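        // Assemble the last four bytes into an unsigned value.  ISIZE is
        // stored least-significant byte first, so building it by hand avoids
        // depending on the host's byte order or on the width of int.
        unsigned long intval = (unsigned long)read[0]
                             | ((unsigned long)read[1] << 8)
                             | ((unsigned long)read[2] << 16)
                             | ((unsigned long)read[3] << 24);
    
        printf( "The uncompressed filesize was %lu bytes (0x%08lx hex)\n", intval, intval );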
    
        fclose(fp);
    
        return 0;
    }
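
    To illustrate the "modulo 2^32" caveat, here is a small sketch, separate from the program above, that decodes the four trailer bytes explicitly as little-endian and computes what ISIZE would hold for a given original size. The function names decode_isize and isize_for are made up for this example; they are not part of any library.

    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Decode a 4-byte gzip ISIZE trailer.  RFC 1952 stores multi-byte
       values least-significant byte first, so the value is assembled by
       hand rather than copied into an int. */
    uint32_t decode_isize(const unsigned char trailer[4])
    {
        assert(trailer != NULL);
        return (uint32_t)trailer[0]
             | ((uint32_t)trailer[1] << 8)
             | ((uint32_t)trailer[2] << 16)
             | ((uint32_t)trailer[3] << 24);
    }

    /* Hypothetical helper: the value ISIZE would contain for an original
       file of the given size, i.e. the size modulo 2^32. */
    uint32_t isize_for(uint64_t original_size)
    {
        return (uint32_t)(original_size & 0xFFFFFFFFu);
    }
    ```

    For a 5 GiB original, isize_for(5ULL << 30) gives 1073741824 (1 GiB), so the trailer alone cannot tell a 1 GiB file from a 5 GiB one.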
    
  • This appears to work for getting the byte count of a buffer:

    (line2byte(line("$")+1)-1)

  • From within the vim editor, try this:

    <Esc>:!wc -c my_zip_file.gz
    

    That displays the number of bytes in the file as stored on disk, which for a .gz file is the compressed size.
