I am looking to get accurate measurements of sparse files in Java (i.e. the real size on disk, not the apparent size that includes all the 0's).
In C++ on Windows one
If you want a pure Java solution, you can try jnr-posix. Here's an example implementation:
import jnr.posix.*;
final POSIX p = POSIXFactory.getPOSIX();
final int S_BLKSIZE = 512; // from sys/stat.h
final FileStat stat = p.stat("/path/to/file");
final long bytes = stat.blocks() * S_BLKSIZE;
However, this function currently doesn't work on Windows. Until that's fixed, you have to use platform-specific code like the following.
On Linux use the stat64 system call
The st_blocks field indicates the number of blocks allocated to the file, in 512-byte units. (This may be smaller than st_size/512 when the file has holes.)
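As a rough illustration of the arithmetic in that man page excerpt, the sketch below converts a block count to bytes and compares it with the apparent size. The class and method names are hypothetical, and the block count is assumed to come from something like stat.blocks() in the jnr-posix snippet above.

```java
// Hypothetical helper; the names are illustrative and not part of any library.
final class SparseSizeMath {
    // st_blocks is counted in 512-byte units, independent of the filesystem block size.
    static final long S_BLKSIZE = 512L;

    // Bytes actually allocated on disk for a file with the given st_blocks value.
    static long allocatedBytes(long stBlocks) {
        return Math.multiplyExact(stBlocks, S_BLKSIZE);
    }

    // A file has holes when fewer bytes are allocated than its apparent size (st_size).
    static boolean looksSparse(long apparentSize, long stBlocks) {
        return allocatedBytes(stBlocks) < apparentSize;
    }
}
```

For instance, a file with an apparent size of 1,048,576 bytes that reports 8 blocks occupies 4,096 bytes on disk and would be treated as sparse.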
From the command line, the same value is shown by stat (the Blocks field, or printed with the %b format specifier) and by du (when run without the --apparent-size option):
--apparent-size - print apparent sizes, rather than disk usage; although the apparent size is usually smaller, it may be larger due to holes in ('sparse') files, internal fragmentation, indirect blocks, and the like
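If shelling out is acceptable, the %b value can also be read from Java. The following is a minimal sketch, assuming a Linux system with GNU stat on the PATH and Java 9 or later; the StatBlocks class is hypothetical. It asks for both %b (allocated blocks) and %B (bytes per reported block) so the unit is not hard-coded.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical wrapper around the GNU stat command line tool.
final class StatBlocks {
    // Parses the output of `stat -c "%b %B"`: block count, then bytes per block.
    static long parseAllocatedBytes(String statOutput) {
        String[] parts = statOutput.trim().split("\\s+");
        if (parts.length != 2) {
            throw new IllegalArgumentException("unexpected stat output: " + statOutput);
        }
        return Math.multiplyExact(Long.parseLong(parts[0]), Long.parseLong(parts[1]));
    }

    // Runs stat on the given path and returns the number of bytes allocated on disk.
    static long allocatedBytes(String path) throws IOException, InterruptedException {
        Process p = new ProcessBuilder("stat", "-c", "%b %B", path)
                .redirectErrorStream(true)
                .start();
        String out = new String(p.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
        if (p.waitFor() != 0) {
            throw new IOException("stat failed: " + out.trim());
        }
        return parseAllocatedBytes(out);
    }
}
```

Spawning a process per file is slow, so this is more suited to occasional checks than to scanning a large directory tree.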
On Windows you can call the GetCompressedFileSize API
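GetCompressedFileSize returns the low-order 32 bits of the size and writes the high-order 32 bits through its second parameter. If a native binding hands both halves back to Java as int values (an assumption; the DwordPair class below is hypothetical), they could be recombined roughly like this:

```java
// Hypothetical helper for joining the two DWORD halves of a 64-bit file size.
final class DwordPair {
    // Java ints are signed, so the low half is masked to avoid sign extension.
    static long combine(int high, int low) {
        return ((long) high << 32) | (low & 0xFFFFFFFFL);
    }
}
```

Ignoring the high half gives wrong results for files that occupy 4 GiB or more on disk.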
Alternatively, you can run fsutil file layout with admin rights to get detailed information about a file. Find the $DATA stream.
If you see Resident | No clusters allocated in the flags, as in the output below, then it's a resident file and its size on disk is 0.
PS C:\Users> fsutil file layout .\desktop.ini
********* File 0x000800000003dbde *********
File reference number : 0x000800000003dbde
File attributes : 0x00000026: Hidden | System | Archive
File entry flags : 0x00000000
Link (ParentID: Name) : 0x001f0000000238c8: HLINK Name : \Users\desktop.ini
...
Stream : 0x080 ::$DATA
Attributes : 0x00000000: *NONE*
Flags : 0x0000000c: Resident | No clusters allocated
Size : 174
Allocated Size : 176
If you don't see the resident flag, check the Allocated Size field, which is the file's size on disk.
PS D:\> fsutil file layout .\nonresident.txt
********* File 0x000400000000084e *********
File reference number : 0x000400000000084e
File attributes : 0x00000020: Archive
File entry flags : 0x00000000
Link (ParentID: Name) : 0x0005000000000005: HLINK Name : \nonresident.txt
...
Stream : 0x080 ::$DATA
Attributes : 0x00000000: *NONE*
Flags : 0x00000000: *NONE*
Size : 1,520
Allocated Size : 4,096
Extents : 1 Extents
: 1: VCN: 0 Clusters: 1 LCN: 1,497,204
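The two cases above could be checked programmatically along these lines. This is only a sketch: the FsutilLayout class is hypothetical, it assumes the output format shown in the two samples (including comma thousands separators, which may vary by locale), and it only looks at the unnamed ::$DATA stream.

```java
// Hypothetical parser for the text printed by `fsutil file layout`.
final class FsutilLayout {
    // Returns 0 for resident data, otherwise the Allocated Size of the unnamed $DATA stream.
    static long sizeOnDisk(String layoutOutput) {
        boolean inDataStream = false;
        for (String raw : layoutOutput.split("\\R")) {
            String line = raw.trim();
            if (line.startsWith("Stream")) {
                // Named streams end in ":name:$DATA", so this matches only the unnamed one.
                inDataStream = line.endsWith("::$DATA");
                continue;
            }
            if (!inDataStream) continue;
            int colon = line.indexOf(':');
            if (colon < 0) continue;
            String key = line.substring(0, colon).trim();
            String value = line.substring(colon + 1).trim();
            if (key.equals("Flags") && value.contains("Resident")) {
                return 0L; // data lives inside the MFT record, no clusters allocated
            }
            if (key.equals("Allocated Size")) {
                return Long.parseLong(value.replace(",", ""));
            }
        }
        throw new IllegalArgumentException("no ::$DATA stream found");
    }
}
```

Fed the first sample output it returns 0, and fed the second it returns 4096.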
For more information you can read the below questions
Since an answer was given for Windows, I will try to supply one for Linux.
I am not sure, but I think this will do the trick (C++):
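#include <sys/ioctl.h>  // ioctl
#include <linux/fs.h>   // BLKGETSIZE64
uint64_t file_size_in_bytes = 0;
ioctl(file, BLKGETSIZE64, &file_size_in_bytes); // file: open descriptor; BLKGETSIZE64 is defined for block devices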
This can be loaded in the same way as described in @Aniket's answer (JNI).
If you are doing it on Windows alone, you can write it with the Java Native Interface:
class NativeInterface{
public static native long GetCompressedFileSize(String filename);
}
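and in the C/C++ file:
#include <jni.h>
#include <windows.h>

extern "C"
JNIEXPORT jlong JNICALL Java_NativeInterface_GetCompressedFileSize
(JNIEnv *env, jclass cls, jstring javaString) // static native methods receive a jclass
{
    // GetStringUTFChars yields modified UTF-8, so non-ASCII paths would need the wide-character API.
    const char *nativeString = env->GetStringUTFChars(javaString, 0);
    if (nativeString == NULL) return -1; // an OutOfMemoryError is already pending
    DWORD high = 0;
    // The return value is the low 32 bits; the high 32 bits come back through the pointer.
    DWORD low = GetCompressedFileSizeA(nativeString, &high);
    DWORD err = GetLastError();
    env->ReleaseStringUTFChars(javaString, nativeString);
    if (low == INVALID_FILE_SIZE && err != NO_ERROR) return -1; // the call failed
    return ((jlong) high << 32) | low;
}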