...

  • HPC storage has been purchased to meet expected capacity requirements; evaluation of vendor responses is dominated by $/TB.
  • Our current (2021) general use storage platform offers very much entry-level capabilities when it comes to performance.
  • Researchers should avoid running parallel/concurrent tar, gzip, etc. commands.  Utilities such as pbzip2 should also be avoided.
    • If your computational research is I/O intensive, ensure that it is configured to use local scratch space (/fast).
    • Consider other researchers and create a separate job that runs all your post-job I/O operations sequentially.
  • Compression tools don't provide much benefit when working on binary data files - it's generally better to leave binary files uncompressed.
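
The advice above can be sketched as a single sequential packaging step. This is only an illustration, not a site-provided script: SCRATCH and DEST are hypothetical variables (on the cluster, SCRATCH would point somewhere under /fast), and temporary directories and placeholder files stand in for real job output so that the sketch runs anywhere.

```shell
#!/bin/sh
# Sketch: package job output in one sequential step, working on local scratch.
# SCRATCH and DEST are assumptions made for illustration only.
set -eu

SCRATCH="${SCRATCH:-$(mktemp -d)}"   # stand-in for a job directory under /fast
DEST="${DEST:-$(mktemp -d)}"         # stand-in for general use storage

# Placeholder job output so the sketch is runnable as-is.
mkdir -p "$SCRATCH/results"
for i in 1 2 3; do
    echo "result $i" > "$SCRATCH/results/out_$i.txt"
done

# One tar process, uncompressed, reading and writing on local scratch.
tar cf "$SCRATCH/results.tar" -C "$SCRATCH" results

# One large sequential copy to shared storage instead of many small ones.
cp "$SCRATCH/results.tar" "$DEST/results.tar"
```

Running one tar and one copy, rather than several at once, keeps the load on the shared platform to a single sequential stream.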

Finally, a warning for zip users: do not use zip when you are working with more than 2GB of files, as you may have difficulty with unzip. If you have already created a large zip file, one solution that has worked for me is to perform the unzip on macOS (OS X). I cannot guarantee that this workaround is robust. Generally, I'd recommend using tar for grouping multiple files into a single file. Use gzip or bzip2 for compression, if wanted or required. Note that compression can be done as part of the tar command, for example:

tar cfz proj1.tar.gz proj1/

will tar and compress the contents of a proj1 subdirectory.
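
For completeness, the same option style can be used to list and extract such an archive. The sketch below runs in a temporary directory with a placeholder proj1/ tree; the file name data.txt and the restore/ directory are hypothetical.

```shell
#!/bin/sh
# Sketch: create, list and extract a compressed tar archive.
# Runs in a throwaway directory; proj1/data.txt is placeholder content.
set -eu

WORK="$(mktemp -d)"
cd "$WORK"
mkdir proj1
echo "sample data" > proj1/data.txt

tar cfz proj1.tar.gz proj1/         # create and compress, as in the example above
tar tfz proj1.tar.gz                # list the contents without extracting

mkdir restore
tar xfz proj1.tar.gz -C restore     # extract into restore/ instead of the current directory
```

Listing the archive before deleting the originals is a cheap check that the archive is readable.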

Performance versus block size (read/write)

The following output shows the performance achievable, on SATA SSDs, for reads/writes of differing block sizes.

Block size 1K Write: 256000000 bytes (256 MB, 244 MiB) copied, 10.0142 s, 25.6 MB/s
Block size 1K Read : 256000000 bytes (256 MB, 244 MiB) copied, 9.18361 s, 27.9 MB/s
Block size 2K Write: 256000000 bytes (256 MB, 244 MiB) copied, 5.29912 s, 48.3 MB/s
Block size 2K Read : 256000000 bytes (256 MB, 244 MiB) copied, 4.87071 s, 52.6 MB/s
Block size 4K Write: 256000000 bytes (256 MB, 244 MiB) copied, 2.97716 s, 86.0 MB/s
Block size 4K Read : 256000000 bytes (256 MB, 244 MiB) copied, 2.54667 s, 101 MB/s
Block size 8K Write: 256000000 bytes (256 MB, 244 MiB) copied, 1.51576 s, 169 MB/s
Block size 8K Read : 256000000 bytes (256 MB, 244 MiB) copied, 1.3195 s, 194 MB/s
Block size 16K Write: 256000000 bytes (256 MB, 244 MiB) copied, 0.843084 s, 304 MB/s
Block size 16K Read : 256000000 bytes (256 MB, 244 MiB) copied, 0.676777 s, 378 MB/s
Block size 32K Write: 255983616 bytes (256 MB, 244 MiB) copied, 0.487291 s, 525 MB/s
Block size 32K Read : 255983616 bytes (256 MB, 244 MiB) copied, 0.411139 s, 623 MB/s
Block size 64K Write: 255983616 bytes (256 MB, 244 MiB) copied, 0.287594 s, 890 MB/s
Block size 64K Read : 255983616 bytes (256 MB, 244 MiB) copied, 0.248797 s, 1.0 GB/s
Block size 128K Write: 255983616 bytes (256 MB, 244 MiB) copied, 0.324746 s, 788 MB/s
Block size 128K Read : 255983616 bytes (256 MB, 244 MiB) copied, 0.153213 s, 1.7 GB/s
Block size 256K Write: 255852544 bytes (256 MB, 244 MiB) copied, 0.705291 s, 363 MB/s
Block size 256K Read : 255852544 bytes (256 MB, 244 MiB) copied, 0.108035 s, 2.4 GB/s
Block size 512K Write: 255852544 bytes (256 MB, 244 MiB) copied, 0.739421 s, 346 MB/s
Block size 512K Read : 255852544 bytes (256 MB, 244 MiB) copied, 0.0776258 s, 3.3 GB/s
Block size 1024K Write: 255852544 bytes (256 MB, 244 MiB) copied, 0.748344 s, 342 MB/s
Block size 1024K Read : 255852544 bytes (256 MB, 244 MiB) copied, 0.0622923 s, 4.1 GB/s
Block size 2048K Write: 255852544 bytes (256 MB, 244 MiB) copied, 0.801381 s, 319 MB/s
Block size 2048K Read : 255852544 bytes (256 MB, 244 MiB) copied, 0.0554749 s, 4.6 GB/s
Block size 4096K Write: 255852544 bytes (256 MB, 244 MiB) copied, 0.781268 s, 327 MB/s
Block size 4096K Read : 255852544 bytes (256 MB, 244 MiB) copied, 0.0500531 s, 5.1 GB/s
Block size 8192K Write: 251658240 bytes (252 MB, 240 MiB) copied, 0.781807 s, 322 MB/s
Block size 8192K Read : 251658240 bytes (252 MB, 240 MiB) copied, 0.0461191 s, 5.5 GB/s
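
Output of this shape could be produced with a dd loop along the following lines. This is a sketch under assumptions, not the command that was actually used: the byte counts above are consistent with copying 256000000 / block size whole blocks (for example, 7812 blocks of 32K is 255983616 bytes), but the exact dd options are not recorded, so no direct or synchronous I/O flags are included. The function name bench_block_sizes is hypothetical, and GNU dd is assumed for the format of the summary line.

```shell
#!/bin/sh
# Sketch: time dd writes and reads for a range of block sizes.
# Assumes GNU dd, whose last stderr line is the "bytes copied" summary.
set -eu

# bench_block_sizes TOTAL_BYTES TESTFILE   (hypothetical helper)
bench_block_sizes() {
    total="$1"
    testfile="$2"
    for kb in 1 2 4 8 16 32 64 128 256 512 1024 2048 4096 8192; do
        bs=$((kb * 1024))
        count=$((total / bs))        # whole blocks only, so totals vary slightly
        printf 'Block size %sK Write: ' "$kb"
        dd if=/dev/zero of="$testfile" bs="$bs" count="$count" 2>&1 | tail -n 1
        printf 'Block size %sK Read : ' "$kb"
        dd if="$testfile" of=/dev/null bs="$bs" count="$count" 2>&1 | tail -n 1
    done
    rm -f "$testfile"
}
```

A call such as `bench_block_sizes 256000000 "$(mktemp)"` would match the total size used above. The figures will depend on where the test file lives and on any caching in effect.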

Using an 8MB block size gives about 200x the read performance achieved with a 1K block size. The situation is similar when moving small and large files across a network: the smaller the file, the lower the speed, unless you have already hit line rate. The story told by the brief output above is universal: small files move around at much slower rates than large files.
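
One common way to act on this is to bundle many small files into a single stream before moving them. The sketch below does so locally with a tar pipe. SRC and DST are hypothetical temporary directories filled with placeholder files. Across a network, the receiving tar would typically run on the remote side, which is not shown here.

```shell
#!/bin/sh
# Sketch: move many small files as one tar stream instead of file by file.
# SRC and DST are throwaway directories used for illustration only.
set -eu

SRC="$(mktemp -d)"
DST="$(mktemp -d)"

# Create 100 small placeholder files.
i=1
while [ "$i" -le 100 ]; do
    echo "sample $i" > "$SRC/file_$i.txt"
    i=$((i + 1))
done

# Pack on one side and unpack on the other; the data crosses the pipe as one stream.
tar cf - -C "$SRC" . | tar xf - -C "$DST"
```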