hpc.social


High Performance Computing
Practitioners
and friends /#hpc
Share: 
This is a crosspost from   Jonathan Dursi R&D computing at scale. See the original post here.

On Random vs. Streaming I/O Performance; Or seek(), and You Shall Find --- Eventually.

At the Simpson Lab blog, I’ve written a post on streaming vs random access I/O performance, an important topic in bioinformatics. Using a very simple problem (randomly choosing lines in a non-indexed text file) I give a quick overview of the file system stack and what it means for streaming performance, and reservoir sampling for uniform random online sampling.