Introduction to Disco, a MapReduce framework built in Python and Erlang.
Reviews the basic MapReduce paradigm, dataflow, and file and job distribution, then explains the Disco Distributed Filesystem (DDFS) before turning to use-case scenarios in next-generation genomic sequencing.