Ivory: A Hadoop toolkit for web-scale information retrieval research. Keywords: ivoryproject, ivory, team, api, publications, experiments, home, hadoop, retrieval, web, information, research, scale, release, toolkit, notes, collection, tar, clueweb, experimental, indexing, lucene, cloud, javadoc, pairwise, pipeline, katta, gov, disks, trec, similarity, xhtml, css, github, project, neoease, preprocessing, computation, results, document. Site: Ivoryproject.org
Cloud9: A MapReduce Library for Hadoop. Keywords: cloud9lib, mapreduce, hadoop, library, cloud, design, exercises, home, algorithms, api, patterns, working, getting, started, guide, data, text, records, clueweb, collection, processing, intensive, collections, ibm, document, google, github, staging, sequencefiles, standard, standalone, pig, complex, mode, clue, cluster, sample, frequently, user, questions. Site: Cloud9lib.org