had

hadoopy

Python MapReduce library written in Cython. Visit us in #hadoopy on freenode. See the link below for documentation and tutorials.

Showing:

Popularity

Downloads/wk

0

GitHub Stars

244

Maintenance

Last Commit

9yrs ago

Contributors

11

Package

Dependencies

0

License

GPLv3

Categories

Readme

Brandyn White bwhite@dappervision.com Andrew Miller amiller@dappervision.com

Source https://github.com/bwhite/hadoopy/ Issues https://github.com/bwhite/hadoopy/issues Docs http://bwhite.github.com/hadoopy/

IRC: #hadoopy @ freenode.net

Requirements python development headers (python-dev), build tools (build-essential)

Optional cython (>=.13) (without this it falls back to the pregenerated .c files)

Features

  • oozie support
  • Automated job parallelization 'auto-oozie' available in the hadoopy_flow project (maintained out of branch)
  • typedbytes support (very fast)
  • Local execution of unmodified MapReduce job with launch_local
  • Read/write sequence files of TypedBytes directly to HDFS from python (readtb, writetb)
  • Works on OS X
  • Allows printing to stdout and stderr in Hadoop tasks without causing problems (uses the 'pipe hopping' technique, both are available in the task's stderr)
  • critical path is in Cython
  • works on clusters without any extra installation, Python, or any Python libraries (uses Pyinstaller that is included in this source tree)
  • Simple HDFS access (readtb and ls) inside Python, even inside running jobs
  • Unit test interface
  • Reporting using status and counters (and print statements! no need to be scared of them in Hadoopy)
  • Supports design patterns in the Lin/Dyer book (http://www.umiacs.umd.edu/~jimmylin/book.html)

Limitations

Used in

  • A Case for Query by Image and Text Content: Searching Computer Help using Screenshots and Keywords (to appear in WWW'11)
  • Web-Scale Computer Vision using MapReduce for Multimedia Data Mining (at KDD'10)
  • Vitrieve: Visual Search engine
  • Picarus: Hadoop computer vision toolbox

Ubuntu Install (others are similar) sudo apt-get install python-dev build-essential sudo python setup.py install

Rate & Review

Great Documentation0
Easy to Use0
Performant0
Highly Customizable0
Bleeding Edge0
Responsive Maintainers0
Poor Documentation0
Hard to Use0
Slow0
Buggy0
Abandoned0
Unwelcoming Community0
100
No reviews found
Be the first to rate

Alternatives

No alternatives found

Tutorials

No tutorials found
Add a tutorial