
The following is an archive of a post made to our 'vox mailing list' by one of its subscribers.

Re: [vox] Who thinks Java is cool?

On 06/16/2011 09:13 AM, Alex Mandel wrote:
> I'm curious to hear yours and Bill's take on Celery:
> http://celeryproject.org/

Interesting.  I'm not sure exactly what it's meant for though.  I am a fan
of message passing, and in particular of systems that are designed to be
lightweight and async.

I've written some code with AMP, using it as the basis for a p2p
distributed backup system with performance as one of the primary goals.

Amusingly, the AMP example is nearly identical to the Celery project's,
although somewhat more verbose:

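from twisted.protocols import amp

class Sum(amp.Command):
    # Two integers in, one integer back.
    arguments = [(b'a', amp.Integer()),
                 (b'b', amp.Integer())]
    response = [(b'total', amp.Integer())]

class JustSum(amp.AMP):
    def sum(self, a, b):
        total = a + b
        print('Did a sum: %d + %d = %d' % (a, b, total))
        return {'total': total}
    # Register sum() as the handler for incoming Sum commands.
    Sum.responder(sum)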

> I did talk to several people at a recent conference who were doing
> multi-processing in python and seemed happy with it, but they were
> Geographers not Computer Science and in Geography python is the language
> to go with these days for interoperability and libraries.

Yeah, I'm pretty pleased with multiprocessing.  I spent a weekend trying
to figure out why nothing was scaling with my Python implementation
before I found (and cursed) the GIL.

Basically I wanted the following threads:
* one to talk to each file system (per disk head/RAID) and find changed
  files
* one per CPU core/thread for calculating SHA256 or Skein checksums
  and encryption
* one for feeding encrypted checksummed blobs to the p2p server

Multiprocessing handles this very well; I just set up a queue and I get
perfect scaling.
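
As a rough sketch of that queue setup (not the actual backup code), here is
one way to fan checksum work out to one process per core with the standard
multiprocessing module.  The names checksum_worker and checksum_all are made
up for illustration, it is written for Python 3, and it only covers the
SHA256 step, not the file system walking or the encryption.

```python
import hashlib
import multiprocessing as mp

SENTINEL = None  # placed on the task queue once per worker to tell it to stop


def checksum_worker(tasks, results):
    """Pull (name, data) pairs off the task queue until the sentinel arrives."""
    while True:
        item = tasks.get()
        if item is SENTINEL:
            break
        name, data = item
        results.put((name, hashlib.sha256(data).hexdigest()))


def checksum_all(blobs, workers=None):
    """Hash every blob in `blobs` (a dict of name -> bytes) in worker processes."""
    workers = workers or mp.cpu_count()
    tasks, results = mp.Queue(), mp.Queue()
    procs = [mp.Process(target=checksum_worker, args=(tasks, results))
             for _ in range(workers)]
    for p in procs:
        p.start()
    for item in blobs.items():
        tasks.put(item)
    for _ in procs:
        tasks.put(SENTINEL)
    # Collect results before joining so no worker blocks on a full pipe.
    digests = dict(results.get() for _ in blobs)
    for p in procs:
        p.join()
    return digests
```

Each worker is a separate process with its own interpreter, so the GIL does
not serialize the hashing.  On platforms that spawn rather than fork, the
call to checksum_all() needs to sit under an `if __name__ == "__main__":`
guard.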

Then the P2P server would talk to its peers, trading encrypted blobs by
their checksum to ensure the (user defined) redundancy was met, and to
keep its peers honest with challenges.  I wanted to use AMP so I could
say things like:
 1) peer A -> B: Do you have any of these blobs <list of checksums>
 2) peer B -> A: I have these <list>
 3) peer A -> B: please store these <list>
 4) peer A -> B: please register me for these blobs you already have
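
To make the four messages above concrete, here is a toy in-memory version
of the exchange.  It is only a sketch: the Peer class, its method names and
the replicate() helper are hypothetical, there is no networking or AMP
involved, and SHA256 stands in for whichever checksum is used.

```python
import hashlib


class Peer:
    """Toy in-memory peer; a real one would answer these over the network."""

    def __init__(self, name):
        self.name = name
        self.blobs = {}    # checksum -> encrypted blob bytes
        self.owners = {}   # checksum -> set of peer names registered for it

    # Messages 1 and 2: "do you have any of these?" / "I have these"
    def have(self, checksums):
        return [c for c in checksums if c in self.blobs]

    # Message 3: "please store these"
    def store(self, owner, blobs):
        for data in blobs:
            checksum = hashlib.sha256(data).hexdigest()
            self.blobs[checksum] = data
            self.owners.setdefault(checksum, set()).add(owner)

    # Message 4: "please register me for these blobs you already have"
    def register(self, owner, checksums):
        for c in checksums:
            if c in self.blobs:
                self.owners[c].add(owner)


def replicate(owner, local_blobs, remote):
    """Run messages 1-4 once against a single remote peer."""
    by_sum = {hashlib.sha256(b).hexdigest(): b for b in local_blobs}
    already = set(remote.have(list(by_sum)))                 # messages 1-2
    missing = [b for c, b in by_sum.items() if c not in already]
    remote.store(owner, missing)                             # message 3
    remote.register(owner, sorted(already))                  # message 4
    return len(missing), len(already)
```

Only the blobs the remote peer lacks are sent in full; everything else costs
one checksum in each direction, which is where a lightweight message format
pays off.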

AMP seemed ideal for this: lightweight, async, and fast.  I was hoping to
sustain a decent fraction of a GigE network connection even when
handling small encrypted blobs (because often the average file size in a
backup is small).
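
A back-of-the-envelope calculation shows why per-message overhead matters
here.  The blob sizes and the 50% target below are assumptions picked for
illustration, and the line rate ignores Ethernet/TCP framing overhead.

```python
GIGE_BYTES_PER_SEC = 1_000_000_000 / 8   # 1 Gbit/s expressed in bytes per second


def blobs_per_second(blob_size, fraction=0.5):
    """Messages per second needed to fill `fraction` of a GigE link."""
    return fraction * GIGE_BYTES_PER_SEC / blob_size


for size in (4096, 65536, 1048576):
    print("%8d-byte blobs: %8.0f per second" % (size, blobs_per_second(size)))
```

At half line rate that works out to roughly 15,000 messages per second for
4 KiB blobs but only about 60 per second for 1 MiB blobs, so with small
blobs the protocol overhead per message dominates.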

More info on AMP:

So I guess I don't see the hard part that Celery solves.  I was hoping
for more detail, maybe a presentation or paper.  I glanced at the docs
without really finding anything particularly unique.

In general I'd stick with something popular (like Twisted) or part of
the standard library (like multiprocessing) unless there was a
significant improvement.  I've seen many small projects die, leaving
anyone using them stranded.