March 25th, 2006

belize

$good->meets($evil)

Dear Graph Theorists and Wanna-Be Graph Theorists (like me),

Please help.

My fun Saturday morning project was writing a test harness so people like you could show me up, and show each other up, by writing an awesome graph analysis algorithm.

What does it have to do?

Separate out the "good nodes" from the "bad nodes".

Restrictions:
  • You implement one function, "calc", which takes a node, and returns a trust values 0.0-1.0 (inclusive), or undef if you don't have a solid answer.

  • Your calc function will run hundreds or thousands of times for each node, and in no particular order. The trust value of each node is constantly changing.

  • You can't keep state between invocations. If you try, it's cheating, and won't work in reality anyway when dozens/hundreds of processes are running independently.

  • All you can get from a provided node object is its current trust level, all nodes connected by inbound edges, and their trust levels. You can keep going deeper and deeper if you want, but:

  • I won't let you go more than two levels deep. You can't recurse the graph forever searching for stuff. That's why I also don't let you walk the graph forward.

The test harness starts out like this:
my $good = TrustGraph::Graph->load(file => "$Bin/data/good-network.dat.gz");

Wherein we load the first 100,000 LJ users' relationships amongst each other.

Then this:
my $evil = TrustGraph::Graph->generate(nodes => 10_000,
                           edges => 30_000);

That makes a 10,000 node graph of "evil" entities, with 30,000 edges amonst themselves.

Then we merge them:
my $all = $good->meets($evil);
But they're not linked yet. They just exist in the same graph, with two major islands.

We mark some as root nodes (in the good graph), to see it all. They have a fixed trust of 1.0:
# mark certain prominent members as trusted root nodes
foreach my $goodid (qw(2 10 14 15 1571)) {
    $good->node($goodid)->mark_as_root;
}
Now, we take a bunch of confused losers who accidentally befriend evil people:
for (1..$confused) {
    $good->random_node->links($evil->random_node);
}
Then we run your calc function on the collective graph:
$all->reset_trust;
$all->mass_compute(
                   iterations => 750_000,
                   func => $calc_func,
                   every => [10_000, $dump_stats],
                   );

As it goes it dumps stats, so you can abort it early if it's looking like it's sucking:

$ ./simulate --module=Brad --confused=2000

PROGRESS: 150000 / 750000 (20.00%)
$VAR1 = {
          'good' => {
                      'coverage' => '0.564841498559078',
                      'avg' => '0.813768195517867',
                    },
          'evil' => {
                      'avg' => '0.345016327849719',
                    }
        };
GOOD:
0.00 - 0.05: 
0.05 - 0.10: 
0.10 - 0.15: 
0.15 - 0.20: 
0.20 - 0.25: 
0.25 - 0.30: 
0.30 - 0.35: #
0.35 - 0.40: #
0.40 - 0.45: #
0.45 - 0.50: ##
0.50 - 0.55: 
0.55 - 0.60: ###
0.60 - 0.65: ###
0.65 - 0.70: 
0.70 - 0.75: #############
0.75 - 0.80: 
0.80 - 0.85: ##########
0.85 - 0.90: 
0.90 - 0.95: ########
0.95 - 1.00: ########################################

BAD:
0.00 - 0.05: ###################
0.05 - 0.10: ##################################
0.10 - 0.15: ###########################
0.15 - 0.20: ########################################
0.20 - 0.25: #################################
0.25 - 0.30: ########################
0.30 - 0.35: ###############################
0.35 - 0.40: #################################
0.40 - 0.45: #######################
0.45 - 0.50: ################
0.50 - 0.55: ############
0.55 - 0.60: ######################
0.60 - 0.65: #################
0.65 - 0.70: 
0.70 - 0.75: #####################
0.75 - 0.80: 
0.80 - 0.85: ###########
0.85 - 0.90: 
0.90 - 0.95: ####
0.95 - 1.00: ##

TODO:
  • Prettier output, use GD for graphs
  • more attack models (notable all evil nodes linking to good nodes) Update: done
  • better harness for running/evaluating lots of algorithms
  • .....
Want to write a calc function?

$ svn co http://code.sixapart.com/svn/TrustGraph/trunk TrustGraph

If you write a good one, tease others in the comments here with the output.