Diff: ServerLoadSpreading

Differences between version 5 and previous revision of ServerLoadSpreading.

Other diffs: Previous Major Revision, Previous Author

Newer page: version 5 Last edited on August 22, 2011 6:01 pm by PhilHollenback Revert
Older page: version 4 Last edited on May 2, 2011 11:17 pm by PhilHollenback Revert
@@ -50,10 +50,18 @@
  
 need to do some distribution tests against a corpus of common hostnames or something like that. 
  
 why you should use a hash instead of a random delay? 1. Repeatable delay for each host. 2. maybe better guarantee of spread? 
+  
+to test: on each remote host run this:  
+  
+perl -le "print int rand(86400)+1"  
  
 would be interesting to compare these delays to the spread generated by calling bash RANDOM. 
+  
+for bash , run this on remote host:  
+  
+echo "$(((($RANDOM * 3) % 86400) + 1))"  
  
 <verbatim> 
 #!/usr/bin/perl 
 # randomer.pl 
@@ -121,12 +129,12 @@
 plot "random-freq.txt" using 1:2 index 0 title "number of collisions" pt 6 lt 2 
 </verbatim> 
  
 output: 
-[http://www.hollenback.net/junk/spread.png] 
+[http://www.hollenback.net/junk/spread-hash .png] 
  
 collision counts:%%% 
  
 8805 hosts no collisions%%% 
 483 hosts one collision%%% 
 25 hosts two collisions%%% 
 no hosts more than two collisions 

version 5

large numbers of servers contacting a central host

some thoughts on using a perl hash function instead of a random delay

use B;

sub compute_hash
{
  my ($string,$range) = @_;
  my $hc = hex(B::hash $string);
  return $hc % $range + 1;
}

reference: http://www.perlmonks.org/index.pl?node_id=315876

how hashes really work: http://www.perl.com/pub/2002/10/01/hashes.html

calling the perl has function: http://www.perlmonks.org/?node_id=229982

more than you could ever want to know about hashing: http://burtleburtle.net/bob/hash/doobs.html

another hash function that probably isn't very good:

sub compute_hash
{
  my ($string,$low,$high) = @_;
  my $hc = 0;
  my $i = 0;
  my $range = ($high - $low) + 1;

  while ($i < length($string))
  {
    $hc = ((($hc) >> 27) | (($hc) << 5)) ^ ord(substr($string,$i++,1));
  }
  $hc = ((($hc) >> 1) | (($hc) << 31)) ^ ($hc);
  $hc = ((($hc) >> 3) | (($hc) << 29)) ^ ($hc);
  $hc = ((($hc) >> 5) | (($hc) << 27)) ^ ($hc);
  $hc = ((($hc) >> 7) | (($hc) << 25)) ^ ($hc);
  $hc = ((($hc) >> 9) | (($hc) << 23)) ^ ($hc);
  $hc = ((($hc) >> 11) | (($hc) << 21)) ^ ($hc);
  $hc = ((($hc) >> 13) | (($hc) << 19)) ^ ($hc);
  $hc = ((($hc) >> 15) | (($hc) << 17)) ^ ($hc);

  return $low + ($hc % $range);
}

need to do some distribution tests against a corpus of common hostnames or something like that.

why you should use a hash instead of a random delay? 1. Repeatable delay for each host. 2. maybe better guarantee of spread?

to test: on each remote host run this:

perl -le "print int rand(86400)+1"

would be interesting to compare these delays to the spread generated by calling bash RANDOM.

for bash , run this on remote host:

echo "$(((($RANDOM * 3) % 86400) + 1))"

#!/usr/bin/perl
# randomer.pl

# process input in form of
# <hostname> <delay>
#
#and transform into
#
# <delay> <count>
#
# suitable for frequency plotting.

use warnings;
use strict;

my %counter = ();

while(<>)
  {
    (my $delay) = (/^\S+\s+(\S+)/);
    $counter{$delay}++;
  }

foreach (sort {$a<=>$b} keys %counter)
  {
    print $_ . " " . $counter{$_} . "\n";
  }

results in output like this:

1 1
2 1
3 1
7 1
30 1
42 1
47 1
49 1
66 1
71 1
96 1
112 1
126 1
134 1
136 1
140 1

etc.

gnuplot control file:

set terminal png size 1200,300 font "/Library/Fonts/Andale Mono.ttf" 12
set output "spread.png"
set xrange [0:86400]
set yrange [0:4]
set xlabel "delay (s)"
set ylabel "# of hosts"
set title "collisions using perl \"int(rand())\" for delay value - 9313 hosts, 86400s spread"
set ytics 1
set xtics nomirror
set ytics nomirror
set key off
plot "random-freq.txt" using 1:2 index 0 title "number of collisions" pt 6 lt 2

output:

collision counts:

8805 hosts no collisions
483 hosts one collision
25 hosts two collisions
no hosts more than two collisions



Our Founder
ToolboxClick to hide/show