Sunday 6 July 2014

Ugly code and Palindromes

When you try to figure out patterns yourself, and fail to find a solution for all cases, it looks something like this:

(Python code that generates palindromes, as opposed to checking whether each natural number is a palindrome or not)

import timeit
NUM = 100000

def get_palindromes(n):
  for i in range(1, n+1):
    if str(i)[::-1] == str(i):
      yield i

def get_palindromes_2(n, s=1):
  while s <= n:
    yield s
    if (s / (10 ** ( len(str(s)) / 2 ))) % 10 == 0 and (len(str(s)) % 2 == 1):
      s += 10 ** ( len(str(s))/2 )
    elif s == 9 or s == 99 or s == 999 or s == 9999 or s == 99999:
      s += 2
    elif ((s % 10 == 9 and (s / 10) % 10 == 9)) and s / 10 ** (len(str(s))-1) != 9:
      s += 2
    elif s / 10 == 0:
      s += 1
    elif s / 100 == 0:
      s += 11
    elif (s / (10 ** ( len(str(s)) / 2 ))) % 10 == 9 and (s / 10) % 10 == 9:
      s += 11
    elif (s / (10 ** ( len(str(s)) / 2 ))) % 10 == 9:
      if len(str(s)) % 2 == 1:
        s += 11 * (10 ** (len(str(s))/2 - 1))
      else:
        s += 11 * (10 ** ( len(str(s)) / 2 - len(str(s))/2 ))
    elif len(str(s)) % 2 == 1:
      s += 10 ** ( len(str(s))/2 )
    else:
      s += 11 * 10 ** ( len(str(s)) - len(str(s))/2 - 1 )


if __name__ == '__main__':
  print "Checking for all palindromes less than %s:" % (NUM,)
  print timeit.timeit('list(get_palindromes(NUM))',
    setup='from __main__ import get_palindromes, NUM',
    number=10)

  print "Generating all palindromes less than %s" % (NUM,)
  print timeit.timeit('list(get_palindromes_2(NUM))',
    setup='from __main__ import get_palindromes_2, NUM',
    number=10)
  
  print "Both return same results:"
  print list(get_palindromes(NUM)) == list(get_palindromes_2(NUM))

The results suggest the generating function is more than 20 times faster.


Checking for all palindromes less than 100000:
0.631178268752
Generating all palindromes less than 100000
0.0270419578788
Both return same results:
True
[Finished in 0.8s]

There might be a better palindrome generating algorithm here, haven't tested it though. Then again, it uses strings which are then converted to integers.

Don't do this in production or anywhere someone other than you reads the code you write.

Thursday 3 July 2014

More primes

Let's see how quickly we can calculate primes less than a million in other languages. Sieve of Eratosthenes is used again without any non-obvious language-specific optimizations.

For comparison, Python with the default interpreter takes 375ms.

Java


import java.util.ArrayList;
import java.util.Arrays;
import java.util.Date;

public class Primes {
  public static ArrayList<Integer> gen_primes(int n) {
    boolean[] temp = new boolean[n];
    Arrays.fill(temp, true);
    temp[0] = temp[1] = false;

    ArrayList<Integer> primes = new ArrayList<Integer>();
    for(long i = 0 ; i < n ; i++) {
      if (temp[(int) i]) {
        primes.add((int) i);
        for(long j = i * i ; j < n ; j += i) {
          temp[(int) j] = false;
        }
      }
    }
    return primes;
  }

  public static void main(String[] args) {
    Date d = new Date(); long c = d.getTime();
    ArrayList<Integer> p = gen_primes(1000000);
    System.out.println((new Date()).getTime() - c);
  }
}

Possibly the fastest, takes around 20ms!


Node / JS


(function() {
  function gen_primes(n) {
    var temp = [];
    for( var k = 0; k < n; k++ ) {
      temp.push(true);
    }
    temp[0] = temp[1] = false;
    var primes = [];
    for( var i = 0; i < n; i++) {
      if (temp[i]) {
        primes.push(i);
        for( var j = i * i; j < n ; j += i ) {
          temp[j] = false;
        }
      }
    }
  }
  var now = Date.now();
  gen_primes(1000000);
  console.log(Date.now() - now);
})();

Node does it in around 90ms, while in the browser it takes anywhere between 100-200ms.


C#


using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Collections;

namespace primes_test
{
    class Program
    {
        public static ArrayList gen_primes(int n)
        {
            bool[] temp = new bool[n];
            for (int i = 0; i < n; i++)
            {
                temp[i] = true;
            }
            temp[0] = temp[1] = false;

            ArrayList primes = new ArrayList();
            for (long i = 0; i < n; i++)
            {
                if (temp[(int)i])
                {
                    primes.Add((int)i);
                    for (long j = i * i; j < n; j += i)
                    {
                        temp[(int)j] = false;
                    }
                }
            }
            return primes;
        }

        static void Main(string[] args)
        {
            DateTime begin = DateTime.UtcNow;
            ArrayList a = gen_primes(1000000);
            DateTime end = DateTime.UtcNow;
            Console.WriteLine((end - begin).TotalMilliseconds);
            Console.ReadKey();
        }
    }
}

Almost as fast as Java, 22ms.

Generating Primes

Using the Sieve of Eratosthenes, because I keep forgetting.

import timeit
NUM = 1000000

def generate_primes(n):
    nums = [i for i in xrange(n)]
    for i in xrange(2, n):
        if nums[i]:
            yield nums[i]
            for j in range(i*i, n, i):
                nums[j] = False

def gen_primes(n):
    primes = [True] * n
    primes[0] = primes[1] = False
    
    for (i, isprime) in enumerate(primes):
        if isprime:
            yield i
            for j in range(i*i, n, i):
                primes[j] = False

if __name__ == "__main__":
    # print timeit.timeit("list(gen_primes(NUM))", 
        # setup = "from __main__ import gen_primes, NUM", number=1)

    # print timeit.timeit("list(generate_primes(NUM))", 
    #   setup = "from __main__ import generate_primes, NUM", number=1)

gen_primes is from SO: 3939660

generate_primes is what I came up with. It is slower, and because Windows, I cannot find out why, yet. (Might be because xrange(n) is slower than enumerate(n) and / or [True] * n is faster than xrange(n))

The timeit print statements must be run one at a time, running both gives different results than if they are run individually.

Finding all primes less than a million takes ~0.4s on my machine (i5-661, 8 gigs of RAM)