Monday, November 9, 2009

Roman numerals

I'm reading Mark Pilgrim's new version of Dive Into Python, and he spends some time with Roman numerals as an example for regular expresssions. (In fact, I like it, and him, so much I bought a dead trees copy). But the example piqued my interest. It's the sort of problem Python does really well. And it has the advantage of having a forward and backward transformation (to standard representation), which allows one to test all valid cases. Not too often you see that in a programming problem. It is a good challenge for a beginning Python coder.


  928     CMXXVIII    928
1070 MLXX 1070
30 XXX 30
581 DLXXXI 581
1976 MCMLXXVI 1976
169 CLXIX 169
1940 MCMXL 1940
975 CMLXXV 975
204 CCIV 204
1044 MXLIV 1044


I know that code is boring, but I've decided to put it up (all 70 lines of it), in four parts. The strategy here is to separate the task in two sub-tasks: conversion between decimal and "verbose" Roman, and "compression."

The first part does what the comment says. No error checking:


def compress(s):
# convert 'X'*4 to 'XV', etc.
L = list('IVXLCDM')
c = s[0]
i = L.index(c)
n = len(s)
if n < 4: return s
if n == 4: return c + L[i+1]
if n == 5: return L[i+1]
if 5 < n < 9: return L[i+1] + c*(n-5)
if n == 9: return c + L[i+2]


It's a matter of personal taste, but I tend to use single letter variable names when I can. The clutter bothers me more than the chance for confusion. But I'm consistent (L is always a list, s a string, i an index). Actually, you don't need a list here, a string would be fine. But I wanted to use L since s was already claimed :)

The second part does some trivial error checking. Then it breaks down a decimal year into thousands, hundreds, etc, and adds the appropriate number of 'IXCM' to a list, after running them through compress. Rather than check if it's needed, everything goes through.


def decimalToRoman(n):
assert type(n) == type(1),'not an int'
assert 0 < n < 10000,'out of bounds %i' % n
divs = [1000, 100, 10, 1]
vals = list()
for i in range(4):
d = divs[i]
vals.append(n/d)
n = n%d
rL = list()
rNumerals = 'MCXI'
for i,v in enumerate(vals):
c = rNumerals[i]
s = c * v
if i and v: s = compress(s)
rL.append (s)
return ''.join(rL)


The third function does the reverse transformation. It has a nested function definition for the reverse of the "compress" algorithm. There is a second function necessitated by the fact that there may be more than one compression in a given Roman numeral. (A bug in the first version!). In "workToDo", we return -1 when there is no more work.


def romanToDecimal(s):
D = { 'I':1, 'V':5,
'X':10, 'L':50,
'C':100,'D':500, 'M':1000 }
def resolve(s,i):
c = s[i]
first = s[:i]
last = s[i+2:]
if s[i+1] in 'VLD': x = 4
else: x = 9
s2 = first + c*x + last
return s2
def workToDo(s):
for i in range(len(s)-1):
if D[s[i]] < D[s[i+1]]:
return i
return -1
i = workToDo(s)
while i != -1:
s = resolve(s,i)
i = workToDo(s)
n = 0
for c in s: n += D[c]
return n


Finally, the test harness. We print a few random numbers for fun, and then test all the years from 1 to 10,000, going from decimal to Roman and back again.


def test():
for i in range(10):
d = random.choice(range(1,2000))
print str(d).rjust(5),
r = decimalToRoman(d)
d = romanToDecimal(r)
print r.rjust(12),
print str(d).rjust(6)

for d in range(1,10000):
r = decimalToRoman(d)
s = romanToDecimal(r)
if not d == s:
print 'error'
print str(d).rjust(5),
print r.rjust(12),
print s.rjust(6)
test()