Thursday, March 31, 2011

more: experiment with handwritten digit recognition and SVM

If you haven't read the last post yet, read that one first.

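In my last post, I wrote about a little experiment on handwritten digit recognition using the WEKA Explorer GUI. Here is a very simple Scala script that does the same thing in code instead of through the GUI: it loads the training and test sets from ARFF files, trains WEKA's SMO (a multiclass SVM) on the training set, and evaluates it on the test set.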

import weka.core.Instances
import weka.core.converters.ArffLoader.ArffReader
import weka.classifiers.{Classifier, Evaluation}
import weka.classifiers.functions.SMO
import java.io.{FileReader, BufferedReader}

object OCR {
  def main(args: Array[String]): Unit = {
    val trainingSet = trainingInstances
    evaluate(trainingSet, testInstances, svmClassifier(trainingSet))
  }

  // Read an ARFF file and mark the last attribute as the class label.
  def getInstances(src: String): Instances = {
    val reader = new BufferedReader(new FileReader(src))
    try {
      val instances = new ArffReader(reader).getData()
      instances.setClassIndex(instances.numAttributes() - 1)
      instances // return the read instances
    } finally {
      reader.close() // release the file handle even if parsing fails
    }
  }

  def trainingInstances = getInstances("\\path\\to\\optdigits.tra.arff")
  def testInstances = getInstances("\\path\\to\\optdigits.tes.arff")

  // Train SMO with its default settings on the given instances.
  def svmClassifier(instances: Instances): Classifier = {
    val smo = new SMO()
    smo.buildClassifier(instances)
    smo // return the trained multiclass SVM classifier
  }

  // Evaluate the trained model on the test set and print the summary.
  def evaluate(trainingSet: Instances, testSet: Instances, model: Classifier): Unit = {
    val eval = new Evaluation(trainingSet)
    eval.evaluateModel(model, testSet)
    println(eval.toSummaryString("Results:\n", true))
  }
}

Running the script above, this is what I get:
Results:

Correctly Classified Instances        1734               96.4942 %
Incorrectly Classified Instances        63                3.5058 %
Kappa statistic                          0.961
K&B Relative Info Score              53500.7814 %
K&B Information Score                 1777.1913 bits      0.989  bits/instance
Class complexity | order 0            5969.5443 bits      3.322  bits/instance
Class complexity | scheme             4192.4976 bits      2.3331 bits/instance
Complexity improvement     (Sf)       1777.0468 bits      0.9889 bits/instance
Mean absolute error                      0.1603
Root mean squared error                  0.2721
Relative absolute error                 89.0407 %
Root relative squared error             90.6884 %
Total Number of Instances             1797
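
As a quick sanity check on the summary above, here is a small sketch that recomputes the two percentages from the raw counts and roughly reproduces the kappa statistic. The names ResultCheck, pct and kappa are made up for this illustration and are not part of WEKA. The chance agreement of 0.1 is my assumption: it holds only if the ten digit classes are roughly balanced, which the "order 0" class complexity of 3.322 bits/instance (about log2 of 10) suggests but does not prove. WEKA computes kappa from the full confusion matrix, so treat this as an approximation.

```scala
object ResultCheck {
  // Counts copied from the summary above.
  val correct = 1734
  val incorrect = 63
  val total = correct + incorrect // 1797 test instances

  // Share of the test set, as a percentage.
  def pct(n: Int): Double = 100.0 * n / total

  // Cohen's kappa from observed agreement and chance agreement.
  def kappa(observed: Double, chance: Double): Double =
    (observed - chance) / (1.0 - chance)

  def main(args: Array[String]): Unit = {
    println(f"Correctly classified:   ${pct(correct)}%.4f %%")
    println(f"Incorrectly classified: ${pct(incorrect)}%.4f %%")
    // Assumption: ten roughly balanced classes, so chance agreement is about 0.1.
    println(f"Kappa (approximate):    ${kappa(correct.toDouble / total, 0.1)}%.3f")
  }
}
```

With those assumptions, the recomputed values agree with the 96.4942 %, 3.5058 % and 0.961 reported by WEKA.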


References:
Use WEKA in your Java Code
