Paulson introduces PaleoSketch, a system for recognizing primitive shapes. A primitive shape is a simple shape drawn with a single stroke; many primitives cannot easily be broken down into simpler shapes. Paleo applies a large set of rules to classify each stroke as one of:
- Line
- Polyline
- Arc
- Circle
- Ellipse
- Helix
- Spiral
- Curve
Paleo recognizes more shapes than the previous work (Sezgin) and also achieves much higher accuracy.
PaleoSketch is an important tool, because we have it now and it does a pretty decent job of solving an important problem. That being said, I don't like it one bit.
Paleo (as described in the paper) has 26 parameters. 26! That's a lot. And they're all "empirically determined", which means hand-tuned. If you add a new primitive shape, do you have to re-tune all 26 parameters plus the new ones you're bound to introduce if you extend the approach? If you turn off some primitives, are you working with an unoptimized parameter set? If the parameters aren't learned, there's no way for the end user to retrain them.
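To make "learned" concrete, here is a minimal sketch of fitting a single cutoff from labeled strokes instead of tuning it by hand. Everything in it is my own assumption for illustration: the function name `learn_threshold`, the idea of thresholding a single line-fit error, and the toy numbers are all hypothetical and do not come from the paper.

```python
def learn_threshold(values, labels):
    """Pick the cutoff on one feature that misclassifies the fewest examples.

    A stroke is predicted to be a line when its value is <= the cutoff.
    """
    best_cutoff, best_errors = None, len(values) + 1
    # Only observed values need to be tried as candidate cutoffs.
    for cutoff in sorted(set(values)):
        errors = sum((v <= cutoff) != is_line for v, is_line in zip(values, labels))
        if errors < best_errors:
            best_cutoff, best_errors = cutoff, errors
    return best_cutoff, best_errors


# Made-up line-fit errors for six strokes, with hand labels (True = line).
line_fit_errors = [0.5, 0.8, 1.1, 3.0, 4.2, 5.5]
is_line = [True, True, True, False, False, False]

cutoff, mistakes = learn_threshold(line_fit_errors, is_line)
```

With something like this, retraining for a new user or a new set of primitives is just a matter of rerunning the function on new labeled strokes. Doing the same for 26 interacting parameters is obviously harder, but that is the problem a learned approach would hand to an optimizer instead of a person.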
Next we come to the hierarchy. I have so many questions here. Is it overfit to the particular training case (single primitive shapes drawn in isolation)? I wrote a new primitive fit, okay, now where do I put it in the hierarchy? Or maybe I turned off some primitive fits, is there a better hierarchy for my new set? Again, how can I customize this to my particular needs without some painstaking, manual process?
Now the reason for the hierarchy: a line fit is not comparable to an arc fit is not comparable to a helix fit. If you can't compare fits to each other, can you possibly give me a reasonable estimate of a confidence for the fit? Consider the curvy polyline case: maybe I want to know it's a curve that's also pretty darn close to a polyline. If all Paleo can tell me is that it's more curve than polyline, how do I know if polyline is even a reasonable runner-up? Until you can boast 100% accuracy, always, I'd really like to have a reasonable estimate of your confidence (and if you had that for all your fits, it would seem they'd be comparable to each other).
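Here is roughly what I have in mind, as a sketch and not anything Paleo does. It assumes each fit reports a raw error and that a per-shape scale (say, a typical error for that shape, learned from training data) is available to put the errors on a common footing. The function name `confidences`, the scales, and the numbers are all hypothetical.

```python
import math


def confidences(errors, scales):
    """Turn per-shape fit errors into confidences that sum to 1 (a softmax)."""
    # Dividing by a per-shape scale is the assumed step that makes fits comparable.
    scores = {shape: -errors[shape] / scales[shape] for shape in errors}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {shape: math.exp(score - top) for shape, score in scores.items()}
    total = sum(exps.values())
    return {shape: value / total for shape, value in exps.items()}


# Made-up errors for one curvy-polyline stroke.
errors = {"curve": 0.8, "polyline": 1.0, "line": 9.0}
scales = {"curve": 1.0, "polyline": 1.0, "line": 1.0}

conf = confidences(errors, scales)
ranked = sorted(conf.items(), key=lambda item: item[1], reverse=True)
```

In this toy case the ranking puts curve first with polyline close behind and line far back, which is exactly the "reasonable runner-up" information a high-level recognizer could use.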
Sure, it's obvious that Paleo runs in real time. But how much time does it leave for a high-level recognizer to also run and still call it "real time"? Calculating individual fits certainly seems like an extensible approach, but putting all of the features through a classifier (linear, quadratic, SVM, neural net, decision tree, whatever) would probably be faster. I think the extensibility argument is moot given the implications of parameter tuning and fitting new fits into the hierarchy. All that's left is to wonder which does a better job, accuracy-wise.
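For the sake of argument, the "features through a classifier" alternative could be as small as the nearest-centroid sketch below. I picked nearest-centroid only because it fits in a few lines; it stands in for whichever classifier you prefer. The two features (straightness and total rotation in turns) and the training values are invented for illustration and are not Paleo's feature set.

```python
import math
from collections import defaultdict


def train_centroids(samples):
    """samples: list of (feature_vector, label). Returns label -> mean vector."""
    sums = {}
    counts = defaultdict(int)
    for features, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {label: [v / counts[label] for v in total] for label, total in sums.items()}


def classify(centroids, features):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], features))


# Hypothetical features per stroke: [straightness, total rotation in turns].
training = [
    ([0.99, 0.01], "line"), ([0.97, 0.03], "line"),
    ([0.70, 0.40], "arc"), ([0.65, 0.50], "arc"),
    ([0.02, 1.00], "circle"), ([0.05, 0.95], "circle"),
]

centroids = train_centroids(training)
label = classify(centroids, [0.96, 0.05])
```

Adding a primitive here means adding labeled examples and retraining, with no hierarchy to rearrange. Whether this kind of approach matches Paleo's accuracy is the open question.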