Calculate the secondary values of the feature.
If there is only one reference feature for a stroke, the secondary values are simply
calculated and stored as they are.
But if there are several reference features for a single character, the secondary values
are set to the average.
Reduce the number of captured points.
Even after some of the captured points have been removed, a picture showing the
captured feature still looks pretty good. However, a feature can only be recalculated with
other parameters if all the originally captured points are available.
Set the secondary values (the values that can be derived from others) to the average
of the values of the features in the collection.
If there is only one reference feature for a stroke, the secondary values are simply
derived.
But if there are several reference features for a single character, the secondary values
are set to the average using this method.
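As a rough illustration of the averaging step, the sketch below averages a set of derived values across several reference features. The dictionary representation and the value names (`length`, `angle_sum`) are hypothetical and chosen only for this example; the feature class itself stores its own set of secondary values.

```python
from statistics import mean


def average_secondary_values(features):
    """Return the per-key average of the secondary values of several features.

    `features` is a non-empty list of dicts that map a (hypothetical)
    secondary value name to a number. All dicts are assumed to share the
    same keys.
    """
    if not features:
        raise ValueError("at least one reference feature is required")
    keys = features[0].keys()
    return {key: mean(feature[key] for feature in features) for key in keys}


# Two hypothetical reference features for the same character.
references = [
    {"length": 10, "angle_sum": 90},
    {"length": 14, "angle_sum": 110},
]
averaged = average_secondary_values(references)
```

With a single reference feature the function simply returns that feature's values, which matches the one-reference case described above.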
Set all the values except the direction vectors to the average values of the features
in the collection.
This is used if there is more than one reference feature for a character.
Update the maximum stroke-distance. This maximum stroke-distance is used to normalize
the absolute stroke-distance between two features.
The value has to be divided by strokeSize because there are only insert/remove operations
and no substitutions. According to the stroke-distance algorithm, successive insert/remove
costs are multiplied by the number of successive insert/remove operations.
NOTE: This method should perhaps be changed when the method
sameClassAbsoluteStrokeDistance:forReference: is changed.
This is basically the implementation in Object.
For backwards compatibility, IntegerArrays and PointArrays are stored as Arrays and the
instance variable sizeOrSquaredLengths (the last variable!) is not stored.
This method calculates several key values describing the acute angles of the
feature. Specifically, it calculates the sum of the acute angles (weighted by heuristics)
and the average position of the acute angles within the stroke.
In this context, acute angles are angles that are either very large or that have no
angles with the same sign in their neighbourhood.
Calculate the sum of the negative angles. The negative angles are only fully counted
up to a value of 135 degrees. This is important because angles around 180 degrees can
be positive or negative without significantly changing the stroke.
Calculate the sum of the positive angles. The positive angles are only fully counted
up to a value of 135 degrees. This is important because angles around 180 degrees can
be positive or negative without significantly changing the stroke.
Calculate the sum of the positive angles in angleCollection.
The angles are only fully counted if they are smaller than maxAngleNumber.
Angles at the start or the end of the stroke are only fully counted if the start/end
line is long enough. (There is often noise at the start or at the end.)
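The capped angle sums described above could be sketched as follows. The source only states that angles beyond the limit are not fully counted; how the excess is discounted is not specified, so clamping each angle to the limit is an assumption made for this example. The function name and the use of degrees are likewise illustrative, and the start/end weighting is left out.

```python
def capped_positive_angle_sum(angles, max_angle=135):
    """Sum the positive angles (in degrees), counting each one only up to max_angle.

    ASSUMPTION: an angle above max_angle contributes exactly max_angle.
    The source only says that such angles are not fully counted.
    """
    return sum(min(angle, max_angle) for angle in angles if angle > 0)


# A turn of 170 degrees is counted as 135; the negative angle is ignored.
total = capped_positive_angle_sum([30, -40, 170])
```

The sum of the negative angles could be handled the same way by negating the input first.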
Calculate the squared lengths and weight them by a heuristic function.
This function makes long lengths even longer. This is useful because eliminating
long distances in an edit distance algorithm must be very expensive!
This is a heuristic method.
When a vector of one feature is matched with a vector of another feature, the
global angle difference is an important criterion.
In the current implementation, the matching costs are first calculated independently of the
global angle difference. They are then multiplied by this factor and divided by 16.
(The factor is stored as 16 times its real value to avoid floating point ops.)
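The multiply-then-divide-by-16 step is ordinary fixed-point integer arithmetic. A minimal sketch, assuming integer costs and a factor that has already been scaled by 16 (the function name is hypothetical):

```python
SCALE = 16  # the factor is stored as 16 times its real value


def apply_angle_factor(cost, scaled_factor):
    """Multiply an integer cost by a factor stored as an integer scaled by 16.

    Only integer operations are used; the result is rounded down.
    """
    return (cost * scaled_factor) // SCALE


unchanged = apply_angle_factor(100, 16)  # real factor 1.0
increased = apply_angle_factor(100, 24)  # real factor 1.5
```

Multiplying before dividing keeps as much precision as integer arithmetic allows; dividing first would discard the fractional part of the factor.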
At the start/end of a stroke there is often a very short vector in a completely wrong
direction before the real stroke starts/ends. Therefore, there are often quite big angles
that are just noise. This method reduces this noise by weighting typical 'start/end noise
angles' with a small value.
Calculate a distance measuring the costs of matching all the direction vectors of the two
features. The algorithm used is the one known for computing the edit distance
between two strings or DNA sequences. Basically it iterates through all the direction
vectors of the two features and tries to make them identical. To achieve this goal, the
algorithm can insert, remove or substitute direction vectors. Each of these 3 operations
has costs assigned to it and the algorithm finds the sequence of operations that generates
the minimum costs. The algorithm has a running time proportional to n * m (where n and m
are the numbers of vectors in the two features).
The main part of this implementation is not the basic algorithm, but the heuristics used to
calculate the costs for the operations. Besides the direction vectors, these heuristics also
consider the points where the vectors start / end, the global angles of the vectors, the
number of successive insert/remove operations without an intermediate substitution, etc.
If aBoolean is true, the algorithm doesn't add special additional costs for successive insert/
remove ops. Furthermore, no substitution is possible. This mode is used to calculate the
distance to a reference feature that consists of only one point. Thus, this distance is an
approximation of the costs of building the feature from scratch. When the
stroke-distance costs have to be normalized, these special reference costs are considered
to be the maximum distance.
NOTE: If this method is changed, the method updateStrokeDistance should perhaps also be
changed.
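For orientation, the basic edit distance scheme that the comment refers to can be sketched as below. This is the textbook dynamic programming formulation with pluggable cost functions, not the implementation described here: the heuristics for start/end points, global angles and successive insert/remove operations are omitted, and the cost functions (`squared_length`, `squared_difference`) are placeholders chosen only for the example.

```python
def edit_distance(a, b, insert_cost, remove_cost, substitute_cost=None):
    """Minimum total cost of turning sequence a into sequence b.

    insert_cost(y) and remove_cost(x) price single insertions and removals.
    substitute_cost(x, y) prices replacing x with y; passing None disables
    substitution, so only insert/remove operations are available.
    """
    n, m = len(a), len(b)
    # dist[i][j] is the cheapest way to turn a[:i] into b[:j].
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = dist[i - 1][0] + remove_cost(a[i - 1])
    for j in range(1, m + 1):
        dist[0][j] = dist[0][j - 1] + insert_cost(b[j - 1])
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best = min(
                dist[i - 1][j] + remove_cost(a[i - 1]),
                dist[i][j - 1] + insert_cost(b[j - 1]),
            )
            if substitute_cost is not None:
                best = min(
                    best,
                    dist[i - 1][j - 1] + substitute_cost(a[i - 1], b[j - 1]),
                )
            dist[i][j] = best
    return dist[n][m]


def squared_length(v):
    """Placeholder insert/remove cost: squared length of a direction vector."""
    return v[0] * v[0] + v[1] * v[1]


def squared_difference(v, w):
    """Placeholder substitution cost: squared distance between two vectors."""
    return (v[0] - w[0]) ** 2 + (v[1] - w[1]) ** 2


stroke = [(1, 0), (0, 1)]
reference = [(1, 0)]
with_substitution = edit_distance(
    stroke, reference, squared_length, squared_length, squared_difference
)
# Distance to an empty reference: every vector has to be removed.
from_scratch = edit_distance(stroke, [], squared_length, squared_length)
```

The two nested loops fill an (n + 1) by (m + 1) table, which is where the n * m running time comes from.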
Calculate a distance measuring the costs of matching all the direction vectors of the two
features. The algorithm used is the one known for computing the edit distance
between two strings or DNA sequences. Basically it iterates through all the direction
vectors of the two features and tries to make them identical. To achieve this goal, the
algorithm can insert, remove or substitute direction vectors. Each of these 3 operations
has costs assigned to it and the algorithm finds the sequence of operations that generates
the minimum costs. The algorithm has a running time proportional to n * m (where n and m
are the numbers of vectors in the two features).
The main part of this implementation is not the basic algorithm, but the heuristics used to
calculate the costs for the operations. Besides the direction vectors, these heuristics also
consider the points where the vectors start / end, the global angles of the vectors, the
number of successive insert/remove operations without an intermediate substitution, etc.
If aBoolean is true, the algorithm doesn't add special additional costs for successive insert/
remove ops. Furthermore, no substitution is possible. This mode is used to calculate the
distance to a reference feature that consists of only one point. Thus, this distance is an
approximation of the costs of building the feature from scratch. When the
stroke-distance costs have to be normalized, these special reference costs are considered
to be the maximum distance.
NOTES:
If this method is changed, the method updateStrokeDistance should perhaps also be changed.
The GeniePlugin has to mirror the algorithm here!
Calculate an asymmetric distance measuring the costs of matching all the direction
vectors of the two features.
Basically this matching operation is completely symmetric. But the result of the operation
is just a number that must be normalized somehow. (At the end the distance has to be in the
interval [0, CRFeature maxNormDistance]).
The problem is that the not yet normalized distance value can vary a lot depending on the
number of direction vectors in the two features. That's the reason why the normalization
process can't be done independent of the two currently compared features!
Usually one feature is compared to all the features in a dictionary. To be fair, the
normalization process has to be the same for one dictionary lookup and thus this
process can only use properties of the feature that is looked up. As a consequence,
this distance is not symmetric ((A dist: B) ~= (B dist: A)).
Whenever a new feature is looked up in a dictionary, the asymmetric stroke distance
is used.
For all the distances between features inside the dictionary, the symmetric stroke distance is
used. (It would be really confusing if the distance from 'a' to 'b' were different from the
one from 'b' to 'a'!)
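The asymmetry can be seen in a small sketch. It assumes that the raw matching cost is scaled by the maximum stroke-distance of the feature being looked up and then clamped to the target interval; the constant value 1000 and the function name are hypothetical stand-ins, and the exact normalization formula is not given in the comment.

```python
MAX_NORM_DISTANCE = 1000  # hypothetical stand-in for CRFeature maxNormDistance


def normalize_for_lookup(raw_distance, lookup_max_distance,
                         max_norm=MAX_NORM_DISTANCE):
    """Scale a raw matching cost into [0, max_norm].

    ASSUMPTION: only the maximum stroke-distance of the looked-up feature
    is used for scaling, so the result depends on which feature is looked up.
    """
    if lookup_max_distance <= 0:
        raise ValueError("lookup_max_distance must be positive")
    scaled = raw_distance * max_norm // lookup_max_distance
    return max(0, min(max_norm, scaled))


raw = 50  # the raw matching cost is the same in both directions
a_to_b = normalize_for_lookup(raw, lookup_max_distance=200)  # A is looked up
b_to_a = normalize_for_lookup(raw, lookup_max_distance=400)  # B is looked up
```

Although the raw cost is identical, the two normalized values differ because each is scaled by a property of a different feature.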
Calculate a symmetric distance measuring the costs of matching all the direction
vectors of the two features.
Basically this matching operation is completely symmetric. But the result of the operation
is just a number that must be normalized somehow. (At the end the distance has to be in the
interval [0, CRFeature maxNormDistance]).
The problem is that the not yet normalized distance value can vary a lot depending on the
number of direction vectors in the two features. That's the reason why the normalization
process can't be done independent of the two currently compared features!
Usually one feature is compared to all the features in a dictionary. To be fair, the
normalization process has to be the same for one dictionary lookup and thus this
process can only use properties of the feature that is looked up. As a consequence,
the distance used for a dictionary lookup is not symmetric ((A dist: B) ~= (B dist: A)).
Whenever a new feature is looked up in a dictionary, the asymmetric stroke distance
is used.
For all the distances between features inside the dictionary, the symmetric stroke distance
calculated here is used. (It would be really confusing if the distance from 'a' to 'b' were
different from the one from 'b' to 'a'!)
Arguments:
primBlock: Block that takes two CRStrokeFeatures as arguments and compares them using the
new primitive method.
origBlock: Block that takes two CRStrokeFeatures as arguments and compares them using the
original method.
aCRDictionary: CRDictionary containing CRFeatures that should be used in the comparing
methods.
aNumber: The algorithm raises an exception if the relative difference between the
original and the primitive method is bigger than this number.
Transcript output has the form:
(index of feature1) - (index of feature2): (primitive), (original), (abs diff), (rel diff)
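A comparison harness along these lines might look like the sketch below. The function name, the 1-based indices, the exception type and the definition of the relative difference (absolute difference divided by the original result) are assumptions made for this example; plain callables and a list stand in for the two blocks and the dictionary.

```python
def compare_implementations(prim, orig, features, max_rel_diff, log=print):
    """Compare two distance functions on every ordered pair of features.

    Each comparison is logged in the form
    "index1 - index2: primitive, original, abs diff, rel diff".
    Raises ValueError as soon as the relative difference exceeds max_rel_diff.
    """
    for i, first in enumerate(features, start=1):
        for j, second in enumerate(features, start=1):
            p = prim(first, second)
            o = orig(first, second)
            abs_diff = abs(p - o)
            if o != 0:
                rel_diff = abs_diff / abs(o)
            else:
                # ASSUMPTION: a mismatch against a zero result is treated
                # as an infinite relative difference.
                rel_diff = 0.0 if abs_diff == 0 else float("inf")
            log(f"{i} - {j}: {p}, {o}, {abs_diff}, {rel_diff}")
            if rel_diff > max_rel_diff:
                raise ValueError(
                    f"results differ too much for features {i} and {j}"
                )


# Toy usage: numbers stand in for features and both functions agree.
lines = []
compare_implementations(
    lambda a, b: abs(a - b),
    lambda a, b: abs(a - b),
    [1, 3],
    max_rel_diff=0.01,
    log=lines.append,
)
```

Passing `log=lines.append` collects the output lines instead of printing them, which makes the harness easy to check in a test.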