libabigail
|
Libabigail's core diffing algorithms. More...
Classes | |
class | d_path_vec |
The array containing the furthest D-path end-points, for each value of K. MAX_D is the maximum value of the D-Path. That is, M+N if M is the size of the first input string, and N is the size of the second. More... | |
struct | deep_ptr_eq_functor |
An equality functor to deeply compare pointers. More... | |
struct | default_eq_functor |
The default equality functor used by the core diffing algorithms. More... | |
class | deletion |
The abstraction of the deletion of one element of a sequence A. More... | |
class | edit_script |
The abstraction of an edit script for transforming a sequence A into a sequence B. More... | |
class | insertion |
The abstration of an insertion of elements of a sequence B into a sequence A. This is used to represent the edit script for transforming a sequence A into a sequence B. More... | |
class | point |
A class representing a vertex in an edit graph, as explained in the paper. A vertex is a basically a pair of coordinates (abscissa and ordinate). More... | |
class | snake |
The abstraction of the Snake concept, from the paper. More... | |
Functions | |
template<typename RandomAccessOutputIterator , typename EqualityFunctor > | |
void | compute_diff (RandomAccessOutputIterator a_base, RandomAccessOutputIterator a_begin, RandomAccessOutputIterator a_end, RandomAccessOutputIterator b_base, RandomAccessOutputIterator b_begin, RandomAccessOutputIterator b_end, edit_script &ses) |
Compute the longest common subsequence of two (sub-regions of) sequences as well as the shortest edit script from transforming the first (sub-region of) sequence into the second (sub-region of) sequence. | |
template<typename RandomAccessOutputIterator , typename EqualityFunctor > | |
void | compute_diff (RandomAccessOutputIterator a_base, RandomAccessOutputIterator a_begin, RandomAccessOutputIterator a_end, RandomAccessOutputIterator b_base, RandomAccessOutputIterator b_begin, RandomAccessOutputIterator b_end, vector< point > &lcs, edit_script &ses) |
Compute the longest common subsequence of two (sub-regions of) sequences as well as the shortest edit script from transforming the first (sub-region of) sequence into the second (sub-region of) sequence. | |
template<typename RandomAccessOutputIterator , typename EqualityFunctor > | |
void | compute_diff (RandomAccessOutputIterator a_base, RandomAccessOutputIterator a_begin, RandomAccessOutputIterator a_end, RandomAccessOutputIterator b_base, RandomAccessOutputIterator b_begin, RandomAccessOutputIterator b_end, vector< point > &lcs, edit_script &ses, int &ses_len) |
Compute the longest common subsequence of two (sub-regions of) sequences as well as the shortest edit script from transforming the first (sub-region of) sequence into the second (sub-region of) sequence. | |
template<typename RandomAccessOutputIterator , typename EqualityFunctor > | |
void | compute_diff (RandomAccessOutputIterator a_begin, RandomAccessOutputIterator a_end, RandomAccessOutputIterator b_begin, RandomAccessOutputIterator b_end, edit_script &ses) |
Compute the longest common subsequence of two (sub-regions of) sequences as well as the shortest edit script from transforming the first (sub-region of) sequence into the second (sub-region of) sequence. | |
template<typename RandomAccessOutputIterator > | |
void | compute_diff (RandomAccessOutputIterator a_begin, RandomAccessOutputIterator a_end, RandomAccessOutputIterator b_begin, RandomAccessOutputIterator b_end, edit_script &ses) |
Compute the longest common subsequence of two (sub-regions of) sequences as well as the shortest edit script from transforming the first (sub-region of) sequence into the second (sub-region of) sequence. | |
template<typename RandomAccessOutputIterator , typename EqualityFunctor > | |
void | compute_diff (RandomAccessOutputIterator a_begin, RandomAccessOutputIterator a_end, RandomAccessOutputIterator b_begin, RandomAccessOutputIterator b_end, vector< point > &lcs, edit_script &ses) |
Compute the longest common subsequence of two (sub-regions of) sequences as well as the shortest edit script from transforming the first (sub-region of) sequence into the second (sub-region of) sequence. | |
template<typename RandomAccessOutputIterator > | |
void | compute_diff (RandomAccessOutputIterator a_begin, RandomAccessOutputIterator a_end, RandomAccessOutputIterator b_begin, RandomAccessOutputIterator b_end, vector< point > &lcs, edit_script &ses) |
Compute the longest common subsequence of two (sub-regions of) sequences as well as the shortest edit script from transforming the first (sub-region of) sequence into the second (sub-region of) sequence. | |
template<typename RandomAccessOutputIterator , typename EqualityFunctor > | |
void | compute_diff (RandomAccessOutputIterator a_begin, RandomAccessOutputIterator a_end, RandomAccessOutputIterator b_begin, RandomAccessOutputIterator b_end, vector< point > &lcs, edit_script &ses, int &ses_len) |
Compute the longest common subsequence of two (sub-regions of) sequences as well as the shortest edit script from transforming the first (sub-region of) sequence into the second (sub-region of) sequence. | |
void | compute_lcs (const char *str1, const char *str2, int &ses_len, string &lcs) |
Compute the longest common subsequence of two strings and return the length of the shortest edit script for transforming the first string into the second. | |
bool | compute_middle_snake (const char *str1, const char *str2, snake &s, int &ses_len) |
Returns the middle snake of two strings, as well as the length of their shortest editing script. | |
template<typename RandomAccessOutputIterator , typename EqualityFunctor > | |
bool | compute_middle_snake (RandomAccessOutputIterator a_begin, RandomAccessOutputIterator a_end, RandomAccessOutputIterator b_begin, RandomAccessOutputIterator b_end, snake &snak, int &ses_len) |
Returns the middle snake of two sequences A and B, as well as the length of their shortest editing script. | |
void | compute_ses (const char *str1, const char *str2, edit_script &ses) |
Compute the shortest edit script for transforming a string into another. | |
template<typename RandomAccessOutputIterator > | |
void | display_edit_script (const edit_script &es, const RandomAccessOutputIterator str1_base, const RandomAccessOutputIterator str2_base, ostream &out) |
Display an edit script on standard output. | |
template<typename RandomAccessOutputIterator , typename EqualityFunctor > | |
bool | end_of_fr_d_path_in_k (int k, int d, RandomAccessOutputIterator a_begin, RandomAccessOutputIterator a_end, RandomAccessOutputIterator b_start, RandomAccessOutputIterator b_end, d_path_vec &v, snake &snak) |
Find the end of the furthest reaching d-path on diagonal k, for two sequences. In the paper This is referred to as "the basic
algorithm". | |
template<typename RandomAccessOutputIterator , typename EqualityFunctor > | |
bool | end_of_frr_d_path_in_k_plus_delta (int k, int d, RandomAccessOutputIterator a_begin, RandomAccessOutputIterator a_end, RandomAccessOutputIterator b_begin, RandomAccessOutputIterator b_end, d_path_vec &v, snake &snak) |
Find the end of the furthest reaching reverse d-path on diagonal k + delta. Delta is abs(M - N), with M being the size of a and N being the size of b. This is the "basic algorithm", run backward. That is, starting from the point (M,N) of the edit graph. | |
bool | ends_of_furthest_d_paths_overlap (const point &forward_d_path_end, const point &reverse_d_path_end) |
template<typename RandomAccessOutputIterator > | |
bool | is_match_point (RandomAccessOutputIterator a_begin, RandomAccessOutputIterator a_end, RandomAccessOutputIterator b_begin, RandomAccessOutputIterator b_end, const point &point) |
Tests if a given point is a match point in an edit graph. | |
bool | point_is_valid_in_graph (point &p, unsigned a_size, unsigned b_size) |
template<typename RandomAccessOutputIterator > | |
void | print_snake (RandomAccessOutputIterator a_begin, RandomAccessOutputIterator b_begin, const snake &s, ostream &out) |
This prints the middle snake of two strings. | |
int | ses_len (const char *str1, const char *str2, bool reverse) |
Compute the length of the shortest edit script for two strings. This is done using the "Greedy LCS/SES" of figure 2 in the paper. It can walk the edit graph either foward (when reverse is false) or backward starting from the end (when reverse is true). | |
template<typename RandomAccessOutputIterator , typename EqualityFunctor > | |
int | ses_len (RandomAccessOutputIterator a_begin, RandomAccessOutputIterator a_end, RandomAccessOutputIterator b_begin, RandomAccessOutputIterator b_end, d_path_vec &v, bool reverse) |
Compute the length of the shortest edit script for two sequences a and b. This is done using the "Greedy LCS/SES" of figure 2 in the paper. It can walk the edit graph either foward (when reverse is false) or backward starting from the end (when reverse is true). | |
template<typename RandomAccessOutputIterator > | |
int | ses_len (RandomAccessOutputIterator a_begin, RandomAccessOutputIterator a_end, RandomAccessOutputIterator b_begin, RandomAccessOutputIterator b_end, d_path_vec &v, bool reverse) |
Compute the length of the shortest edit script for two sequences a and b. This is done using the "Greedy LCS/SES" of figure 2 in the paper. It can walk the edit graph either foward (when reverse is false) or backward starting from the end (when reverse is true). | |
bool | snake_end_points (const snake &s, point &x, point &u) |
Get the end points of the snake, as intended by the paper. | |
Libabigail's core diffing algorithms.
This is the namespace defining the core diffing algorithm used by libabigail comparison tools. This based on the diff algorithm from Eugene Myers.
void compute_diff | ( | RandomAccessOutputIterator | a_base, |
RandomAccessOutputIterator | a_begin, | ||
RandomAccessOutputIterator | a_end, | ||
RandomAccessOutputIterator | b_base, | ||
RandomAccessOutputIterator | b_begin, | ||
RandomAccessOutputIterator | b_end, | ||
edit_script & | ses ) |
Compute the longest common subsequence of two (sub-regions of) sequences as well as the shortest edit script from transforming the first (sub-region of) sequence into the second (sub-region of) sequence.
A sequence is determined by a base, a beginning offset and an end offset. The base always points to the container that contains the sequence to consider. The beginning offset is an iterator that points the beginning of the sub-region of the sequence that we actually want to consider. The end offset is an iterator that points to the end of the sub-region of the sequence that we actually want to consider.
This uses the LCS algorithm of the paper at section 4b.
@tparm RandomAccessOutputIterator the type of iterators passed to this function. It must be a random access output iterator kind.
@tparm EqualityFunctor this must be a class that declares a public call operator member returning a boolean and taking two arguments that must be of the same type as the one pointed to by the RandomAccessOutputIterator template parameter. This functor is used to compare the elements referred to by the iterators passed in argument to this function.
a_base | the iterator to the base of the first sequence. |
a_start | an iterator to the beginning of the sub-region of the first sequence to actually consider. |
a_end | an iterator to the end of the sub-region of the first sequence to consider. |
b_base | an iterator to the base of the second sequence to consider. |
b_start | an iterator to the beginning of the sub-region of the second sequence to actually consider. |
b_end | an iterator to the end of the sub-region of the second sequence to actually consider. |
ses | the resulting shortest editing script. |
Definition at line 1908 of file abg-diff-utils.h.
void compute_diff | ( | RandomAccessOutputIterator | a_base, |
RandomAccessOutputIterator | a_begin, | ||
RandomAccessOutputIterator | a_end, | ||
RandomAccessOutputIterator | b_base, | ||
RandomAccessOutputIterator | b_begin, | ||
RandomAccessOutputIterator | b_end, | ||
vector< point > & | lcs, | ||
edit_script & | ses ) |
Compute the longest common subsequence of two (sub-regions of) sequences as well as the shortest edit script from transforming the first (sub-region of) sequence into the second (sub-region of) sequence.
A sequence is determined by a base, a beginning offset and an end offset. The base always points to the container that contains the sequence to consider. The beginning offset is an iterator that points the beginning of the sub-region of the sequence that we actually want to consider. The end offset is an iterator that points to the end of the sub-region of the sequence that we actually want to consider.
This uses the LCS algorithm of the paper at section 4b.
@tparm RandomAccessOutputIterator the type of iterators passed to this function. It must be a random access output iterator kind.
@tparm EqualityFunctor this must be a class that declares a public call operator member returning a boolean and taking two arguments that must be of the same type as the one pointed to by the RandomAccessOutputIterator template parameter. This functor is used to compare the elements referred to by the iterators passed in argument to this function.
a_base | the iterator to the base of the first sequence. |
a_start | an iterator to the beginning of the sub-region of the first sequence to actually consider. |
a_end | an iterator to the end of the sub-region of the first sequence to consider. |
b_base | an iterator to the base of the second sequence to consider. |
b_start | an iterator to the beginning of the sub-region of the second sequence to actually consider. |
b_end | an iterator to the end of the sub-region of the second sequence to actually consider. |
lcs | the resulting lcs. This is set iff the function returns true. |
ses | the resulting shortest editing script. |
Definition at line 1751 of file abg-diff-utils.h.
void compute_diff | ( | RandomAccessOutputIterator | a_base, |
RandomAccessOutputIterator | a_begin, | ||
RandomAccessOutputIterator | a_end, | ||
RandomAccessOutputIterator | b_base, | ||
RandomAccessOutputIterator | b_begin, | ||
RandomAccessOutputIterator | b_end, | ||
vector< point > & | lcs, | ||
edit_script & | ses, | ||
int & | ses_len ) |
Compute the longest common subsequence of two (sub-regions of) sequences as well as the shortest edit script from transforming the first (sub-region of) sequence into the second (sub-region of) sequence.
A sequence is determined by a base, a beginning offset and an end offset. The base always points to the container that contains the sequence to consider. The beginning offset is an iterator that points the beginning of the sub-region of the sequence that we actually want to consider. The end offset is an iterator that points to the end of the sub-region of the sequence that we actually want to consider.
This uses the LCS algorithm of the paper at section 4b.
@tparm RandomAccessOutputIterator the type of iterators passed to this function. It must be a random access output iterator kind.
@tparm EqualityFunctor this must be a class that declares a public call operator member returning a boolean and taking two arguments that must be of the same type as the one pointed to by the RandomAccessOutputIterator template parameter. This functor is used to compare the elements referred to by the iterators passed in argument to this function.
a_base | the iterator to the base of the first sequence. |
a_start | an iterator to the beginning of the sub-region of the first sequence to actually consider. |
a_end | an iterator to the end of the sub-region of the first sequence to consider. |
b_base | an iterator to the base of the second sequence to consider. |
b_start | an iterator to the beginning of the sub-region of the second sequence to actually consider. |
b_end | an iterator to the end of the sub-region of the second sequence to actually consider. |
lcs | the resulting lcs. This is set iff the function returns true. |
ses | the resulting shortest editing script. |
ses_len | the length of the ses above. Normally this can be retrieved from ses.length(), but this parameter is here for sanity check purposes. The function computes the length of the ses in two redundant ways and ensures that both methods lead to the same result. |
Definition at line 1482 of file abg-diff-utils.h.
void compute_diff | ( | RandomAccessOutputIterator | a_begin, |
RandomAccessOutputIterator | a_end, | ||
RandomAccessOutputIterator | b_begin, | ||
RandomAccessOutputIterator | b_end, | ||
edit_script & | ses ) |
Compute the longest common subsequence of two (sub-regions of) sequences as well as the shortest edit script from transforming the first (sub-region of) sequence into the second (sub-region of) sequence.
This uses the LCS algorithm of the paper at section 4b.
@tparm RandomAccessOutputIterator the type of iterators passed to this function. It must be a random access output iterator kind.
@tparm EqualityFunctor this must be a class that declares a public call operator member returning a boolean and taking two arguments that must be of the same type as the one pointed to by the RandomAccessOutputIterator template parameter. This functor is used to compare the elements referred to by the iterators passed in argument to this function.
a_start | an iterator to the beginning of the first sequence to consider. |
a_end | an iterator to the end of the first sequence to consider. |
b_start | an iterator to the beginning of the second sequence to consider. |
b_end | an iterator to the end of the second sequence to consider. |
ses | the resulting shortest editing script. |
Definition at line 1959 of file abg-diff-utils.h.
void compute_diff | ( | RandomAccessOutputIterator | a_begin, |
RandomAccessOutputIterator | a_end, | ||
RandomAccessOutputIterator | b_begin, | ||
RandomAccessOutputIterator | b_end, | ||
edit_script & | ses ) |
Compute the longest common subsequence of two (sub-regions of) sequences as well as the shortest edit script from transforming the first (sub-region of) sequence into the second (sub-region of) sequence.
This uses the LCS algorithm of the paper at section 4b.
@tparm RandomAccessOutputIterator the type of iterators passed to this function. It must be a random access output iterator kind.
a_start | an iterator to the beginning of the first sequence to consider. |
a_end | an iterator to the end of the first sequence to consider. |
b_start | an iterator to the beginning of the second sequence to consider. |
b_end | an iterator to the end of the second sequence to consider. |
ses | the resulting shortest editing script. |
Definition at line 1998 of file abg-diff-utils.h.
void compute_diff | ( | RandomAccessOutputIterator | a_begin, |
RandomAccessOutputIterator | a_end, | ||
RandomAccessOutputIterator | b_begin, | ||
RandomAccessOutputIterator | b_end, | ||
vector< point > & | lcs, | ||
edit_script & | ses ) |
Compute the longest common subsequence of two (sub-regions of) sequences as well as the shortest edit script from transforming the first (sub-region of) sequence into the second (sub-region of) sequence.
This uses the LCS algorithm of the paper at section 4b.
@tparm RandomAccessOutputIterator the type of iterators passed to this function. It must be a random access output iterator kind.
@tparm EqualityFunctor this must be a class that declares a public call operator member returning a boolean and taking two arguments that must be of the same type as the one pointed to by the RandomAccessOutputIterator template parameter. This functor is used to compare the elements referred to by the iterators passed in argument to this function.
a_start | an iterator to the beginning of the first sequence to consider. |
a_end | an iterator to the end of the first sequence to consider. |
b_start | an iterator to the beginning of the sequence to actually consider. |
b_end | an iterator to the end of second sequence to consider. |
lcs | the resulting lcs. This is set iff the function returns true. |
ses | the resulting shortest editing script. |
Definition at line 1806 of file abg-diff-utils.h.
void compute_diff | ( | RandomAccessOutputIterator | a_begin, |
RandomAccessOutputIterator | a_end, | ||
RandomAccessOutputIterator | b_begin, | ||
RandomAccessOutputIterator | b_end, | ||
vector< point > & | lcs, | ||
edit_script & | ses ) |
Compute the longest common subsequence of two (sub-regions of) sequences as well as the shortest edit script from transforming the first (sub-region of) sequence into the second (sub-region of) sequence.
This uses the LCS algorithm of the paper at section 4b.
@tparm RandomAccessOutputIterator the type of iterators passed to this function. It must be a random access output iterator kind.
a_start | an iterator to the beginning of the first sequence to consider. |
a_end | an iterator to the end of the first sequence to consider. |
b_start | an iterator to the beginning of the sequence to actually consider. |
b_end | an iterator to the end of second sequence to consider. |
lcs | the resulting lcs. This is set iff the function returns true. |
ses | the resulting shortest editing script. |
Definition at line 1849 of file abg-diff-utils.h.
void compute_diff | ( | RandomAccessOutputIterator | a_begin, |
RandomAccessOutputIterator | a_end, | ||
RandomAccessOutputIterator | b_begin, | ||
RandomAccessOutputIterator | b_end, | ||
vector< point > & | lcs, | ||
edit_script & | ses, | ||
int & | ses_len ) |
Compute the longest common subsequence of two (sub-regions of) sequences as well as the shortest edit script from transforming the first (sub-region of) sequence into the second (sub-region of) sequence.
This uses the LCS algorithm of the paper at section 4b.
@tparm RandomAccessOutputIterator the type of iterators passed to this function. It must be a random access output iterator kind.
@tparm EqualityFunctor this must be a class that declares a public call operator member returning a boolean and taking two arguments that must be of the same type as the one pointed to by the RandomAccessOutputIterator template parameter. This functor is used to compare the elements referred to by the iterators passed in argument to this function.
a_start | an iterator to the beginning of the first sequence to consider. |
a_end | an iterator to the end of the first sequence to consider. |
b_start | an iterator to the beginning of the second sequence to consider. |
b_end | an iterator to the end of the second sequence to consider. |
lcs | the resulting lcs. This is set iff the function returns true. |
ses | the resulting shortest editing script. |
ses_len | the length of the ses above. Normally this can be retrieved from ses.length(), but this parameter is here for sanity check purposes. The function computes the length of the ses in two redundant ways and ensures that both methods lead to the same result. |
Definition at line 1686 of file abg-diff-utils.h.
void compute_lcs | ( | const char * | str1, |
const char * | str2, | ||
int & | ses_len, | ||
string & | lcs ) |
Compute the longest common subsequence of two strings and return the length of the shortest edit script for transforming the first string into the second.
str1 | the first string to consider. |
str2 | the second string to consider. |
ses_len | the length of the shortest edit script. This is set by the function iff it returns true. |
the | longest common subsequence of the two strings. This is set the function iff it returns true. |
Definition at line 157 of file abg-diff-utils.cc.
bool compute_middle_snake | ( | const char * | str1, |
const char * | str2, | ||
snake & | s, | ||
int & | ses_len ) |
Returns the middle snake of two strings, as well as the length of their shortest editing script.
str1 | the first string to consider. |
str2 | the second string to consider. |
snake_begin | the point representing the beginning |
Definition at line 97 of file abg-diff-utils.cc.
bool compute_middle_snake | ( | RandomAccessOutputIterator | a_begin, |
RandomAccessOutputIterator | a_end, | ||
RandomAccessOutputIterator | b_begin, | ||
RandomAccessOutputIterator | b_end, | ||
snake & | snak, | ||
int & | ses_len ) |
Returns the middle snake of two sequences A and B, as well as the length of their shortest editing script.
This uses the "linear space refinement" algorithm presented in section 4b in the paper. As the paper says, "The idea for doing so is to simultaneously run the basic algorithm in both the forward and reverse directions until furthest reaching forward and reverse paths starting at opposing corners 'overlap'."
@tparm RandomAccessOutputIterator the type of iterators passed to this function. It must be a random access output iterator kind.
@tparm EqualityFunctor this must be a class that declares a public call operator member returning a boolean and taking two arguments that must be of the same type as the one pointed to by the RandomAccessOutputIterator template parameter. This functor is used to compare the elements referred to by the iterators passed in argument to this function.
a_begin | an iterator pointing to the begining of sequence A. |
a_end | an iterator pointing to the end of sequence A. Note that this points right /after/ the end of vector A. |
b_begin | an iterator pointing to the begining of sequence B. |
b_end | an iterator pointing to the end of sequence B. Note that this points right /after/ the end of vector B |
snak | out parameter. This is the snake current when the two paths overlapped. This is set iff the function returns true; otherwise, this is not touched. |
Definition at line 1140 of file abg-diff-utils.h.
void compute_ses | ( | const char * | str1, |
const char * | str2, | ||
edit_script & | ses ) |
Compute the shortest edit script for transforming a string into another.
str1 | the first string to consider. |
str2 | the second string to consider. |
ses | the resulting edit script. |
@pram ses_len the length of the edit script.
Definition at line 187 of file abg-diff-utils.cc.
void display_edit_script | ( | const edit_script & | es, |
const RandomAccessOutputIterator | str1_base, | ||
const RandomAccessOutputIterator | str2_base, | ||
ostream & | out ) |
Display an edit script on standard output.
es | the edit script to display |
str1_base | the first string the edit script is about. |
@pram str2_base the second string the edit script is about.
Definition at line 2024 of file abg-diff-utils.h.
bool end_of_fr_d_path_in_k | ( | int | k, |
int | d, | ||
RandomAccessOutputIterator | a_begin, | ||
RandomAccessOutputIterator | a_end, | ||
RandomAccessOutputIterator | b_start, | ||
RandomAccessOutputIterator | b_end, | ||
d_path_vec & | v, | ||
snake & | snak ) |
Find the end of the furthest reaching d-path on diagonal k, for two sequences. In the paper This is referred to as "the basic algorithm".
Unlike in the paper, the coordinates of the edit graph start at (-1,-1), rather than (0,0), and they end at (M-1, N-1), rather than (M,N).
@tparm RandomAccessOutputIterator the type of iterators passed to this function. It must be a random access output iterator kind.
@tparm EqualityFunctor this must be a class that declares a public call operator member returning a boolean and taking two arguments that must be of the same type as the one pointed to by the RandomAccessOutputIterator template parameter. This functor is used to compare the elements referred to by the iterators passed in argument to this function.
k | the number of the diagonal on which we want to find the end of the furthest reaching D-path. |
d | the D in D-Path. That's the number of insertions/deletions (the number of changes, in other words) in the changeset. That is also the number of non-diagonals in the D-Path. |
a_begin | an iterator to the beginning of the first sequence |
a_end | an iterator that points right after the last element of the second sequence to consider. |
b_begin | an iterator to the beginning of the second sequence. |
b_end | an iterator that points right after the last element of the second sequence to consider. |
v | the vector of furthest end points of d_paths, at (d-1). It contains the abscissas of the furthest end points for different values of k, at (d-1). That is, for k in [-D + 1, -D + 3, -D + 5, ..., D - 1], v[k] is the abscissa of the end of the furthest reaching (D-1)-path on diagonal k. |
snak | the last snake of the furthest path found. The end point of the snake is the end point of the furthest path. |
Definition at line 841 of file abg-diff-utils.h.
bool end_of_frr_d_path_in_k_plus_delta | ( | int | k, |
int | d, | ||
RandomAccessOutputIterator | a_begin, | ||
RandomAccessOutputIterator | a_end, | ||
RandomAccessOutputIterator | b_begin, | ||
RandomAccessOutputIterator | b_end, | ||
d_path_vec & | v, | ||
snake & | snak ) |
Find the end of the furthest reaching reverse d-path on diagonal k + delta. Delta is abs(M - N), with M being the size of a and N being the size of b. This is the "basic algorithm", run backward. That is, starting from the point (M,N) of the edit graph.
Unlike in the paper, the coordinates of the edit graph start at (-1,-1), rather than (0,0), and they end at (M-1, N-1), rather than (M,N).
@tparm RandomAccessOutputIterator the type of iterators passed to this function. It must be a random access output iterator kind.
@tparm EqualityFunctor this must be a class that declares a public call operator member returning a boolean and taking two arguments that must be of the same type as the one pointed to by the RandomAccessOutputIterator template parameter. This functor is used to compare the elements referred to by the iterators passed in argument to this function.
k | the number of the diagonal on which we want to find the end of the furthest reaching reverse D-path. Actually, we want to find the end of the furthest reaching reverse D-path on diagonal (k
|
d | the D in D-path. That's the number of insertions/deletions (the number of changes, in other words) in the changeset. That is also the number of non-diagonals in the D-Path. |
a_begin | an iterator to the beginning of the first sequence |
a_end | an iterator that points right after the last element of the second sequence to consider. |
b_begin | an iterator to the beginning of the second sequence. |
b_end | an iterator that points right after the last element of the second sequence to consider. |
v | the vector of furthest end points of d_paths, at (d-1). It contains the abscissae of the furthest end points for different values of k - delta, at (d-1). That is, for k in [-D + 1, -D + 3, -D + 5, ..., D - 1], v[k - delta] is the abscissa of the end of the furthest reaching (D-1)-path on diagonal k - delta. |
snak | the last snake of the furthest path found. The end point of the snake is the end point of the furthest path. |
Definition at line 982 of file abg-diff-utils.h.
bool ends_of_furthest_d_paths_overlap | ( | const point & | forward_d_path_end, |
const point & | reverse_d_path_end ) |
forward_d_path_end |
Definition at line 32 of file abg-diff-utils.cc.
bool is_match_point | ( | RandomAccessOutputIterator | a_begin, |
RandomAccessOutputIterator | a_end, | ||
RandomAccessOutputIterator | b_begin, | ||
RandomAccessOutputIterator | b_end, | ||
const point & | point ) |
Tests if a given point is a match point in an edit graph.
a_begin | the begin iterator of the first input sequence of the edit graph. |
a_end | the end iterator of the first input sequence of the edit graph. This points to one element passed the end of the sequence. |
b_begin | the begin iterator of the second input sequence of the edit graph. |
b_end | the end iterator of the second input sequence of the edit graph. This points the one element passed the end of the sequence. |
point | the point to test for being a match point. |
Definition at line 1086 of file abg-diff-utils.h.
bool point_is_valid_in_graph | ( | point & | p, |
unsigned | a_size, | ||
unsigned | b_size ) |
p | the point to test. |
a_size | the size of abscissas. That is, the size of the first input to consider. |
b_size | the of ordinates. That is, the size of the second input to consider. |
Definition at line 52 of file abg-diff-utils.cc.
void print_snake | ( | RandomAccessOutputIterator | a_begin, |
RandomAccessOutputIterator | b_begin, | ||
const snake & | s, | ||
ostream & | out ) |
This prints the middle snake of two strings.
a_begin | the beginning of the first string. |
b_begin | the beginning of the second string. |
s | the snake to print. |
out | the output stream to print the snake to. |
Definition at line 1266 of file abg-diff-utils.h.
int ses_len | ( | const char * | str1, |
const char * | str2, | ||
bool | reverse ) |
Compute the length of the shortest edit script for two strings. This is done using the "Greedy LCS/SES" of figure 2 in the paper. It can walk the edit graph either foward (when reverse is false) or backward starting from the end (when reverse is true).
str1 | the first string to consider. |
str2 | the second string to consider. |
reverse | when true, walk the edit graph backward, starting from the point (M,N) (with M and N being the length of str1 and str2). When false, walk the edit graph forward, starting from the point (0,0). |
Definition at line 128 of file abg-diff-utils.cc.
int ses_len | ( | RandomAccessOutputIterator | a_begin, |
RandomAccessOutputIterator | a_end, | ||
RandomAccessOutputIterator | b_begin, | ||
RandomAccessOutputIterator | b_end, | ||
d_path_vec & | v, | ||
bool | reverse ) |
Compute the length of the shortest edit script for two sequences a and b. This is done using the "Greedy LCS/SES" of figure 2 in the paper. It can walk the edit graph either foward (when reverse is false) or backward starting from the end (when reverse is true).
Here, note that the real content of a and b should start at index 1, for this implementatikon algorithm to match the paper's algorithm in a straightforward manner. So pleast make sure that at index 0, we just get some non-used value.
@tparm RandomAccessOutputIterator the type of iterators passed to this function. It must be a random access output iterator kind.
@tparm EqualityFunctor this must be a class that declares a public call operator member returning a boolean and taking two arguments that must be of the same type as the one pointed to by the RandomAccessOutputIterator template parameter. This functor is used to compare the elements referred to by the iterators passed in argument to this function.
a | the first sequence we care about. |
b | the second sequence we care about. |
v | the vector that contains the end points of the furthest reaching d-path and (d-1)-path. |
Definition at line 1323 of file abg-diff-utils.h.
int ses_len | ( | RandomAccessOutputIterator | a_begin, |
RandomAccessOutputIterator | a_end, | ||
RandomAccessOutputIterator | b_begin, | ||
RandomAccessOutputIterator | b_end, | ||
d_path_vec & | v, | ||
bool | reverse ) |
Compute the length of the shortest edit script for two sequences a and b. This is done using the "Greedy LCS/SES" of figure 2 in the paper. It can walk the edit graph either foward (when reverse is false) or backward starting from the end (when reverse is true).
Here, note that the real content of a and b should start at index 1, for this implementatikon algorithm to match the paper's algorithm in a straightforward manner. So pleast make sure that at index 0, we just get some non-used value.
Note that the equality operator used to compare the elements passed in argument to this function is the default "==" operator.
@tparm RandomAccessOutputIterator the type of iterators passed to this function. It must be a random access output iterator kind.
a | the first sequence we care about. |
b | the second sequence we care about. |
v | the vector that contains the end points of the furthest reaching d-path and (d-1)-path. |
Definition at line 1406 of file abg-diff-utils.h.
Get the end points of the snake, as intended by the paper.
s | the snake to consider. |
x | the starting point of the snake. |
u | the end point of the snake. |
Definition at line 70 of file abg-diff-utils.cc.