Paper Abstract and Keywords |
Presentation |
2010-02-27 13:30
Difference detection for similar documents based on image matching Yumiko Susuki, Yutaka Nakano, Toshiyuki Yoshida (Fukui Univ.) |
Abstract |
(in Japanese) |
(See Japanese page) |
(in English) |
Documents that have a fixed format and are updated periodically are often modified only slightly, so the versions before and after a modification are very similar. This paper aims at automatically comparing a pair of similar printed documents and detecting such modifications. The simplest way to identify a modification is to apply an OCR system, but the recognition rate of many current OCR systems is around 97%, which is too low to give sufficient precision for this comparison task. This paper therefore treats the pair of target documents as images and proposes an image-based comparison technique that uses image matching and detection of the longest common sequence. Experimental results show that the proposed technique takes several tens of seconds to compare a pair of A4-size documents containing 1500 Japanese characters, and achieves a precision of 94% with a recall of 100%. |
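The abstract does not spell out the algorithm, so the following is only a minimal sketch of how image matching and a longest-common-sequence step could be combined, not the authors' implementation. It assumes that each page has already been segmented into a sequence of equally sized binary character images, that two character images "match" when a simple pixel-agreement score reaches a threshold, and that "longest common sequence" refers to the standard longest common subsequence (LCS) dynamic program. The names `Glyph`, `similarity`, `diff_glyphs` and the 0.9 threshold are hypothetical.

```python
from typing import List, Sequence, Tuple

# Assumption: a character image is a grid of 0/1 pixels (rows of ints).
Glyph = Sequence[Sequence[int]]


def similarity(a: Glyph, b: Glyph) -> float:
    """Fraction of pixels that agree; 0.0 if the two shapes differ."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        return 0.0
    total = sum(len(row) for row in a)
    if total == 0:
        return 0.0
    same = sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return same / total


def diff_glyphs(
    old: Sequence[Glyph], new: Sequence[Glyph], threshold: float = 0.9
) -> Tuple[List[int], List[int]]:
    """Return (indices only in old, indices only in new) using an LCS
    in which two glyphs count as equal when similarity >= threshold."""
    n, m = len(old), len(new)
    match = [
        [similarity(old[i], new[j]) >= threshold for j in range(m)]
        for i in range(n)
    ]

    # lcs[i][j] = length of the LCS of old[i:] and new[j:]
    lcs = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if match[i][j]:
                lcs[i][j] = lcs[i + 1][j + 1] + 1
            else:
                lcs[i][j] = max(lcs[i + 1][j], lcs[i][j + 1])

    # Walk the table; glyphs outside the common sequence are differences.
    removed: List[int] = []
    added: List[int] = []
    i = j = 0
    while i < n and j < m:
        if match[i][j]:
            i += 1
            j += 1
        elif lcs[i + 1][j] >= lcs[i][j + 1]:
            removed.append(i)
            i += 1
        else:
            added.append(j)
            j += 1
    removed.extend(range(i, n))
    added.extend(range(j, m))
    return removed, added
```

Calling `diff_glyphs(old_glyphs, new_glyphs)` on two segmented pages would return the positions of character images that have no counterpart in the other document. A real system would need a more robust matching score than raw pixel agreement (for example, one tolerant of scanning noise and small misalignment), which this sketch does not attempt.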
Keyword |
(in Japanese) |
(See Japanese page) |
(in English) |
document processing / character comparison / modification detection / matching / longest common sequence |
Reference Info. |
ITE Tech. Rep., vol. 34, no. 10, ME2010-64, pp. 61-64, Feb. 2010. |
Paper # |
ME2010-64 |
Date of Issue |
2010-02-20 (ME) |
ISSN |
Print edition: ISSN 1342-6893 |
|
Conference Information |
Committee |
ME |
Conference Date |
2010-02-27 - 2010-02-27 |
Place (in Japanese) |
(See Japanese page) |
Place (in English) |
Kanto Gakuin Univ. |
Topics (in Japanese) |
(See Japanese page) |
Topics (in English) |
|
Paper Information |
Registration To |
ME |
Conference Code |
2010-02-ME |
Language |
Japanese |
Title (in Japanese) |
(See Japanese page) |
Sub Title (in Japanese) |
(See Japanese page) |
Title (in English) |
Difference detection for similar documents based on image matching |
Sub Title (in English) |
|
Keyword(1) |
document processing |
Keyword(2) |
character comparison |
Keyword(3) |
modification detection |
Keyword(4) |
matching |
Keyword(5) |
longest common sequence |
1st Author's Name |
Yumiko Susuki |
1st Author's Affiliation |
University of Fukui (Fukui Univ.) |
2nd Author's Name |
Yutaka Nakano |
2nd Author's Affiliation |
University of Fukui (Fukui Univ.) |
3rd Author's Name |
Toshiyuki Yoshida |
3rd Author's Affiliation |
University of Fukui (Fukui Univ.) |
Speaker |
Author-1 |
Date Time |
2010-02-27 13:30:00 |
Presentation Time |
15 minutes |
Registration for |
ME |
Volume (vol) |
vol.34 |
Number (no) |
no.10 |
Page |
pp.61-64 |
#Pages |
4 |
|