Understanding plagiarism linguistic patterns, textual features, and detection methods

dc.contributor.author: Alzahrani, Salha M.
dc.contributor.author: Salim, Naomie
dc.contributor.author: Abraham, Ajith
dc.date.accessioned: 2012-03-20T08:49:41Z
dc.date.available: 2012-03-20T08:49:41Z
dc.date.issued: 2012
dc.description.abstract: Plagiarism can be of many different natures, ranging from copying texts to adopting ideas, without giving credit to its originator. This paper presents a new taxonomy of plagiarism that highlights differences between literal plagiarism and intelligent plagiarism, from the plagiarist's behavioral point of view. The taxonomy supports deep understanding of different linguistic patterns in committing plagiarism, for example, changing texts into semantically equivalent but with different words and organization, shortening texts with concept generalization and specification, and adopting ideas and important contributions of others. Different textual features that characterize different plagiarism types are discussed. Systematic frameworks and methods of monolingual, extrinsic, intrinsic, and cross-lingual plagiarism detection are surveyed and correlated with plagiarism types, which are listed in the taxonomy. We conduct extensive study of state-of-the-art techniques for plagiarism detection, including character n-gram-based (CNG), vector-based (VEC), syntax-based (SYN), semantic-based (SEM), fuzzy-based (FUZZY), structural-based (STRUC), stylometric-based (STYLE), and cross-lingual techniques (CROSS). Our study corroborates that existing systems for plagiarism detection focus on copying text but fail to detect intelligent plagiarism when ideas are presented in different words.
dc.description.firstpage: 133
dc.description.issue: 2
dc.description.lastpage: 149
dc.description.source: Web of Science
dc.description.volume: 42
dc.identifier.citation: IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews). 2012, vol. 42, issue 2, p. 133-149.
dc.identifier.doi: 10.1109/TSMCC.2011.2134847
dc.identifier.issn: 1094-6977
dc.identifier.location: Not held in the ÚK collection
dc.identifier.uri: http://hdl.handle.net/10084/90243
dc.identifier.wos: 000300511400001
dc.language.iso: en
dc.publisher: IEEE Systems, Man, and Cybernetics Society
dc.relation.ispartofseries: IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews)
dc.relation.uri: https://doi.org/10.1109/TSMCC.2011.2134847
dc.title: Understanding plagiarism linguistic patterns, textual features, and detection methods
dc.type: article
dc.type.status: Peer-reviewed
