Review content ExtrActor (REA) ver 1.2

Review content ExtrActor (REA) is an open-source project that implements a novel review extraction algorithm. The algorithm works in two steps: it first discovers the review layout of a web page, then uses that layout to extract the review content efficiently.
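The two-step idea can be illustrated with a minimal sketch. This is not the REA implementation or the published algorithm. It assumes, purely for illustration, that every review on a site sits in an element with the same tag name and class attribute, and that one sample review text is available for the learning step. The names BlockCollector, collect_blocks, learn_layout and extract_reviews are hypothetical and do not come from the project.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they are not tracked as open.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class BlockCollector(HTMLParser):
    """Collects (tag, class, text) for every properly closed element."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open elements: [tag, class, text parts]
        self.blocks = []  # closed elements, in the order they were closed

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            css_class = dict(attrs).get("class") or ""
            self.stack.append([tag, css_class, []])

    def handle_data(self, data):
        # Text belongs to every element that is currently open.
        for entry in self.stack:
            entry[2].append(data)

    def handle_endtag(self, tag):
        # Close the nearest open element with this tag name, together
        # with any unclosed elements nested inside it.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                for t, c, parts in reversed(self.stack[i:]):
                    text = " ".join("".join(parts).split())
                    self.blocks.append((t, c, text))
                del self.stack[i:]
                break


def collect_blocks(html):
    parser = BlockCollector()
    parser.feed(html)
    parser.close()
    return parser.blocks


def learn_layout(html, sample_review):
    """Learning stage (assumed form): find the smallest block that
    contains a known review text and return its (tag, class) pair."""
    candidates = [b for b in collect_blocks(html) if sample_review in b[2]]
    if not candidates:
        return None
    tag, css_class, _ = min(candidates, key=lambda b: len(b[2]))
    return (tag, css_class)


def extract_reviews(html, layout):
    """Extraction stage (assumed form): return the text of every block
    whose (tag, class) pair matches the learned layout."""
    return [text for tag, css_class, text in collect_blocks(html)
            if (tag, css_class) == layout]
```

In this sketch, learn_layout would be called once per site on a page with a known review, and the returned pair would then be reused by extract_reviews on other pages of that site, so the more expensive search for the layout is not repeated for every page.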

*Green parts represent important blocks, whereas red parts represent noisy blocks.

First Step (Learning Stage)

Second Step (Extraction Stage)

The paper describing REA has been published in the Journal of Information Science.

  • Uçar, Erdem; Uzun, Erdinç & Tüfekci, Pınar (2016) “A novel algorithm for extracting the user reviews from web pages”, Journal of Information Science, first published on September 2, 2016 as DOI: 10.1177/0165551516666446