FRDCSA | git codebases | dragnet

Project Description

This project was originally inspired by Kohlschütter et al., [Boilerplate Detection using Shallow Text Features](http://www.l3s.de/~kohlschuetter/publications/wsdm187-kohlschuetter.pdf), and Weninger et al., [CETR -- Content Extraction with Tag Ratios](http://web.engr.illinois.edu/~weninge1/cetr/), and more recently by [Readability](https://github.com/buriy/python-readability).

This page is part of the FWeb package.
Last updated Sat Oct 26 17:00:01 EDT 2019.