Package: boilerpipeR
Version: 1.3.2
Date: 2021-05-19
Title: Interface to the Boilerpipe Java Library
Author: See AUTHORS file.
Maintainer: Mario Annau <mario.annau@gmail.com>
Imports: rJava
Suggests: RCurl
Description: Generic extraction of main text content from HTML files; removal
    of ads, sidebars and headers using the boilerpipe
    <https://github.com/kohlschutter/boilerpipe> Java library. The
    boilerpipe extraction heuristics perform robustly across a wide range
    of website templates.
License: Apache License (== 2.0)
URL: https://github.com/mannau/boilerpipeR
BugReports: https://github.com/mannau/boilerpipeR/issues
RoxygenNote: 7.1.1
Encoding: UTF-8
NeedsCompilation: no
Packaged: 2021-05-19 09:05:37 UTC; marioannau
Repository: CRAN
Date/Publication: 2021-05-19 09:20:02 UTC
Built: R 4.2.0; ; 2023-04-01 10:58:50 UTC; unix
