Version 1.8 - 2019-10-18
  * Stop truncating article IDs to avoid duplicates (thanks to Tom Nicholls).
  * Handle non-numeric 'page' metadata entries (thanks to Tom Nicholls).

Version 1.7 - 2017-11-06
  * Port from XML to xml2 package to support tm 0.8.

Version 1.6 - 2017-02-08
  * Avoid importing each article twice with new Factiva HTML format.
  * Add screencast showing how to export correct HTML files in ?FactivaSource.

Version 1.5 - 2014-07-05
  * Fix encoding issues on non-UTF-8 systems, adding back the 'encoding' argument to work around a bug in package XML.

Version 1.4 - 2014-05-31
  * Adapt to tm 0.6.
  * Remove the 'encoding' argument to FactivaSource() as it is not supported by tm 0.6 (normally not needed).
  * Change all tags to lowercase (for consistency with tm).
  * Ensure meta-data variables which are supposed to contain only one value always do so.

Version 1.3 - 2014-01-10
  * Extract Company, Industry, Information Provider Code (IPC) and Information Provider Description (IPD) meta-data (based on a patch by Grigorij Ljubownikow).
  * Remove inconsistent line breaks in HTML format.
  * Update to support tm 0.5-10 and clean the code a bit.

Version 1.2 - 2013-01-28
  * Extract Subject and Coverage meta-data.
  * Add Reuters21578 example.
  * Fix handling of articles with no header or body.
  * Split lead paragraphs into separate lines.
  * Fix package help page to mention HTML.

Version 1.1 - 2012-06-30
  * Add support for HTML files since Factiva no longer allows exporting to XML.
  * Work around encoding issues on Windows (for HTML only).
  * Preserve paragraphs information so that e.g. makeChunks() from tm can be used to split documents into smaller pieces.

Version 1.0 - 2012-05-14
  * Initial release with support for XML files.