node-office - Parse office documents (doc, docx, xls, etc.)


https://github.com/dkiyatkin/node-office

Dependencies:

xml2json : ~0.3.2
temp : ~0.6.0


Related Projects

ONLYOFFICE Desktop Editors - An office suite that combines text, spreadsheet and presentation editors for creating, viewing and editing local documents

  •    C

ONLYOFFICE Desktop Editors is a free and open source office suite comprising editors for text documents, spreadsheets and presentations. It lets you create, view and edit documents of any size and complexity, and easily switch to online mode for real-time co-editing and collaboration. Features such as reviewing, commenting and chat are available as well. The tab-based user interface lets you work with multiple files in a single window.

textract - node

  •    HTML

A text extraction Node module. In almost all cases, what textract cares about is the mime type. So .html and .htm, both possessing the same mime type, will be extracted. Other extensions that share a mime type with a supported extension should also extract successfully. For example, application/vnd.ms-excel is the mime type for .xls, but also for 5 other file types.
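The mime-type-driven dispatch described above can be sketched roughly as follows. This is a minimal illustration in Python, not textract's actual API: the `EXTENSION_TO_MIME` table, the `EXTRACTORS` registry, and the `mime_type_for` and `extract_text` helpers are hypothetical names, the `.xlt` mapping is an assumption, and the extractors are stubs that do no real parsing.

```python
import os

# Hypothetical extension -> mime type table (an assumption for illustration).
EXTENSION_TO_MIME = {
    ".html": "text/html",
    ".htm": "text/html",
    ".xls": "application/vnd.ms-excel",
    ".xlt": "application/vnd.ms-excel",  # assumed to share the .xls mime type
}


def extract_html(path):
    """Stub extractor: a real one would strip markup and return the text."""
    return f"[html text from {path}]"


def extract_excel(path):
    """Stub extractor: a real one would read cell values from the workbook."""
    return f"[spreadsheet text from {path}]"


# Extractors are registered per mime type, not per file extension.
EXTRACTORS = {
    "text/html": extract_html,
    "application/vnd.ms-excel": extract_excel,
}


def mime_type_for(path):
    """Look up the mime type from the (case-insensitive) file extension."""
    extension = os.path.splitext(path)[1].lower()
    return EXTENSION_TO_MIME.get(extension)


def extract_text(path):
    """Dispatch to the extractor registered for the file's mime type."""
    mime_type = mime_type_for(path)
    extractor = EXTRACTORS.get(mime_type)
    if extractor is None:
        raise ValueError(f"no extractor for {path!r} (mime type: {mime_type})")
    return extractor(path)
```

Because the registry is keyed on mime type, `extract_text("page.htm")` and `extract_text("page.html")` reach the same extractor without either extension being listed in `EXTRACTORS`.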

react-native-doc-viewer - React Native Doc Viewer (Supports file formats: xls, ppt, doc, xlsx, pptx, csv, docx, png, jpg, pdf, xml, binary)

  •    Objective-C

A React Native native-module bridge to the Quick Look document viewer for iOS and Android. Supports pdf, png, jpg, xls, ppt, doc, docx, pptx and xlsx files, plus mp4 video playback.

MOSS Document Converter

Microsoft Office SharePoint Server (MOSS) Document Converters with Word & Excel 2007 on the server. Converts Office 2003 file types (doc, xls) to pdf and xps. Could easily be altered to work with docx and xlsx file types. Desktop Automation on the Server: Previously, us...

docx4j - JAXB-based Java library for Word docx, Powerpoint pptx, and Excel xlsx files

  •    Java

docx4j is a library which helps you to work with the Office OpenXML file format as used in docx documents, pptx presentations, and xlsx spreadsheets.


pyexcel - Single API for reading, manipulating and writing data in csv, ods, xls, xlsx and xlsm files

  •    Python

If your company has embedded pyexcel and its components into a revenue-generating product, please support me on Patreon or Bountysource to maintain the project and develop it further. If you are an individual, you are welcome to support me too, for however long you feel like. As my backer, you will receive early access to pyexcel-related content.

PHPWord - A pure PHP library for reading and writing word processing documents

  •    PHP

PHPWord is a library written in pure PHP that provides a set of classes to write to and read from different document file formats. The current version of PHPWord supports Microsoft Office Open XML (OOXML or OpenXML), OASIS Open Document Format for Office Applications (OpenDocument or ODF), Rich Text Format (RTF), HTML, and PDF. PHPWord is an open source project licensed under the terms of LGPL version 3. PHPWord aims to be a high-quality software product by incorporating continuous integration and unit testing. You can learn more about PHPWord by reading the Developers' Documentation.

Binary (doc, xls, ppt) to OpenXML Translator

  •    CSharp

The main goal of the Office Binary (doc, xls, ppt) Translator to Open XML Project is to create software tools, plus guidance, showing how a document written using the Binary Formats (doc, xls, ppt) can be translated to Office Open XML.

PowerMeta - PowerMeta searches for publicly available files hosted on various websites for a particular domain by using specially crafted Google and Bing searches

  •    PowerShell

PowerMeta searches for publicly available files hosted on various websites for a particular domain by using specially crafted Google and Bing searches. It then allows for the download of those files from the target domain. After retrieving the files, the metadata associated with them can be analyzed by PowerMeta. Some interesting things commonly found in metadata are usernames, domains, software titles, and computer names. For many organizations it's common to find publicly available files posted on their external websites. Many times these files contain sensitive information that might be of benefit to an attacker, like usernames, domains, software titles or computer names. PowerMeta searches both Bing and Google for files on a particular domain using search strings like "site:targetdomain.com filetype:pdf". By default it searches for "pdf, docx, xlsx, doc, xls, pptx, and ppt".
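The search-string format quoted above can be reproduced with a few lines of code. The sketch below is in Python rather than PowerMeta's own PowerShell and only builds the query strings; it does not perform any search or download. The names `DEFAULT_FILE_TYPES` and `build_search_queries` are hypothetical and not part of PowerMeta.

```python
# Default file types as listed in the description above.
DEFAULT_FILE_TYPES = ("pdf", "docx", "xlsx", "doc", "xls", "pptx", "ppt")


def build_search_queries(target_domain, file_types=DEFAULT_FILE_TYPES):
    """Return one 'site:<domain> filetype:<type>' query per file type.

    Hypothetical helper: it mirrors the search-string format quoted in the
    description, not PowerMeta's internal implementation.
    """
    return [f"site:{target_domain} filetype:{file_type}" for file_type in file_types]
```

Calling `build_search_queries("targetdomain.com")` yields seven strings, one per default file type, the first being `site:targetdomain.com filetype:pdf`.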

docconv - Converts PDF, DOC, DOCX, XML, HTML, RTF, etc to plain text

  •    Go

A Go wrapper library to convert PDF, DOC, DOCX, XML, HTML, RTF, ODT, Pages documents and images (see optional dependencies below) to plain text. Note for returning users: the Go code path for this package has been moved to code.sajari.com/docconv. Follow the installation instructions to check out a version of the code in the correct place.

TableExport - The simple, easy-to-implement library to export HTML tables to xlsx, xls, csv, and txt files

  •    Javascript

In order to provide Office Open XML SpreadsheetML Format ( .xlsx ) support, you must include the following third-party library in your project before both FileSaver.js and TableExport. To support legacy browsers ( Chrome < 20, Firefox < 13, Opera < 12.10, IE < 10, Safari < 6 ) include the Blob.js polyfill before the FileSaver.js script.

Simple OOXML

Simple OOXML makes the creation of Office Open XML documents easier for developers. Modify or create any .docx or .xlsx document without Microsoft Word or Microsoft Excel. Uses the Open XML SDK v 2.0.

OfficeHelper

  •    DotNet

Wrapper around the Open XML office package. You can easily generate xlsx documents based on a template xlsx document and reuse parts of that document if you mark them as named ranges (i.e. "names"). Requirement: .NET 3.5 or later. Microsoft Office does not need to be installed!

readxl - Read Excel files (.xls and .xlsx) into R 🖇

  •    C++

The readxl package makes it easy to get data out of Excel and into R. Compared to many of the existing packages (e.g. gdata, xlsx, xlsReadWrite) readxl has no external dependencies, so it’s easy to install and use on all operating systems. It is designed to work with tabular data. readxl supports both the legacy .xls format and the modern xml-based .xlsx format. The libxls C library is used to support .xls, which abstracts away many of the complexities of the underlying binary format. To parse .xlsx, we use the RapidXML C++ library.

spout - Read and write spreadsheet files (CSV, XLSX and ODS), in a fast and scalable way

  •    PHP

Spout is a PHP library to read and write spreadsheet files (CSV, XLSX and ODS), in a fast and scalable way. Contrary to other file readers or writers, it is capable of processing very large files while keeping the memory usage really low (less than 3 MB). Full documentation can be found at http://opensource.box.com/spout/.

spreadsheet_architect - Spreadsheet Architect is a library that allows you to create XLSX, ODS, or CSV spreadsheets super easily from ActiveRecord relations, plain Ruby objects, or tabular data

  •    Ruby

Spreadsheet Architect is a library that allows you to create XLSX, ODS, or CSV spreadsheets super easily from ActiveRecord relations, plain Ruby objects, or tabular data. When NOT using the :data option, i.e. on an AR Relation or when using the :instances option, Spreadsheet Architect requires an instance method defined on the class to generate the data. It looks for the spreadsheet_columns method on the class. If you are using it on an ActiveRecord model and that method is not defined, it falls back to the model's column_names method (not recommended). If you are using the :data option, this is ignored.

Office Open XML for C++

  •    C++

Office Open XML for C++ is a project to create an API for working with OOXML documents such as docx, pptx, xlsx and any other files that conform to the Open Packaging Conventions in C++.

PhpSpreadsheet - A pure PHP library for reading and writing spreadsheet files

  •    PHP

PhpSpreadsheet is a library written in pure PHP and providing a set of classes that allow you to read from and to write to different spreadsheet file formats, like Excel and LibreOffice Calc. Read more about it, including install instructions, in the official documentation. Or check out the API documentation.

node-xlsx - Node.js Excel file parser & builder

  •    Javascript

Excel file parser/builder that relies on js-xlsx. This library requires at least Node.js v4. For legacy versions, you can use this workaround before using the lib.