fotelo: A formatted text loader library


fotelo (foe-tell-o): A formatted text loader library. Fotelo lets you import text files of various formats into a strongly typed .NET DataTable for use within your applications.



Related Projects

Vespa - Yahoo's big data serving engine

Vespa is an engine for low-latency computation over large data sets. It stores and indexes your data so that queries, selection, and processing over the data can be performed at serving time. Vespa is the serving platform for Yahoo News, Yahoo Sports, Yahoo Finance, Yahoo Gemini, and Flickr.

Hazelcast Jet - Distributed data processing engine, built on top of Hazelcast

Hazelcast Jet is a distributed computing platform built for high-performance stream processing and fast batch processing. It embeds the Hazelcast In-Memory Data Grid (IMDG) to provide a lightweight package combining a processor with scalable in-memory storage. It offers distributed API support for Hazelcast data structures such as IMap and IList, along with distributed implementations of the java.util.{Queue, Set, List, Map} data structures, highly optimized for use in processing.

Bagri - XML/Document DB on top of distributed cache

Bagri is a document database built on top of a distributed cache solution such as Hazelcast or Coherence. The system lets you process semi-structured, schema-less documents and perform distributed queries on them in real time. It scales horizontally very well through data sharding, with all documents distributed evenly between distributed cache partitions.
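The even distribution mentioned above comes from hash-based sharding. A minimal sketch of that general technique (illustrative only, not Bagri's actual API):

```python
# Minimal sketch of hash-based document sharding: a stable hash of the
# document id picks the partition, which spreads documents evenly.
import hashlib

def partition_for(doc_id: str, num_partitions: int) -> int:
    """Map a document id to a partition with a stable hash."""
    digest = hashlib.sha256(doc_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_partitions

# With a good hash, documents spread roughly evenly across partitions.
counts = [0] * 4
for i in range(10_000):
    counts[partition_for(f"doc-{i}", 4)] += 1
print(counts)  # each of the 4 partitions receives roughly 2500 documents
```

Because the mapping depends only on the id, any node can locate a document's partition without coordination.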

Hypertable - A high performance, scalable, distributed storage and processing system for structured data

Hypertable is based on Google's Bigtable design, a proven scalable design that powers hundreds of Google services. Many current scalable NoSQL database offerings are based on a hash-table design, which means the data they manage is not kept physically ordered. Hypertable keeps data physically sorted by a primary key, which makes it well suited for analytics.
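The benefit of physically sorted storage can be sketched with a toy example (plain Python, not Hypertable code): a range query over sorted rows reduces to a binary search plus one contiguous read.

```python
# Why sorted-by-key storage helps range queries: with rows kept ordered
# by key, a range scan is a binary search plus a contiguous slice.
import bisect

rows = sorted((f"user{1000 + i}", i) for i in range(5000))  # (key, value)
keys = [k for k, _ in rows]

def range_scan(start_key: str, end_key: str):
    """Return all rows with start_key <= key < end_key."""
    lo = bisect.bisect_left(keys, start_key)
    hi = bisect.bisect_left(keys, end_key)
    return rows[lo:hi]

# The scan touches only the matching slice; a hash-table layout would
# have to examine every bucket to answer the same question.
print(len(range_scan("user2000", "user2010")))  # 10
```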

BoomFilters - Probabilistic data structures for processing continuous, unbounded streams.

Boom Filters are probabilistic data structures for processing continuous, unbounded streams. This includes Stable Bloom Filters, Scalable Bloom Filters, Counting Bloom Filters, Inverse Bloom Filters, Cuckoo Filters, several variants of traditional Bloom filters, HyperLogLog, Count-Min Sketch, and MinHash.

Classic Bloom filters generally require a priori knowledge of the data set in order to allocate an appropriately sized bit array. This works well for offline processing, but online processing typically involves unbounded data streams. With enough data, a traditional Bloom filter "fills up", after which it has a false-positive probability of 1.
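The "fills up" behavior is easy to demonstrate with a toy classic Bloom filter (a from-scratch sketch, not the BoomFilters library, which is written in Go):

```python
# Toy classic Bloom filter: once far more items are added than the bit
# array was sized for, nearly every bit is set and the false-positive
# rate approaches 1.
import hashlib

class BloomFilter:
    def __init__(self, m: int, k: int):
        self.m, self.k = m, k
        self.bits = [False] * m

    def _positions(self, item: str):
        # Derive k hash positions from salted SHA-256 digests.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(h, 16) % self.m

    def add(self, item: str):
        for p in self._positions(item):
            self.bits[p] = True

    def might_contain(self, item: str) -> bool:
        return all(self.bits[p] for p in self._positions(item))

bf = BloomFilter(m=256, k=3)   # sized for only a few dozen items
for i in range(2000):          # massively over-fill it
    bf.add(f"item-{i}")

# Nearly every query for an item never added now reports "maybe present".
false_positives = sum(bf.might_contain(f"absent-{i}") for i in range(100))
print(false_positives)  # close to 100, i.e. a false-positive rate near 1
```

Stable and Scalable Bloom filters exist precisely to avoid this saturation on unbounded streams.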

LibreOffice - The Document foundation

LibreOffice is the free, power-packed open-source personal productivity suite for Windows, macOS, and Linux. LibreOffice is the perfect choice for home users, businesses, government, and other organizations. Its native file format is the ISO-standardized ODF (Open Document Format), but LibreOffice can open and save Microsoft Word, PowerPoint, and Excel files, as well as many other formats, bringing you the widest available compatibility with other products.

fsharplu - This library provides a set of F# helpers for string manipulations, logging, collection data structures, file operations, text processing, security, async, parsing, diagnostics, configuration files and Json serialization

This library provides F# lightweight utilities for string manipulation, logging, collection data structures, file operations, text processing, security, async, parsing, diagnostics, configuration files, and JSON serialization. This is by no means a full-fledged utility library for F#, but rather a small collection of utilities and other thin wrappers accumulated throughout the development of various internal projects at Microsoft, meant to facilitate development on the .NET Framework using the F# programming language.

Sensorbee - Lightweight stream processing engine for IoT

Sensorbee is designed for low-latency processing of streaming data at the edge of the network. IoT devices frequently generate large volumes of unstructured streaming data, such as video and audio streams. Even if the data streams are structured, they may be meaningless if their temporal characteristics are not considered. Cloud-based services are generally not good at processing these kinds of data. Preprocessing data streams before they are sent to the cloud makes large scale data processing in the cloud more efficient and reduces the usage of network bandwidth.

webdevstarterkit - Web Developer Starter Kit

- Learn data structures and algorithms. Data structures and algorithms for programmers are like notes and scales for a musician. All programming is built around data structures and algorithms. Whether you make the right choices when solving problems will be determined by your confidence in the basics.
  - **book: [Data Structures and Algorithms][data_struct]**
  - article: [Data Structures Succinctly: Part 1][datastructspart1]
- Learn Unix tools. Unix's philosophy of building simple, small, modular t

lingo - package lingo provides the data structures and algorithms required for natural language processing

package lingo provides the data structures and algorithms required for natural language processing. Specifically, it provides a POS tagger (lingo/pos), a dependency parser (lingo/dep), and a basic tokenizer (lingo/lexer) for English. It also provides data structures for holding corpora (lingo/corpus) and treebanks (lingo/treebank).

gtfsr - Package for obtaining, validating, viewing, and storing GTFS (transit) data

gtfsr is an R package for easily importing, validating, and mapping transit data that follows the General Transit Feed Specification (GTFS) format. The gtfsr package provides functions for converting files following the GTFS format into a single gtfs data object. A gtfs object can then be validated for proper data formatting (i.e. whether the source data is properly structured and formatted as a GTFS feed), or have any spatial data for stops and routes mapped using leaflet. The gtfsr package also provides API wrappers for the popular public GTFS feed sharing site TransitFeeds, allowing users quick, easy access to hundreds of GTFS feeds from within R.

tif - Text Interchange Formats

This package describes and validates formats for storing common objects arising in text analysis as native R objects. Representations of a text corpus, document-term matrix, and tokenized text are included. The tokenized text format is extensible to include other annotations. There are two versions of the corpus and tokens objects; packages should accept both and return or coerce to at least one of these.

corpus (data frame) - A valid corpus data frame object is a data frame with at least two columns. The first column is called doc_id and is a character vector with UTF-8 encoding. Document ids must be unique. The second column is called text and must also be a character vector in UTF-8 encoding. Each individual document is represented by a single row in the data frame. Additional document-level metadata columns and corpus-level attributes are allowed but not required.
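The corpus constraints above can be mirrored in a short validator sketch. The tif package itself is R; the column names doc_id and text come from the spec, and everything else here is illustrative:

```python
# Sketch of the corpus rules stated above: rows need a doc_id column of
# unique strings and a text column of strings. (The real tif package
# validates R data frames; this just mirrors the stated constraints.)
def validate_corpus(rows: list[dict]) -> bool:
    doc_ids = [r.get("doc_id") for r in rows]
    texts = [r.get("text") for r in rows]
    if not all(isinstance(d, str) for d in doc_ids):
        return False                      # doc_id must be character data
    if len(set(doc_ids)) != len(doc_ids):
        return False                      # document ids must be unique
    if not all(isinstance(t, str) for t in texts):
        return False                      # text must be character data
    return True

corpus = [
    {"doc_id": "doc1", "text": "A first document."},
    {"doc_id": "doc2", "text": "A second document."},
]
print(validate_corpus(corpus))                                        # True
print(validate_corpus(corpus + [{"doc_id": "doc1", "text": "dup"}]))  # False
```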

typewise - Typewise structured sorting for arbitrarily complex data structures

Typewise structured sorting for arbitrarily complex data structures
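The idea behind type-wise sorting can be sketched as a sort key that orders mixed values first by a type rank and then by value within each type (a minimal illustration, not the typewise package's actual byte-level encoding):

```python
# Type-wise sorting sketch: null < booleans < numbers < strings < lists,
# with ordinary value ordering inside each type. Lists compare
# element-wise using the same rule, applied recursively.
TYPE_RANK = {type(None): 0, bool: 1, int: 2, float: 2, str: 3, list: 4}

def typewise_key(value):
    rank = TYPE_RANK[type(value)]
    if value is None:
        return (rank, 0)
    if isinstance(value, list):
        return (rank, [typewise_key(v) for v in value])
    return (rank, value)

mixed = ["b", 3, None, True, [1, "a"], 2.5, "a", False]
print(sorted(mixed, key=typewise_key))
# → [None, False, True, 2.5, 3, 'a', 'b', [1, 'a']]
```

Without such a key, Python 3 refuses to compare values of different types at all; a total type order is what makes heterogeneous collections sortable.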

Apache Beam - Unified model for defining both batch and streaming data-parallel processing pipelines

Apache Beam is an open source, unified model for defining both batch and streaming data-parallel processing pipelines. Using one of the open source Beam SDKs, you build a program that defines the pipeline. The pipeline is then executed by one of Beam’s supported distributed processing back-ends, which include Apache Apex, Apache Flink, Apache Spark, and Google Cloud Dataflow.

Sqoop - Transfers data between Hadoop and Datastores

Apache Sqoop is a tool designed for efficiently transferring bulk data between Apache Hadoop and structured datastores such as relational databases. You can use Sqoop to import data from external structured datastores into Hadoop Distributed File System or related systems like Hive and HBase. Conversely, Sqoop can be used to extract data from Hadoop and export it to external structured datastores such as relational databases and enterprise data warehouses.

structured-logging - Write data structures to your logs from clojure

It only works with Logback. The second parameter is the structured data you want to log; any Clojure map is fine, as long as Cheshire can handle it.


Scanning Probe Microscopy Controller and Data Visualization Software

Colt - Scientific and Technical Computing in Java

The Colt distribution consists of several free Java libraries bundled under a single uniform umbrella: the Colt library, the Jet library, the CoreJava library, and the Concurrent library. It provides support for resizable arrays, dense and sparse matrices, histogramming, random number generators, and more.


[Data Structures] Retroactive data structures, originally introduced by Prof. Erik Demaine, are a paradigm for storing information about the development of a data structure so that any operation can be performed on the host data structure at any point in time. We designed algorithms and data structures for fully retroactive BST, hash, and union-sameset, and studied existing retroactive data structures for queue, deque, priority queue, and union-find. It also includes our publi
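The retroactive interface can be sketched naively by keeping a timestamped operation log and replaying it for each query. Demaine's structures avoid this linear replay; the sketch below only illustrates the semantics:

```python
# Naive fully retroactive FIFO queue: operations carry timestamps, may
# be inserted into the past, and queries replay the timeline. Real
# retroactive structures achieve this without O(n) replay per query.
import bisect

class RetroactiveQueue:
    def __init__(self):
        self.ops = []  # sorted list of (time, kind, value)

    def insert_op(self, time, kind, value=None):
        """Retroactively insert an 'enqueue' or 'dequeue' at `time`."""
        bisect.insort(self.ops, (time, kind, value))

    def front(self, time):
        """Replay the timeline up to `time`; return the queue front."""
        queue = []
        for t, kind, value in self.ops:
            if t > time:
                break
            if kind == "enqueue":
                queue.append(value)
            elif kind == "dequeue" and queue:
                queue.pop(0)
        return queue[0] if queue else None

q = RetroactiveQueue()
q.insert_op(10, "enqueue", "a")
q.insert_op(20, "enqueue", "b")
q.insert_op(30, "dequeue")
print(q.front(35))               # 'b': the dequeue at t=30 removed 'a'
q.insert_op(5, "enqueue", "z")   # retroactively enqueue before 'a'
print(q.front(35))               # 'a': the dequeue now removed 'z' instead
```

Note how the late insertion at t=5 changes the answer of a query that was already answerable, which is exactly what distinguishes retroactivity from ordinary persistence.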

bubble - A library for transforming data structures into other data structures. Make data portable.

A library for transforming data structures into other data structures. Make data portable.