opengraphjs - Parse shareable metadata from webpages using the Open Graph protocol.


OpenGraphJS builds a JSON object from a web page which follows the Open Graph Protocol. The JavaScript object returned by this library contains important metadata such as the description, image, and title.
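As a rough illustration of the idea (not OpenGraphJS's own API or implementation), the sketch below collects `og:` meta tags from an HTML string into a dictionary using only Python's standard library. The names `OpenGraphParser`, `parse_open_graph`, and `SAMPLE_HTML` are hypothetical, and the sample page is made up for the example.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collects <meta property="og:..." content="..."> tags into a dict."""

    def __init__(self):
        super().__init__()
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        prop = attributes.get("property") or ""
        content = attributes.get("content")
        if prop.startswith("og:") and content is not None:
            # Strip the "og:" prefix; keep the first value seen for each key.
            self.metadata.setdefault(prop[len("og:"):], content)


def parse_open_graph(html):
    """Return the Open Graph properties found in an HTML string."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.metadata


# Made-up sample page used only for this illustration.
SAMPLE_HTML = """
<html><head>
<meta property="og:title" content="The Rock" />
<meta property="og:description" content="A 1996 action film." />
<meta property="og:image" content="https://example.com/rock.jpg" />
<meta name="viewport" content="width=device-width" />
</head><body></body></html>
"""

if __name__ == "__main__":
    import json
    print(json.dumps(parse_open_graph(SAMPLE_HTML), indent=2))
```

Only tags whose `property` starts with `og:` are kept, so the `viewport` tag in the sample is ignored. A real parser would also need to fetch the page and handle repeated or structured properties such as multiple `og:image` entries.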

https://github.com/senorcris/opengraphjs

Dependencies:

cheerio : 0.19.0
lodash : 3.6.0
request : 2.55.0

Related Projects

micro-open-graph - A tiny Node.js microservice to scrape open graph data with joy.

  •    Javascript

A tiny Node.js microservice to scrape open graph data with joy. Once started, the server listens at localhost:3000.

web-scraper-chrome-extension - Web data extraction tool implemented as chrome extension

  •    Javascript

Web Scraper is a Chrome browser extension built for data extraction from web pages. Using this extension you can create a plan (sitemap) for how a web site should be traversed and what should be extracted. Using these sitemaps, the Web Scraper navigates the site accordingly and extracts all data. Scraped data can later be exported as CSV. When submitting a bug, please attach an exported sitemap if possible.

scrape-it - :crystal_ball: A Node.js scraper for humans.

  •    Javascript

A Node.js scraper for humans. Please post questions on Stack Overflow. You can open issues with questions, as long as you add a link to your Stack Overflow question.

facebook-page-post-scraper - Data scraper for Facebook Pages, and also code accompanying the blog post How to Scrape Data From Facebook Page Posts for Statistical Analysis

  •    Python

UPDATE December 2017: Due to a bug on Facebook's end, using this scraper will only return a very small subset of posts (5-10% of posts) over a limited timeframe. Since Facebook now owns CrowdTangle, the (paid) canonical source of historical Facebook data, Facebook doesn't have an incentive to fix the linked bug. On December 12th, a Facebook engineer commented that they are developing a new endpoint for scraping posts chronologically. I will refactor this script once that happens. Until then, there likely will not be any PRs accepted.

colly - Elegant Scraper and Crawler Framework for Golang

  •    Go

Colly provides a clean interface to write any kind of crawler/scraper/spider. With Colly you can easily extract structured data from websites, which can be used for a wide range of applications, like data mining, data processing or archiving.


scraper - A scraper for EmulationStation written in Go using hashing

  •    Go

An auto-scraper for EmulationStation written in Go using hashes. This currently works with NES, SNES, N64, GB, GBC, GBA, MD, SMS, 32X, GG, PCE, A2600, LNX, MAME/FBA (see below), Dreamcast (bin/gdi), PSX (bin/cue), ScummVM, SegaCD, WonderSwan, and WonderSwan Color ROMs. The script works by crawling a directory of ROM files looking for known extensions. When it finds a file, it hashes the ROM data minus any headers or special file formatting, with the goal of hashing only the data pulled from the original game. It compares this hash to a DB I've compiled to look up the correct game in theGamesDB.net. It downloads the metadata and builds the gamelist.xml file.
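The hash-then-lookup step described above might look roughly like the following sketch. It is an illustration, not the project's code: it assumes SHA-1 as the hash function, handles only the 16-byte iNES header used by NES ROM files, and uses a plain dictionary in place of the compiled database. All function names and the game ID are hypothetical.

```python
import hashlib

INES_MAGIC = b"NES\x1a"   # iNES files start with these four bytes
INES_HEADER_SIZE = 16     # the iNES header is 16 bytes long


def strip_header(data):
    """Drop a leading iNES header if present; other data passes through."""
    if data.startswith(INES_MAGIC):
        return data[INES_HEADER_SIZE:]
    return data


def rom_hash(data):
    """Hash only the ROM payload (SHA-1 is an assumption of this sketch)."""
    return hashlib.sha1(strip_header(data)).hexdigest()


def lookup_game(data, hash_db):
    """Return the game ID for this ROM, or None if the hash is unknown."""
    return hash_db.get(rom_hash(data))


if __name__ == "__main__":
    # Fake ROM payload and header, made up for the example.
    payload = b"\x00\x01\x02\x03" * 4
    header = INES_MAGIC + bytes(12)
    hash_db = {hashlib.sha1(payload).hexdigest(): "hypothetical-game-id"}

    # The headered and headerless copies resolve to the same entry.
    print(lookup_game(header + payload, hash_db))
    print(lookup_game(payload, hash_db))
```

Hashing the payload rather than the whole file means that two dumps of the same game with different headers still map to one database entry.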

app-store-scraper - scrape data from the itunes app store

  •    Javascript

Node.js module to scrape application data from the iTunes/Mac App Store. The goal is to provide an interface as close as possible to the google-play-scraper module.

vue-meta - Manage page meta info in Vue 2.0 components. SSR + Streaming supported.

  •    Javascript

vue-meta is a Vue 2.0 plugin that allows you to manage your app's meta information, much like react-helmet does for React. However, instead of setting your data as props passed to a proprietary component, you simply export it as part of your component's data using the metaInfo property. These properties, when set on a deeply nested component, will cleverly overwrite their parent components' metaInfo, thereby enabling custom info for each top-level view as well as coupling meta info directly to deeply nested subcomponents for more maintainable code.
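The override behavior described above can be pictured with a small sketch. This is not vue-meta's actual merge logic, which is more involved; it only shows the general idea of deeper components winning on conflicting keys. The function `resolve_meta_info` and the sample component data are hypothetical.

```python
def resolve_meta_info(component_chain):
    """Merge metaInfo dicts from the outermost component to the innermost."""
    resolved = {}
    for meta_info in component_chain:
        # Later (more deeply nested) components win on conflicting keys.
        resolved.update(meta_info)
    return resolved


if __name__ == "__main__":
    app = {"title": "My Site", "description": "Default description"}
    article_view = {"title": "Article"}
    comment_widget = {"description": "Comments on the article"}

    print(resolve_meta_info([app, article_view, comment_widget]))
```

Keys that a nested component does not set fall through from its ancestors, so a top-level default stays in effect until something deeper replaces it.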

django-meta - Pluggable app to allow Django developers to quickly add meta tags and OpenGraph, Twitter, and Google Plus properties to their HTML responses

  •    Python

This pluggable app allows Django developers to quickly add meta tags and OpenGraph, Twitter, and Google Plus properties to their HTML responses. django-meta is now maintained by Nephila on GitHub. The old Bitbucket repository won't be updated anymore.

HTML Scraper

  •    Java

The HTML Scraper is a utility written in Java which acts as a 'screen scraper' for HTML pages.

scraperjs - A complete and versatile web scraper.

  •    Javascript

Scraperjs is a web scraper module that makes scraping the web an easy job. Try to spot the differences.

colly - Fast and Elegant Scraping Framework for Gophers

  •    Go

Colly provides a clean interface to write any kind of crawler/scraper/spider. With Colly you can easily extract structured data from websites, which can be used for a wide range of applications, like data mining, data processing or archiving.

ssu - Server-Side Uploader, the data aggregation engine.

  •    Javascript

SSU is a scripted web site navigator & scraper. It was originally designed and conceived as part of Wesabe's infrastructure and has since been open-sourced. Its original design goal was to extract OFX data given bank usernames and passwords for use on wesabe.com. The system it uses to get this data is XulRunner, a project from Mozilla that provides a customizable (and scriptable) browser. SSU has scripts for each financial institution it supports that describes how to log in and download data from that institution's web site.

meta-tags - Search Engine Optimization (SEO) for Ruby on Rails applications.

  •    Ruby

Search Engine Optimization (SEO) plugin for Ruby on Rails applications. MetaTags master branch fully supports Ruby on Rails 4.2+, and is tested against all major Rails releases up to 5.1.

x-ray - The next web scraper. See through the <html> noise.

  •    Javascript

Flexible schema: Supports strings, arrays, arrays of objects, and nested object structures. The schema is not tied to the structure of the page you're scraping, allowing you to pull the data in the structure of your choosing.

market_bot - Google Play Android App store scraper

  •    Ruby

Market Bot is a web scraper (web robot, web spider) for the Google Play Android app store. It can collect data on apps, charts, and developers. Google has recently changed the HTML and CSS for the Play Store. This has caused the release version of Market Bot to break. New code is in the master branch (unreleased) to begin fixing this problem. If you are interested in helping, please join the discussion in issue 72.

aso - Tools for app store optimization on iTunes and Google Play

  •    Javascript

This Node.js library provides a set of functions to aid App Store Optimization of applications in iTunes and Google Play. The functions use either google-play-scraper or app-store-scraper to gather data, so bear in mind a lot of requests are performed under the hood and you may hit throttling limits when making too many calls in a short period of time.

world_cup_json - Rails backend for a scraper that outputs World Cup data as JSON

  •    Ruby

This should now be working for the World Cup in 2018! It should have all events, goals, and match stats streaming live; please file an issue or hit me up on Twitter @mutualarising if anything has gone awry. Note: FIFA is now using much more JS than they were 4 years ago to hide and show information. I'll try to make sure, as the tournament goes on, that things like penalties are showing up correctly. As always, this runs on a scraper. Changes to the HTML structure or a ban on the IP address it is scraping from could break it at any time. PRs welcome.