
DOMHarvest

Playwright-powered web scraping

Extract DOM elements with precision and speed

Quick Example

javascript
import { harvest } from 'domharvest-playwright'

// Extract quotes from quotes.toscrape.com (a site designed for scraping practice)
const quotes = await harvest(
  'https://quotes.toscrape.com/',
  '.quote',
  (el) => ({
    text: el.querySelector('.text')?.textContent?.trim(),
    author: el.querySelector('.author')?.textContent?.trim(),
    tags: Array.from(el.querySelectorAll('.tag')).map(tag => tag.textContent?.trim())
  })
)

console.log(quotes)
// Output: an array of 10 quote objects, each with text, author, and tags
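
The extractor passed to harvest above is a plain function of a single element, so its logic can be checked without launching a browser. The sketch below is only an illustration: stubElement is a hypothetical hand-rolled helper (not part of DOMHarvest or Playwright), the sample strings are made up, and it assumes the extractor relies on nothing beyond querySelector, querySelectorAll, and textContent.

```javascript
// Hypothetical stub: just enough of the Element interface for the extractor.
// It is NOT part of DOMHarvest or Playwright; it exists only for this sketch.
function stubElement (fields) {
  const node = (text) => ({ textContent: text })
  return {
    // Return a node for a known selector, or null like the real DOM does.
    querySelector: (sel) => (sel in fields.single ? node(fields.single[sel]) : null),
    // Return one node per listed value, or an empty array for unknown selectors.
    querySelectorAll: (sel) => (fields.multi[sel] ?? []).map(node)
  }
}

// Same extractor as in the Quick Example, pulled out as a named function.
const extractQuote = (el) => ({
  text: el.querySelector('.text')?.textContent?.trim(),
  author: el.querySelector('.author')?.textContent?.trim(),
  tags: Array.from(el.querySelectorAll('.tag')).map(tag => tag.textContent?.trim())
})

// An element with every field present (made-up sample data).
const complete = extractQuote(stubElement({
  single: { '.text': '  Sample quote.  ', '.author': 'Sample Author' },
  multi: { '.tag': ['life', ' books '] }
}))

// An element with no author and no tags.
const partial = extractQuote(stubElement({
  single: { '.text': 'No author here' },
  multi: {}
}))

console.log(complete)
console.log(partial)
```

Because of the optional chaining (?.), a missing .author node yields undefined instead of throwing, and a missing set of .tag nodes yields an empty array, so one malformed element should not abort the whole extraction.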

Why DOMHarvest?

DOMHarvest makes web scraping simple and reliable by leveraging Playwright's battle-tested browser automation. Whether you're building a data pipeline, monitoring websites, or extracting content for analysis, DOMHarvest provides the tools you need with minimal setup.

Features at a Glance

  • Easy to use: Simple API for common scraping tasks
  • Powerful: Access to full Playwright capabilities when needed
  • Flexible: Support for both simple selectors and custom extraction logic
  • Standard-compliant: Code follows JavaScript Standard Style
  • Well documented: Comprehensive guides and API documentation

Released under the MIT License.