Extracting the data

How is data extracted?

When a monitor page is loaded, Monitoro uses the selection you defined when creating the monitor, and waits for all your selections to be present before extracting data.

For each item in the selection, we use its CSS selector to locate the corresponding part in the page, and then extract:

  • The text content

  • The HTML

  • The HTML attributes, parsed as an array of name and value pairs (for example, the src attribute of an image or the href attribute of a link).
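To make the three extracted pieces concrete, here is a minimal Python sketch that derives them from the HTML of a single element. It is only an illustration and not Monitoro's actual implementation: the ElementSummary class and the summarize function are hypothetical names, the element is assumed to have already been located by its CSS selector, and only the standard library is used.

```python
from html.parser import HTMLParser


class ElementSummary(HTMLParser):
    """Collects the text and the attributes of the outermost element."""

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.attributes = None  # set by the first start tag only

    def handle_starttag(self, tag, attrs):
        # Keep the attributes of the outermost element, ignore nested tags.
        if self.attributes is None:
            self.attributes = [{"name": n, "value": v} for n, v in attrs]

    def handle_data(self, data):
        self.text_parts.append(data)


def summarize(selector, element_html):
    """Build an entry with the same fields as the extracted data."""
    parser = ElementSummary()
    parser.feed(element_html)
    parser.close()
    return {
        "html": element_html,
        "text": "".join(parser.text_parts).strip(),
        "selector": selector,
        "attributes": parser.attributes or [],
    }


# Hypothetical element, assumed to be already located with "a.buy".
entry = summarize("a.buy", '<a href="/cart" class="buy">Buy now</a>')
print(entry["text"])
print(entry["attributes"])
```

The attributes come out as a list of name and value pairs, which is the shape described in the list above.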

What does extracted data look like?

The data is a JSON object that contains the monitored URL, and an entry for each selection containing the original selector, as well as the text, HTML and HTML attributes.

A simple monitor that watches the price of a product on an e-commerce website could produce extracted data like this:

{
  "url": "https://www.monitored-website.com/some/page",
  "entries": {
    "title": {
      "html": "<h1>VR Headset</h1>",
      "text": "VR Headset",
      "selector": "h1",
      "attributes": []
    },
    "price": {
      "html": "<span id=\"price-value\" class=\"featured\">45 €</span>",
      "text": "45 €",
      "selector": "#price-value",
      "attributes": [
        { "name": "id", "value": "price-value" },
        { "name": "class", "value": "featured" }
      ]
    }
  }
}
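If you receive this data in your own code, for example in a webhook handler, reading it could look like the sketch below. The field names (url, entries, text, attributes) come from the example above. The parse_price and attribute helpers are hypothetical, and the price format is an assumption that depends on the website you monitor.

```python
import json
import re

# Sample payload with the same shape as the extracted data shown above.
raw = r'''
{
  "url": "https://www.monitored-website.com/some/page",
  "entries": {
    "price": {
      "html": "<span id=\"price-value\" class=\"featured\">45 €</span>",
      "text": "45 €",
      "selector": "#price-value",
      "attributes": [
        { "name": "id", "value": "price-value" },
        { "name": "class", "value": "featured" }
      ]
    }
  }
}
'''
payload = json.loads(raw)


def parse_price(text):
    """Return the first number found in the text as a float, or None."""
    match = re.search(r"\d+(?:[.,]\d+)?", text)
    if match is None:
        return None
    return float(match.group().replace(",", "."))


def attribute(entry, name):
    """Return the value of the named attribute, or None if it is absent."""
    for attr in entry["attributes"]:
        if attr["name"] == name:
            return attr["value"]
    return None


price_entry = payload["entries"]["price"]
print(parse_price(price_entry["text"]))
print(attribute(price_entry, "class"))
```

Working from the text field keeps the parsing independent of the markup, while the attributes list is useful for values that never appear as visible text, such as an image src or a link href.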

What is a CSS selector?

Selectors are at the core of scraping. A CSS selector is a pattern that locates an element in a webpage.

A selection is a list of data properties to extract from the page, and the CSS selectors that correspond to them. You can find more information here.

Why do I need a selector?

Monitors use unique selectors to locate the parts of interest in the webpage. This is how a monitor is able to extract data, and know when to trigger your webhook.

How to get selectors?

For the most common cases, our Companion browser extension should help you extract selectors just by clicking on elements on the webpage.

If you have more specific requirements, you can leverage the developer tools in Google Chrome to do that. You can get started here.

Can a selector stop working?

Selectors can be unreliable in the following cases:

  • If the website makes heavy use of random IDs and classes

  • If the website changes its structure often

When this happens, your monitor will extract incorrect data, or will stop working if it can no longer find the selectors. You can investigate the issue by editing the monitor and extracting the data again, adjusting the selectors if necessary.
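On the receiving side, a small sanity check can help you notice when a selector has started returning the wrong part of the page. The sketch below is only one possible approach and is not a Monitoro feature: the EXPECTED_FORMATS table and the suspicious_entries function are hypothetical, and the patterns assume a title and a price in euros like the example earlier on this page.

```python
import re

# Hypothetical expectations: what the text of each entry should look like.
EXPECTED_FORMATS = {
    "title": re.compile(r"\S"),                      # any non-empty text
    "price": re.compile(r"^\d+(?:[.,]\d+)?\s*€$"),   # e.g. "45 €" or "12,50€"
}


def suspicious_entries(payload, expected=EXPECTED_FORMATS):
    """Return the names of entries that are missing or look wrong."""
    problems = []
    entries = payload.get("entries", {})
    for name, pattern in expected.items():
        text = entries.get(name, {}).get("text", "")
        if not pattern.search(text.strip()):
            problems.append(name)
    return problems


good = {"entries": {"title": {"text": "VR Headset"},
                    "price": {"text": "45 €"}}}
broken = {"entries": {"title": {"text": "VR Headset"},
                      "price": {"text": "Add to cart"}}}

print(suspicious_entries(good))
print(suspicious_entries(broken))
```

An entry flagged by a check like this is a good reason to edit the monitor and extract the data again.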