Caution | |
---|---|
The module’s dialog field names that are displayed in bold (in the Boost.space Integrator scenario, not in this documentation article) are mandatory! |
Retrieves the desired elements from HTML code.
Continue the execution of the route even if the module returns no results |
If enabled, the scenario will not be stopped by this module. |
Element type |
Select the type of element you want to retrieve from the HTML code such as image, link, or iframe element(s). |
HTML |
Enter the HTML code you want to retrieve the specified element types from. |
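As a rough illustration of what this kind of extraction involves, here is a minimal Python sketch built on the standard library's `html.parser`. It is not the module's implementation: the `ELEMENT_TYPES` mapping, the `ElementCollector` class, and the `get_elements` function are hypothetical names, and returning only the `src` or `href` attribute is an assumption made for this example.

```python
from html.parser import HTMLParser

# Hypothetical mapping from an "Element type" choice to (tag, attribute).
ELEMENT_TYPES = {
    "image": ("img", "src"),
    "link": ("a", "href"),
    "iframe": ("iframe", "src"),
}


class ElementCollector(HTMLParser):
    """Collects one attribute value from every tag of the chosen type."""

    def __init__(self, element_type):
        super().__init__()
        self.tag, self.attr = ELEMENT_TYPES[element_type]
        self.found = []

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased; attrs is a list of (name, value) pairs.
        if tag == self.tag:
            value = dict(attrs).get(self.attr)
            if value is not None:
                self.found.append(value)


def get_elements(html, element_type):
    """Return the attribute values found, or an empty list if none match."""
    collector = ElementCollector(element_type)
    collector.feed(html)
    collector.close()
    return collector.found


sample = (
    '<p><a href="https://example.com">Home</a> '
    '<img src="/logo.png"> <a href="/docs">Docs</a></p>'
)
links = get_elements(sample, "link")
images = get_elements(sample, "image")
```

An empty list here corresponds to the "no results" case that the first option in the table above is concerned with.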
The Match pattern module enables you to find and extract string elements matching a search pattern from a given text. The search pattern is a regular expression (aka regex or regexp), which is a sequence of characters in which each character is either a metacharacter, having a special meaning, or a regular character that has a literal meaning.
- The complete list of metacharacters can be found on the MDN web docs website.
- For a tutorial on how to create regular expressions, we recommend the RegexOne website.
- For an easy, quick regex generator, try the Regular Expressions generator.
- For experimenting with regular expressions, we recommend the regular expressions 101 website. Just make sure to select the ECMAScript (JavaScript) flavor in the left panel.
Pattern |
Enter the regular expression pattern. For example,
|
Global match |
If enabled, the module retrieves all matches in the text. If disabled, the module retrieves only the first match. |
Case sensitive |
If enabled, the search is case sensitive (this is the default). Disable this option to make the search case insensitive. |
Multiline |
If checked, beginning and end metacharacters ( |
Continue the execution of the route even if the module returns no results |
If enabled, the scenario will not be stopped by this module. |
Text |
Enter the text you want to match against the pattern. |
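To make the effect of these options concrete, the following Python sketch imitates them with the standard `re` module. It rests on several assumptions: the module itself uses the ECMAScript flavor, which differs from Python's in some details; `match_pattern` and its parameter names are hypothetical; and the Multiline option is taken to affect the `^` and `$` metacharacters, which is the usual meaning of a multiline flag.

```python
import re


def match_pattern(pattern, text, global_match=True,
                  case_sensitive=True, multiline=False):
    """Return the matched strings; only the first one if global_match is off."""
    flags = 0
    if not case_sensitive:
        flags |= re.IGNORECASE
    if multiline:
        # Assumption: ^ and $ then match at every line break as well.
        flags |= re.MULTILINE
    matches = [m.group(0) for m in re.finditer(pattern, text, flags)]
    return matches if global_match else matches[:1]


text = "Order A-17 shipped\norder b-204 pending"

# "[A-Za-z]" and "\d+" are metacharacter constructs; "-" is a literal here.
all_ids = match_pattern(r"[A-Za-z]-\d+", text)
first_id = match_pattern(r"[A-Za-z]-\d+", text, global_match=False)

# With multiline on, "^" also matches at the start of the second line.
line_starts = match_pattern(r"^order", text, case_sensitive=False, multiline=True)
text_start_only = match_pattern(r"^order", text, case_sensitive=False)

# Case sensitive and single-line: the text starts with "Order", so no match.
no_results = match_pattern(r"^order", text)
```

In this sketch, `all_ids` holds both identifiers, `first_id` only the first, and `no_results` is an empty list, which is the situation the "Continue the execution of the route" option addresses.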
Searches the entered text for a specified value or regular expression, and replaces the result with the new value.
Pattern |
Enter the search term. You can also use a regular expression. For more details about regular expressions, refer to the Match pattern module. |
New value |
Enter a value that will replace the search term. |
Global Match |
If this option is enabled, the module will find all matches rather than stopping after the first match. Each match will be output in a separate bundle. |
Case sensitive |
If this option is enabled, the search will be case sensitive. |
Multiline |
If checked, the beginning and end metacharacters ( |
Text |
Enter the text to be searched. |
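The same options can be sketched for replacement using Python's `re.sub`. This is an illustration under assumptions, not the module's actual behavior: `replace_text` is a hypothetical helper, the new value is inserted literally (backreference syntax differs between regex flavors, so it is deliberately left out), and the result is returned as a single string without modeling bundles.

```python
import re


def replace_text(pattern, new_value, text, global_match=True,
                 case_sensitive=True, multiline=False):
    """Replace matches of pattern in text with new_value, inserted literally."""
    flags = 0
    if not case_sensitive:
        flags |= re.IGNORECASE
    if multiline:
        flags |= re.MULTILINE
    # In re.sub, count=0 replaces every match and count=1 only the first.
    count = 0 if global_match else 1
    # A function replacement keeps new_value free of escape processing.
    return re.sub(pattern, lambda _match: new_value, text,
                  count=count, flags=flags)


every_cat = replace_text("cat", "dog", "cat Cat cat")
first_cat = replace_text("cat", "dog", "cat Cat cat", global_match=False)
any_case = replace_text("cat", "dog", "cat Cat cat", case_sensitive=False)
bullets = replace_text(r"^- ", "* ", "- a\n- b", multiline=True)
```

With these inputs, `every_cat` leaves the capitalized `Cat` untouched, `any_case` replaces all three words, and `bullets` rewrites the marker at the start of both lines.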
Data scraping, sometimes called web scraping, data extraction, or web harvesting, is the process of collecting data from websites and storing it in a local database or spreadsheet. If you wish to scrape data from a website and are not familiar with regular expressions, you can use a data scraping tool:
- Apify is an excellent tool, and we already have it integrated.
If the data scraping tool provides a REST API, you can connect to it via our universal HTTP and Webhooks modules. You can also implement an app on your own using the Boost.space Integrator App SDK.