Use Web Parser to extract data from web pages

If you want to extract data from an article or blog post, you can use Web Parser by Zapier. Web Parser can extract data such as the page title, content, author, and lead image. It uses Postlight’s parser to parse web pages.

Choose your app and event

  • In the Zap editor, click the Action step, or click the plus + icon to add an action to your Zap.
  • Search for and select Web Parser by Zapier.
  • Click the Event dropdown menu and select Parse Webpage.
  • Click Continue.

Set up your web parser action

  • In the URL to parse field, enter the URL of the article you wish to extract content from.
  • Click the Content type dropdown menu and select your content format:
    • HTML: formatted text that uses the HTML markup language. This is the default.
    • Markdown: formatted text that uses the Markdown markup language.
    • Plain text: unformatted text.
  • Optional: Click the Continue on failure dropdown menu.
    • If you select True, the Web Parser step will always succeed, even if it can’t parse the web page.
    • If you select False, this step will halt if the web page can’t be parsed. If a later step uses data from the Web Parser step, the later step won’t run.
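The two Continue on failure settings can be pictured with a small sketch. This is only an illustrative model, not Zapier's implementation: `run_parse_step`, `StepHalted`, and `failing_parser` are hypothetical names, and returning `None` on failure is an assumption made for the example.

```python
from typing import Callable, Optional


class StepHalted(Exception):
    """Raised when the parse step fails and later steps should not run."""


def run_parse_step(
    parse: Callable[[str], dict],
    url: str,
    *,
    continue_on_failure: bool,
) -> Optional[dict]:
    """Model the setting: succeed with no data if True, halt if False."""
    try:
        return parse(url)
    except Exception as exc:
        if continue_on_failure:
            # The step counts as a success, but there is no parsed data.
            return None
        # The step halts, so steps that depend on its data never run.
        raise StepHalted(f"Could not parse {url}") from exc


def failing_parser(url: str) -> dict:
    # Hypothetical stand-in for a parser that cannot handle the page.
    raise ValueError("unparseable page")
```

With `continue_on_failure=True`, later steps still run, so they need to cope with missing data. With `False`, the failure stops the run at this step.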

Test your web parser action

If the URL you provided is valid, the step returns data for each of the following fields that exists on the page:

  • Title: the name or headline of the page
  • Lead image URL
  • Author
  • Content: the main body of the page
  • Date published
  • Dek: the page subheading
  • Next page URL 
  • Excerpt
Note
  • Not all web pages contain data for every field listed above. If no data is found for a field, the step returns a null value for that field.
  • If you see an error, Web Parser was unable to parse the web page.
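Because any field can come back as null, code that consumes the parsed data may want to supply fallbacks. The following is a minimal sketch, assuming the fields arrive as a dictionary. The key names in `FIELDS` and the helper `fill_missing` are hypothetical, so check the actual field names in your Zap's test output.

```python
from typing import Dict, Optional

# Assumed key names, for illustration only.
FIELDS = [
    "title",
    "lead_image_url",
    "author",
    "content",
    "date_published",
    "dek",
    "next_page_url",
    "excerpt",
]


def fill_missing(parsed: Dict[str, Optional[str]], default: str = "") -> Dict[str, str]:
    """Return every expected field, replacing absent or null values with a default."""
    result = {}
    for field in FIELDS:
        value = parsed.get(field)  # None if the key is absent or its value is null
        result[field] = default if value is None else value
    return result
```

For example, `fill_missing({"title": "Hello", "author": None})` keeps the title and sets every other field, including `author`, to an empty string.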
Tip

You can add a Formatter step to transform text to other formats. 
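To illustrate the kind of transformation meant here, the sketch below converts an HTML fragment to plain text using Python's standard library. It is not how the Formatter step works internally. `TextExtractor` and `html_to_text` are hypothetical names chosen for this example.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect only the text nodes of an HTML fragment."""

    def __init__(self) -> None:
        super().__init__()
        self.parts = []

    def handle_data(self, data: str) -> None:
        # Called for text between tags; character references are already decoded.
        self.parts.append(data)


def html_to_text(html: str) -> str:
    """Drop all tags and collapse runs of whitespace into single spaces."""
    extractor = TextExtractor()
    extractor.feed(html)
    extractor.close()
    return " ".join("".join(extractor.parts).split())
```

A call such as `html_to_text("<p>Hello <b>world</b></p>")` yields `"Hello world"`. Text in adjacent elements that have no whitespace between them runs together, so a real conversion would need more care.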

Once you’ve finished your Web Parser by Zapier action, you can continue setting up the rest of your Zap.
