Chrome Puppeteer
Chrome Puppeteer lets you run and control a browser in the cloud. You can do anything a user can do manually.
Great for taking screenshots, scraping data, and automating manual testing.
The Chrome Puppeteer Documentation gives you full API docs for what's possible.
Load a website
Start with a lambda that runs Chrome and loads a website.
Exercise
Move into exercise-4 from the serverless-workshop-exercises GitHub repository.
I've preconfigured it with necessary dependencies:
- aws-lambda
- chrome-aws-lambda
- puppeteer@3.1.0
Tell serverless to avoid packaging the entire browser.
# serverless.yml
package:
  exclude:
    - node_modules/puppeteer/.local-chromium/**
Add your screenshot function to serverless.yml. Use a GET.
Make sure to specify a large memorySize: (2536 MB is good) and a long timeout: (30 seconds is the max). This gives Chrome room to breathe :)
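As a rough sketch, the function entry could look like the fragment below. The function name, handler path, and URL path are placeholders chosen for illustration, not names taken from the exercise repository.

```yaml
# serverless.yml
functions:
  screenshot:                      # placeholder function name
    handler: src/screenshot.handler # placeholder path, adjust to your build output
    memorySize: 2536               # MB, Chrome needs plenty of memory
    timeout: 30                    # seconds
    events:
      - http:
          path: screenshot         # placeholder URL path
          method: GET
```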
Use the getChrome() method from src/utils to instantiate your browser.
Open a new tab and load a page:
const page = await browser.newPage()
await page.goto(<your url>, {
  waitUntil: ["domcontentloaded", "networkidle2"],
})
Grab the first H1 element and return its value.
const h1value = await page.$eval("h1", (el) => el.innerHTML);
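Putting these steps together, a handler might look roughly like the sketch below. It assumes getChrome() is async and resolves to a Puppeteer browser, which the exercise does not spell out. createHandler and its arguments are hypothetical names, used so the snippet runs without any imports.

```javascript
// Hypothetical factory: getChrome is passed in so the sketch needs no imports.
// In the exercise, pass the getChrome() helper from src/utils.
function createHandler(getChrome, url) {
  return async function handler() {
    // Assumption: getChrome() resolves to a Puppeteer Browser instance.
    const browser = await getChrome();
    try {
      const page = await browser.newPage();
      await page.goto(url, {
        waitUntil: ["domcontentloaded", "networkidle2"],
      });
      // $eval runs the callback in the page against the first matching element.
      const h1value = await page.$eval("h1", (el) => el.innerHTML);
      return { statusCode: 200, body: h1value };
    } finally {
      // Assumes a fresh browser per invocation; close it even if a step throws.
      await browser.close();
    }
  };
}
```

Closing the browser in a finally block keeps a failed page load from leaving Chrome running for the rest of the invocation.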
Try getting the URL from query params :)
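One possible way to do this, assuming the standard API Gateway proxy event where query params arrive in event.queryStringParameters. The helper name getTargetUrl, the url parameter name, and the fallback address are all made up for this sketch.

```javascript
// Hypothetical helper: pick the page to load from an API Gateway proxy event.
function getTargetUrl(event, fallback = "https://example.com") {
  // queryStringParameters is null when the request has no query string.
  const params = event.queryStringParameters || {};
  const raw = params.url; // "url" is an assumed parameter name
  if (!raw) return fallback;
  try {
    const parsed = new URL(raw);
    // Only let Chrome load ordinary web pages.
    if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
      return fallback;
    }
    return parsed.href;
  } catch (err) {
    // new URL() throws on strings it cannot parse.
    return fallback;
  }
}
```

You would then call page.goto(getTargetUrl(event), ...) instead of hardcoding the address.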
Solution
Take a screenshot
A fun way to use Puppeteer is taking screenshots.
Exercise
Tell your API Gateway it's okay to serve binary files.
# serverless.yml
provider:
  # ...
  apiGateway:
    binaryMediaTypes:
      - "*/*"
Get the first H1 element again and measure its size.
Screenshots work on pixels, not the DOM. If you pick a large element like body, you might run into problems with screenshots being too large for Puppeteer to handle.
const element = await page.$("h1")
const boundingBox = await element.boundingBox()
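Note that boundingBox() resolves to null when the element is not visible, so it can help to check the result before using it as a clip. The sketch below does that and optionally pads the box. toClip and its padding argument are hypothetical, not part of Puppeteer or the exercise.

```javascript
// Hypothetical helper: turn element.boundingBox() output into a screenshot clip.
function toClip(boundingBox, padding = 0) {
  // boundingBox() gives null for invisible elements; also skip empty boxes.
  if (!boundingBox || boundingBox.width === 0 || boundingBox.height === 0) {
    return null;
  }
  return {
    // Round outward to whole pixels and never start left of or above the page.
    x: Math.max(0, Math.floor(boundingBox.x - padding)),
    y: Math.max(0, Math.floor(boundingBox.y - padding)),
    width: Math.ceil(boundingBox.width + 2 * padding),
    height: Math.ceil(boundingBox.height + 2 * padding),
  };
}
```

A null result is a signal to return an error response instead of attempting the screenshot.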
Take a screenshot with:
const imagePath = `/tmp/screenshot-${new Date().getTime()}.png`
await page.screenshot({
  path: imagePath,
  clip: boundingBox,
})
const data = fs.readFileSync(imagePath).toString("base64")
Serve it back from your lambda with correct image headers and content encoding:
return {
  statusCode: 200,
  headers: {
    "Content-Type": "image/png",
  },
  body: data,
  isBase64Encoded: true,
}
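If you end up returning images from more than one place, the response shape can be wrapped in a small helper. This is only a sketch: pngResponse is a hypothetical name, and it accepts either raw bytes or an already base64-encoded string like the data value from the previous step.

```javascript
// Hypothetical helper: wrap PNG data in an API Gateway proxy response.
function pngResponse(image) {
  // Accept raw bytes (Buffer) or a string that is already base64-encoded.
  const body = Buffer.isBuffer(image) ? image.toString("base64") : image;
  return {
    statusCode: 200,
    headers: { "Content-Type": "image/png" },
    body,
    // Tells API Gateway to decode the body back to binary before sending it.
    isBase64Encoded: true,
  };
}
```

With this in place the last line of the handler becomes return pngResponse(data).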
Try this one in the browser :)