ServerlessReact.dev

Chrome Puppeteer

Chrome Puppeteer lets you run and control a browser in the cloud. You can do anything a user can do manually.

Great for taking screenshots, scraping data, and automating manual testing.

The Chrome Puppeteer Documentation gives you full API docs for what's possible.

Load a website

Start with a lambda that runs Chrome and loads a website.

Exercise

Move into exercise-4 from the serverless-workshop-exercises GitHub repository.

I've preconfigured it with the necessary dependencies.

Tell serverless to avoid packaging the entire browser.

# serverless.yml
package:
  exclude:
    - node_modules/puppeteer/.local-chromium/**

Add your screenshot function to serverless.yml. Use a GET.

Make sure to specify a large memorySize: (2536 is good) and a long timeout: (30 is max). This gives Chrome room to breathe :)
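
Put together, the function entry could look roughly like this. It is a sketch, not the workshop's solution: the function name `screenshot`, the handler path `dist/screenshot.handler`, and the URL path are placeholders you should adapt to the exercise project.

```yaml
# serverless.yml (sketch; names and paths below are assumptions)
functions:
  screenshot:
    handler: dist/screenshot.handler # hypothetical path to your compiled handler
    memorySize: 2536 # plenty of RAM for Chrome
    timeout: 30 # seconds
    events:
      - http:
          path: screenshot
          method: GET
```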

Use the getChrome() method from src/utils to instantiate your browser.

Open a new tab and load a page:

const page = await browser.newPage()
await page.goto(<your url>, {
  waitUntil: ["domcontentloaded", "networkidle2"],
})

Grab the first H1 element and return its value.

const h1value = await page.$eval("h1", (el) => el.innerHTML);
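
The steps so far can be wired into one handler. The sketch below assumes `getChrome()` resolves to a regular Puppeteer `Browser`; it is passed in as an argument only to keep the example self-contained. `makeHandler` and the response shape are illustrative names, not part of the workshop code.

```javascript
// Sketch of the exercise handler. getChrome is injected so the example
// stays self-contained; in the exercise it comes from src/utils.
function makeHandler(getChrome, url) {
  return async function handler() {
    const browser = await getChrome();
    try {
      // open a tab and wait until the page has mostly stopped loading
      const page = await browser.newPage();
      await page.goto(url, {
        waitUntil: ["domcontentloaded", "networkidle2"],
      });

      // read the contents of the first H1 on the page
      const h1value = await page.$eval("h1", (el) => el.innerHTML);
      return { statusCode: 200, body: JSON.stringify({ h1: h1value }) };
    } catch (error) {
      // page.$eval rejects when no element matches the selector
      return {
        statusCode: 500,
        body: JSON.stringify({ error: error.message }),
      };
    } finally {
      // always shut Chrome down so the lambda does not leak processes
      await browser.close();
    }
  };
}
```

In the exercise you would export something like `makeHandler(getChrome, "https://example.com")` as your handler.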

Try getting the URL from query params :)
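
One way to do that, as a minimal sketch: read the URL from the API Gateway event and fall back to a default. The `url` parameter name, `DEFAULT_URL`, and `getTargetUrl` are assumptions made up for this example.

```javascript
// Sketch: pick the target URL from an API Gateway proxy event.
// DEFAULT_URL and the "url" parameter name are assumptions for illustration.
const DEFAULT_URL = "https://example.com";

function getTargetUrl(event) {
  // queryStringParameters is null when the request has no query string
  const params = (event && event.queryStringParameters) || {};
  const candidate = params.url || DEFAULT_URL;

  // new URL() throws on malformed input; only allow regular web pages
  const parsed = new URL(candidate);
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    throw new Error(`Unsupported protocol: ${parsed.protocol}`);
  }
  return parsed.href;
}
```

Checking the protocol keeps callers from pointing your browser at `file:` URLs or other things you did not intend to expose.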

Try your function

Leave the payload empty for GET requests; use valid JSON for POST.

Solution

https://github.com/Swizec/serverless-workshop-exercises/commit/0264e4efdc6a647c5690bb23be91f7c27a012a87

Take a screenshot

A fun way to use Puppeteer is taking screenshots.

Exercise

Tell your API Gateway it's okay to serve binary files.

# serverless.yml
provider:
  # ...
  apiGateway:
    binaryMediaTypes:
      - "*/*"

Get the first H1 element again and measure its size.

Screenshots work on pixels, not the DOM. If you pick a large element like body, you might run into problems with screenshots being too large for Puppeteer to handle.

const element = await page.$("h1")
const boundingBox = await element.boundingBox()
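
Both calls can come back empty: `page.$` resolves to `null` when nothing matches, and `boundingBox()` resolves to `null` when the element is not visible. Here is a small guard you could run before taking the screenshot. `toClip` and the `MAX_SIDE` cap are made up for this sketch; the cap is an arbitrary number, not a documented Puppeteer limit.

```javascript
// Sketch: turn an element's bounding box into a safe `clip` value.
// MAX_SIDE is an arbitrary cap chosen for illustration.
const MAX_SIDE = 2000;

function toClip(boundingBox) {
  // boundingBox() resolves to null when the element is not visible
  if (!boundingBox) return null;

  const { x, y, width, height } = boundingBox;
  if (width <= 0 || height <= 0) return null;

  // snap to whole pixels and keep the image from growing too large
  return {
    x: Math.max(0, Math.floor(x)),
    y: Math.max(0, Math.floor(y)),
    width: Math.min(MAX_SIDE, Math.ceil(width)),
    height: Math.min(MAX_SIDE, Math.ceil(height)),
  };
}
```

If `toClip` returns `null`, respond with an error instead of calling `page.screenshot`.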

Take a screenshot with:

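// needs `const fs = require("fs")` (or `import fs from "fs"`) at the top of the file
const imagePath = `/tmp/screenshot-${new Date().getTime()}.png`
await page.screenshot({
  path: imagePath,
  clip: boundingBox,
})
// read the PNG back and base64-encode it for the response body
const data = fs.readFileSync(imagePath).toString("base64")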
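
If you would rather skip the temporary file, Puppeteer's `screenshot` also accepts an `encoding` option. This is an alternative sketch, not what the workshop solution does; `screenshotAsBase64` is a name invented for this example.

```javascript
// Sketch: ask Puppeteer for base64 directly instead of writing to /tmp.
// `page` is a Puppeteer Page; `clip` is the bounding box measured earlier.
async function screenshotAsBase64(page, clip) {
  const data = await page.screenshot({
    type: "png",
    clip,
    encoding: "base64", // resolves to a base64 string instead of a Buffer
  });
  return data;
}
```

The returned string can go straight into the `body` of your response.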

Serve it back from your lambda with the correct image headers and content encoding:

return {
  statusCode: 200,
  headers: {
    "Content-Type": "image/png",
  },
  body: data,
  isBase64Encoded: true,
}
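
The `isBase64Encoded` flag tells API Gateway to decode the body back into bytes before sending it to the client. A small helper can build that response and sanity-check the image first. This is a sketch; `pngResponse` is a hypothetical name and the error response shape is an assumption.

```javascript
// Every PNG file starts with these 8 bytes
const PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

// Sketch: wrap a base64-encoded PNG in an API Gateway proxy response.
function pngResponse(base64Body) {
  // decode only to check the file signature; the body stays base64
  const bytes = Buffer.from(base64Body, "base64");
  if (!bytes.subarray(0, 8).equals(PNG_SIGNATURE)) {
    return {
      statusCode: 500,
      headers: { "Content-Type": "text/plain" },
      body: "Screenshot was not a valid PNG",
    };
  }

  return {
    statusCode: 200,
    headers: { "Content-Type": "image/png" },
    body: base64Body,
    isBase64Encoded: true,
  };
}
```

You would call it as `return pngResponse(data)` at the end of your handler.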

Try this one in the browser :)

Solution

https://github.com/Swizec/serverless-workshop-exercises/commit/15c27d6f491ef0ba6b2015d12085a35b0dacd713

Created by Swizec with ❤️