More than you want to know about headless Chrome

Typically I tend to watch events such as these from a distance, but as fate would have it I’ve been swallowed by a of GitHub and Google.

Target with JavaScript disabled :(

My 15 minutes of fame

This then caught the attention of the fine folks at graph.cool, who reached out to see if I’d care to help them with a project called chromeless, since I was gaining some traction. After some internal moral dilemma I decided to join forces with graph.cool’s chromeless project and begin the process of deprecating navalia (it’s ok, it lived a long life in JavaScript years). I strongly feel that we should group together to make one amazing project versus three or four mediocre ones. Of course, as JavaScript would have it, Google then came out with their puppeteer project. With an API almost identical to chromeless and navalia, we now had a new contender in the headless library arms race.

This gets us to where we are now: two libraries and a few ways to execute them in cloud infrastructure. Let’s take a closer look at both libraries and their distinguishing factors.

Chromeless, which some of you might not be familiar with, is not only a rich API for driving headless Chrome but also comes with a prescription for how to execute headless work in a production/CI environment. Their take is fascinating: instead of running and managing the binary in your own infrastructure, just do it in AWS Lambda.

chromeless does, in my opinion, have a really elegant API.
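To give a feel for that chainable style, here is a minimal sketch modeled on the search example in the chromeless README. A few assumptions to flag: `searchAndCapture` is a name I made up, the `Chromeless` class is passed in as an argument (so the function can be tried with a stub), and the selectors are illustrative and may not match the live page anymore.

```javascript
// Sketch only: `Chromeless` is injected so this runs against a stub or the
// real class from the chromeless package. `searchAndCapture` is hypothetical.
async function searchAndCapture(Chromeless, query, options = {}) {
  // `options` is forwarded untouched to the constructor.
  const chromeless = new Chromeless(options);
  try {
    // Chainable calls; awaiting the chain yields the screenshot's location.
    const screenshotLocation = await chromeless
      .goto('https://www.google.com')
      .type(query, 'input[name="q"]') // selector is an assumption
      .press(13) // 13 is the Enter key code
      .wait('#resultStats') // selector is an assumption
      .screenshot();
    return screenshotLocation;
  } finally {
    // Always end the session, even if a step above throws.
    await chromeless.end();
  }
}
```

In real use you would call it with something like `searchAndCapture(require('chromeless').Chromeless, 'chromeless')`. As far as I know, the README also describes a `remote` option for pointing at the Lambda deployment, which would go in through `options` here.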

To summarize into bullet-points for all you skimmers out there:

Pros

Cons

Puppeteer’s API
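For comparison, here is a rough sketch of the same kind of job in puppeteer's style. It only uses calls I'm sure exist (`launch`, `newPage`, `goto`, `title`, `screenshot`, `close`). The function name `captureTitleAndScreenshot` and the choice to pass the puppeteer module in as an argument are my own assumptions, made so the sketch can be run against a stub.

```javascript
// Sketch only: `puppeteer` is injected rather than required, so the function
// works with the real module or a stand-in. The function name is hypothetical.
async function captureTitleAndScreenshot(puppeteer, url, path) {
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.goto(url);
    const title = await page.title();
    // Writes the image to `path` on the machine running this script.
    await page.screenshot({ path });
    return title;
  } finally {
    // Close the browser even when navigation or the screenshot fails.
    await browser.close();
  }
}
```

With the real library this would be called as `captureTitleAndScreenshot(require('puppeteer'), 'https://example.com', 'example.png')`. Notice that every step is awaited explicitly here rather than chained.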

Puppeteer’s playground

For our skimming friends, here’s the skinny on puppeteer:

Pros

Cons

I hope this has given you some guidance on library decisions, because things are about to get a lot more complicated when we go to ship our code into a production or continuous-integration environment 😢

To get Chrome running on these types of constraints you’ll first have to do the following, or rely on someone else to do it:

And there are still a few more steps that I won’t waste time elaborating on here, especially since they’re well documented in the following places:

If scale and planning are things you’re not certain of, then the AWS approach is a great one. You can run anywhere from 1 to 1,000 invocations without much fuss or change.

Not forgetting our skimmers, here’s the bullets y’all are craving:

Pros:

Cons:

Example dockerfile install of Chrome

The nice thing about running your own Docker container is that you’re free to use whatever hosting provider you like (provided they allow or use Docker) and can scale to the load you need. Of course you lose out on all the other perks that lambdas provide, namely the auto-scale feature, which is a tricky thing to do in standard cloud providers as you’ll have to load-balance not only HTTP requests but WebSocket connections as well. The Docker approach is also perilous as you’ll still run into missing fonts and other drawbacks.
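To make the load-balancing point a little more concrete: a WebSocket session to Chrome stays open for the whole job, so counting requests tells you little about how busy a container is. One common approach (not necessarily what any provider here does) is to route each new session to the container with the fewest open connections. The class and method names below are hypothetical.

```javascript
// Sketch of a least-connections picker for long-lived browser sessions.
// `LeastConnectionsBalancer`, `acquire` and `release` are illustrative names.
class LeastConnectionsBalancer {
  constructor(backends) {
    if (backends.length === 0) throw new Error('at least one backend is required');
    // Map of backend identifier -> number of currently open sessions.
    this.open = new Map(backends.map((backend) => [backend, 0]));
  }

  // Pick the backend with the fewest open sessions (ties go to the earliest one).
  acquire() {
    let best = null;
    for (const [backend, count] of this.open) {
      if (best === null || count < this.open.get(best)) best = backend;
    }
    this.open.set(best, this.open.get(best) + 1);
    return best;
  }

  // Call this when the WebSocket for a session closes.
  release(backend) {
    const count = this.open.get(backend);
    if (count > 0) this.open.set(backend, count - 1);
  }
}
```

A proxy in front of your containers would call `acquire()` when a connection is upgraded and `release()` when it closes. This is the bookkeeping that Lambda-style platforms handle for you.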

I want emojis 🔥

Never realized there were so many square box emojis…

Pros:

Cons:

Looking back on my little gift registry app, neither Docker nor AWS lambdas satisfied my requirements, as I needed features they just didn’t have. Emojis were a must-have, clean isolation was a must as anyone could be using this, and I didn’t want to spend all my time maintaining Chrome in a cloud provider. This is what birthed browserless into the world.

Browserless really sits on top of the Docker way of doing things, but offers some other features as well. It watches Chrome and reboots it when it becomes sluggish, has good support for a variety of languages and emojis, and it works with just about any library out there.
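As a rough sketch of what hooking a library up to a remote Chrome can look like, here is puppeteer's `connect` call pointed at a WebSocket endpoint instead of launching a local binary. The helper name `withRemoteChrome` and the injected `puppeteer` argument are my own assumptions, and the endpoint in the usage note is just a placeholder for wherever your container is listening.

```javascript
// Sketch only: connects to an already-running Chrome over WebSocket rather
// than launching one locally. `withRemoteChrome` is a hypothetical helper.
async function withRemoteChrome(puppeteer, browserWSEndpoint, job) {
  const browser = await puppeteer.connect({ browserWSEndpoint });
  try {
    // `job` receives the connected browser and returns whatever it produces.
    return await job(browser);
  } finally {
    // Close the session when the job finishes, even on failure.
    await browser.close();
  }
}
```

Usage might look like `withRemoteChrome(require('puppeteer'), 'ws://localhost:3000', async (browser) => { const page = await browser.newPage(); /* ... */ })`, where `ws://localhost:3000` is an assumed address. The rest of your puppeteer code stays the same as in the local case.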

🎉 Emojis!

Remember that puppeteer picture? This is what it should have looked like.

This isn’t to say that it doesn’t have its drawbacks. For one, it likely costs a bit more than running a Docker container yourself. If you’re in the tight spot of “I have no idea how much scale I need,” then it’s likely not for you either. There are still challenges debugging browser jobs in remote locations, but those are generally shared amongst all providers.

Pros:

Cons:

Even though I somewhat stumbled into this part of web development, I’m extremely excited by the fruits of labor thus far and look forward to the road ahead. I think there’s still a great deal of knowledge that we’ll have to spread in order to keep the best-practices up to date and moving forward. To that end, I’m excited to announce that I’ll be putting together a website that captures best-practices, cool ideas and recipes for these new libraries, and all the updates in the headless arena. Keep your eye out for its reveal soon.

Finally, I welcome your thoughts, feedback, and comments on any of the above. Let me know if I’m gravely mistaken or if there’s a concern you have with headless browsers that hasn’t been met. Until then, I’ll see you on the internet!
