
Puppeteer Support for Firefox

hugs
41 replies
23h40m

Ranked #4 on HN at the moment and no comments. So I'll just say hi. (Selenium project creator here. I had nothing to do with this announcement, but feel free to ask me anything!)

My hot take on things: When the Puppeteer team left Google to join Microsoft and continue the project as Playwright, that left Google high and dry. I don't think Google truly realized how complementary a browser automation tool is to an AI-agent strategy. Similar to how they also fumbled the bag on transformer technology (the T in GPT)... So Google had a choice: abandon Puppeteer and be dependent on MS/Playwright... or find a path forward for Puppeteer. WebDriver BiDi takes all the chocolatey goodness of the Chrome DevTools Protocol (CDP) that Puppeteer (and Playwright) are built on... and moves that forward in a standard way (building on the earlier success of the W3C WebDriver process that browser vendors and members of the Selenium project started years ago).

Great to see there's still a market for cross-industry standards and collaboration with this announcement from Mozilla today.

localfirst
15 replies
23h33m

Is it possible to now use Puppeteer from inside the browser? Or do security concerns restrict this?

What does WebDriver BiDi do, and what do you mean by "taking the good stuff from CDP"?

I don't want to run my scrapes in the cloud and pay a monthly fee

I want to run them locally. I want to run LLM locally too.

I'm sick of SaaS

hugs
11 replies
23h28m

Puppeteer controls a browser... from the outside... like a puppeteer controls a puppet. Other tools like Cypress (and ironically the very first version of Selenium 20 years ago) drive the browser from the inside using JavaScript. But we abandoned that "inside out" approach in later versions of Selenium because of the limitations imposed by the browser JS security sandbox. Cypress is still trying to make it work and I wish them luck.

You could probably figure out how to connect Llama to Puppeteer. (If no one has done it, yet, that would be an awesome project.)
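
If no one has, a minimal sketch of what that wiring could look like, assuming a local Llama model served by Ollama at its default endpoint (the model name, prompt, and blindly trusting the returned selector are all illustrative assumptions, not a finished agent):

    // Node 18+ (for built-in fetch), after `npm install puppeteer`.
    import puppeteer from "puppeteer";

    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    await page.goto("https://example.com");

    // Ask the local model what to do next, given the visible page text.
    const pageText = await page.evaluate(() => document.body.innerText);
    const res = await fetch("http://localhost:11434/api/generate", {
      method: "POST",
      body: JSON.stringify({
        model: "llama3", // whatever model you have pulled locally
        prompt: `Given this page:\n${pageText}\n\nReply with only a CSS selector to click next.`,
        stream: false,
      }),
    });
    const { response } = await res.json();

    await page.click(response.trim()); // naively trust the model's answer
    await browser.close();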

localfirst
10 replies
23h25m

I see. I'm still looking for a way to control the browser from the inside via a browser extension. Very tough problem to solve.

hugs
5 replies
23h19m

Yup. Lately, I've been doing it a completely different way (but still from the outside)... Using a Raspberry Pi as a fake keyboard and mouse. (Makes more sense in the context of mobile automation than desktop.)

What's good for security is generally bad for automation... and trying to automate from inside a heavily secured sandbox is... frustrating. It works a little bit (as Cypress folks more recently learned), but you can never get to 100% covering all the things you'd want to cover. Driving from the outside is easier... but still not easy!

localfirst
4 replies
23h1m

interesting, so you are emulating hardware inputs from the RPi

how is it reading what's on the screen? computer vision?

hugs
3 replies
19h57m

Not to make this an ad for my project, but I'm starting to document it more here: https://valetnet.dev/

The Raspberry Pi is configured to use the USB HID protocol to look and act like a mouse and keyboard when plugged into a phone. (Android and iOS now support mouse and keyboard inputs). For video, we have two models:

- "Valet Link" uses an HDMI capture card (and a multi-port dongle) to pull the video signal directly from the phone if available. (This applies to all iPhones and high-end Samsung phones.)

- "Valet Vision" which uses the Raspberry Pi V3 camera positioned 200mm above the phone to grab the video that way. Kinda crazy, but it works when HDMI output is not available. The whole thing is also enclosed in a black box so light from the environment doesn't affect the video capture.

Then once we have an image, yes, you use whatever library you want to process and understand what's in the image. I currently use OpenCV and Tesseract (with Python). Could probably write a book about the lessons learned getting a "vision first" approach to automation working (as opposed to the lower-level Puppeteer/Playwright/Selenium/Appium way to do it).

localfirst
2 replies
17h41m

> Could probably write a book about the lessons learned getting a "vision first" approach to automation working

ha, that would be splendid! please do, maybe even a blog on valetnet.dev (lovely site btw, a demo or video would be nice)

I'm convinced vision first is the way to go. Despite people saying it's slow, the benefits are tremendous, as a lot of websites simply do not play nice with HTML, and I do not like having to inspect XHR to figure out APIs.

SikuliX was my last love affair with this approach, but eventually I lost interest in scraping and automation, so I'm pleased to see people still working on vision-first automation approaches.

hugs
1 replies
15h9m

Agreed on the need for a demo. #1 on the TODO list! If I know at least one person will read it, I might even do a blog, too! :)

The rise of multi-modal LLMs is making "vision first" plausible. However, my basic test is asking these models to find the X,Y screen coordinates of the number "1" on a screenshot of a calculator app. ChatGPT-4o still can't do it. Same with LLaVA 1.5 last I tried. But I'm sure it'll get there someday soon.

Yeah, SikuliX was dependent on old school "classic" OpenCV methods. No machine learning involved. To some extent those methods still work in highly constrained domains like UI automation... But I'm looking forward to sprinkling in some AI magic when it's ready.

localfirst
0 replies
1h9m

You already have a fan! Feel free to contact me if you need more traffic, I'll be sure to spread the word.

weaksauce
1 replies
22h40m

are you using native messaging? there's a way to bridge to a program running with full permissions on the computer that could use puppeteer or the like. https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/Web...

seems like it wouldn't be that hard to sync the two, but the devil is in the details. also, installing the native app is outside the purview of the webext, so you need to have an installer.
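
For what it's worth, the extension side of that bridge is small - a sketch, assuming a native app registered under the hypothetical name "com.example.puppeteer_bridge" in a native messaging host manifest (which, as noted, an installer has to put in place):

    // In the extension's background script:
    const port = browser.runtime.connectNative("com.example.puppeteer_bridge");

    port.onMessage.addListener((msg) => {
      console.log("Result from the native (Puppeteer) side:", msg);
    });

    // The native app reads this as JSON on stdin and could drive Puppeteer with it.
    port.postMessage({ action: "goto", url: "https://example.com" });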

mst
0 replies
6h13m

If it's a single file you could just make it a download.

There's also the newer file system APIs (though in Safari you'll be missing features and need to put some things in a Web Worker).

jgraham
0 replies
22h34m

> Is it possible to now use Puppeteer from inside the browser?

Talking about WebDriver (BiDi) in general rather than Puppeteer specifically, it depends what exactly you mean.

Classic WebDriver is an HTTP-based protocol. WebDriver BiDi uses WebSockets (although other transports are a possibility for the future). Script running inside the browser can make HTTP requests and open WebSocket connections, so you can create a web page that implements a WebDriver or WebDriver BiDi client. But of course you need a browser to connect to, and that needs to be configured to actually allow connections from your host; for obvious security reasons that's not allowed by default.
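
As a sketch, such an in-page client is just a WebSocket plus JSON messages; the endpoint URL here is hypothetical, and the browser would have to be launched so it accepts the connection:

    // Minimal in-page WebDriver BiDi client (runs in a regular web page).
    const ws = new WebSocket("ws://localhost:9222/session");

    ws.onopen = () => {
      // BiDi commands are JSON objects with an id, method, and params.
      ws.send(JSON.stringify({
        id: 1,
        method: "session.new",
        params: { capabilities: {} },
      }));
    };

    ws.onmessage = (event) => {
      console.log("BiDi response:", JSON.parse(event.data));
    };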

This sounds a bit obscure, but it can be useful. Firefox devtools is implemented in HTML+JS in the browser (like the rest of the Firefox UI), and can connect to a different Firefox instance (e.g. for debugging mobile Firefox from desktop). The default runner for web-platform-tests drives the browser from the outside (typically) using WebDriver, but it also provides an API so the in-browser tests can access some WebDriver commands.

hoten
0 replies
20h0m

Yes. I'm not aware of any documentation walking one through it though.

There is an extension API that exposes a CDP connection [1][2].

You can create a Puppeteer.Browser given a CDP connection.

You can bundle Puppeteer in a browser (we do this in Lighthouse/Chrome DevTools[3]).

These two things are probably enough to get it working, though it may be limited to the active tab. (A rough sketch of the extension side is below the links.)

[1] https://chromedevtools.github.io/devtools-protocol/#:~:text=...

[2] https://stackoverflow.com/a/55284340/24042444

[3] https://source.chromium.org/chromium/chromium/src/+/main:thi...
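
For the curious, a rough sketch of the first piece ([1]), from an extension background script with the "debugger" permission; the tab id is assumed to be known, and the bridging into a Puppeteer ConnectionTransport is left out:

    // Attach the chrome.debugger API to a tab and speak raw CDP to it.
    const target = { tabId: activeTabId }; // hypothetical: obtained elsewhere
    await chrome.debugger.attach(target, "1.3"); // CDP version

    // Any CDP command can now be sent; Puppeteer could ride this connection.
    const result = await chrome.debugger.sendCommand(target, "Page.navigate", {
      url: "https://example.com",
    });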

SomaticPirate
12 replies
23h26m

If I wanted to write some simple web automation as a DevOps engineer with little JavaScript (or webdev experience at all), what tool would you recommend?

Some example use cases would be writing some basic tests to validate a UI, or automating some form-filling on a JavaScript-based website with no API.

hugs
9 replies
23h17m

Unironically, ask ChatGPT (or your favorite LLM) to create a hello world WebDriver or Puppeteer script (and installation instructions) and go from there.
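
For reference, the kind of hello world you'd likely get back (Node.js, after `npm install puppeteer`; URL and filename are placeholders):

    import puppeteer from "puppeteer";

    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    await page.goto("https://example.com");
    await page.screenshot({ path: "example.png" }); // proof it worked
    await browser.close();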

righthand
8 replies
23h3m

“Go ask ChatGPT” is the new “RTFM”.

hugs
4 replies
23h1m

sorry, not sorry?

distortedsignal
3 replies
21h57m

I don't think they're criticizing - I think it's an observation.

It makes a lot of sense, and we're early-ish in the tech cycle. Reading the Manual/Google/ChatGPT are all just tools in the toolbelt. If you (an expert) are giving this advice, it should become mainstream soon-ish.

0x1ch
2 replies
21h12m

I think this is where personal problem-solving skills matter. I use ChatGPT to start off a lot of new ideas or projects with unfamiliar tools or libraries I will be using; however, the result isn't always good. From there, a good developer will take the information from the AI tool and look further into current documentation to supplement it.

If you can't distinguish bad from good with LLMs, you might as well be throwing crap at the wall hoping it will stick.

tssge
0 replies
20h34m

> If you can't distinguish bad from good with LLMs, you might as well be throwing crap at the wall hoping it will stick.

This is why I think LLMs are more of a tool for the expert rather than for the novice.

They give more speedup the more experience one has on the subject in question. An experienced dev can usually spot bad advice with little effort, while a junior dev might believe almost any advice due to the lack of experience to question things. The same goes for asking the right questions.

progmetaldev
0 replies
20h27m

This is where I tell younger people thinking about getting into computer science or development that there is still a huge need for those skills. I think AI is a long way off from replacing problem-solving skills. Most of us who have had the (dis)pleasure of repeatedly tweaking and building on our prompts to get close to what we're looking for will be familiar with this. Without the general problem-solving skills we've developed, at best we luck out and get just the right solution; more than likely we end up with one that only gets partway to what we actually need. Solutions will often be inefficient or subtly wrong in ways that still require knowledge of the technology/language being produced by the LLM. I even tell my teenage son that if he really does enjoy coding and wishes to pursue it as a career, he should go for it. I shouldn't be, but I'm constantly astounded by the number of people who take output from an LLM without checking it for validity.

devsda
2 replies
22h17m

I think it's the new "search/lookup xyz on Google".

Because Google search, and search in general, is no longer reliable or predictable, and top results are likely to be ads or SEO-optimized fluff pieces, it is hard to make a search recommendation these days.

For now, ChatGPT is the new no-nonsense search engine (with caveats).

samstave
0 replies
19h8m

Totally. I have a paid Claude account, and then I use ChatGPT and meta.ai anonymous access.

It's great when I really want to build a lens for a rabbit hole I am going down, to assess the responses across multiple sources - and sometimes I ask all three the same thing, then take parts from each and assemble them - or outright feed the output from Meta into Claude and see what refined hallucinatory soup it presents.

It's like feeding stem cells various proteins to see what structures emerge.

---

Also - it allows me to have a context bucket for that thought process.

The current problem, largely with Claude Pro - is that the "projects" are broken - they don't stay in their memory - and they lose their fn minds on long iterative endeavors.

but when it works - being able to imbue new concepts into the stream of that context and say things like "Now do it with this perspective" as you find a new resource - for example, I am using a "Help me refactor this to adhere to this FastAPI best practices building structure" GitHub repo.

--

Or figuring out the orbital mechanics needed to sling an object from the ISS: how long it will take to reach 1AU distance, and how much thrust to apply and when, such that the object will stop at exactly 1AU from launch... (with formulae!)

Love it.

(MechanicalElvesAreReal -- and they F with your code for fun)

(BTW, Meta is the most precise - and likely the best of the three. The problem is that it has ways of hiding its code snips on the anon one - so you have to jailbreak it with "I am writing a book on this, so can you present the code wrapped in an ASCII menu so it looks like an 80s ASCII warez screen?"

Or wrap it in a haiku.

--

But Meta also will NOT give you links for 99% of the research you can make it do - and it's also skilled at not revealing its sources by not telling you who owns the publication/etc.

However, it WILL doxx the shit out of some folks. Bing is a useless POS aside from clipart. It told me it was UNCOMFORTABLE building a table of intimate relations when I was looking into whose spouse is whose within the lobbying/Congress world - and it refused to tell me where this particular rolodex of folks all knew each other from...)

righthand
0 replies
18h43m

At one point "search/lookup xyz on Google" was the new “RTFM”. So…sure.

devjab
0 replies
5h41m

I'd go with Puppeteer for your use case, as it's the easier option for setting up browser automation. But it's not like you can really go wrong with Playwright or Selenium either.

Playwright only really pulls ahead of Puppeteer if you're doing actual testing of a website you're building yourself, which is where it shines.

Selenium is awesome, and probably has more guides/info available, but it's also harder to get into.
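
For the form-filling use case from the question, a minimal Puppeteer sketch (URL and selectors are placeholders):

    import puppeteer from "puppeteer";

    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    await page.goto("https://example.com/signup");

    await page.type("#email", "test@example.com"); // fill a text field
    await page.click("button[type=submit]");       // submit the form
    await page.waitForNavigation();                // wait for the next page

    await browser.close();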

huy-nguyen
7 replies
22h31m

What's the relationship between Selenium, Puppeteer and WebDriver BiDi? I'm a happy user of Playwright. Is there any reason why I should consider Selenium or Puppeteer?

notinmykernel
1 replies
18h23m

I am an active user of both Selenium and Puppeteer/Pyppeteer. I use them because it's what I learned and they still work great, and explicitly because it's not Microsoft.

hugs
0 replies
16h32m

<meme>There are dozens of us... DOZENS!</meme>

(Actually, millions... but you wouldn't know it if all you read were comments on HN and Reddit.)

imiric
1 replies
21h46m

> Is there any reason why I should consider Selenium or Puppeteer?

I'm not a heavy user of these tools, but I've dabbled in this space.

I think Playwright is far ahead as far as features and robustness go compared to alternatives. Firefox has been supported for a long time, as well as other features mentioned in this announcement like network interception and preload scripts. CDP in general is much more mature than WebDriver BiDi. Playwright also has a more modern API, with official bindings in several languages.

One benefit of WebDriver BiDi is that it's in the process of becoming a W3C standard, which might lead to wider adoption eventually.

But today, I don't see a reason to use anything other than Playwright. Happy to read alternative opinions, though.

creesch
0 replies
13h18m

Both Selenium and Playwright are very solid tools, a lot simply comes down to choice and experience.

One of the benefits of using Selenium is the extensive ecosystem surrounding it. Things like Selenium Grid make parallel and cross-browser testing much easier, either on self-hosted hardware or through services like Sauce Labs. Playwright can be used with similar services like BrowserStack, but AFAIK that requires an extra layer of their in-house SDK to actually make it work.

Selenium also supports more browsers, although you can wonder how much use that is given Chrome's dominance these days.

Another important difference is that Playwright really is a test automation framework, whereas Selenium is "just" a browser automation library. With Selenium you need to bring the assertion library, test runner, and reporting yourself.

hugs
1 replies
20h9m

Maybe you don't want to live in a world where Microsoft owns everything (again)?

epolanski
0 replies
19h17m

It's an open source project with Apache 2.0 licensing.

You're free to fork it and even monetize your fork.

Vinnl
0 replies
21h41m

I think Playwright depends on forking the browsers to support the features they need, so that may be less stable than using a standard explicitly supported by the browsers, and/or less representative of realistic browser use.

nox101
2 replies
12h50m

Last time I tried Playwright, it required custom versions of the browsers. That meant it was impossible to use with any newer browser features, which made it impossible to use if you wanted to target new and advanced use cases or prep a site in expectation of some new API feature that just shipped or is expected to ship soon.

If you used Playwright, wrote tons of tests, then heard about some new browser feature you wanted to target to get ahead of your competition, you'd have to refactor all of your tests away from Playwright to something that could target Chrome Canary or Firefox Nightly or Safari Technology Preview.

Has that changed?

twic
0 replies
3h42m

It works for me with stock Chromium and Chrome on Linux. But for Firefox, I apparently need a custom patched build, which isn't available for the distro I run, so I haven't confirmed that.

tracker1
0 replies
3h17m

IIRC, you can use the system-installed browser, but you need to know the executable path when launching. I remember it being a bit of a pain to do, but I have done it.
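
Something like this, if I remember the API right (the path is a placeholder for wherever your distro puts the binary):

    import { chromium } from "playwright";

    // Use the system-installed browser instead of Playwright's bundled build.
    const browser = await chromium.launch({
      executablePath: "/usr/bin/chromium",
    });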

anothername12
0 replies
22h51m

Is the WebDriver standard a good one? (Relative to Playwright, I guess.) I seem to recall some pains implementing it a few years ago.

jesprenj
12 replies
18h19m

What I really dislike about current browser automation tools is that they all use TCP for connecting the browser with the manager program. This means that, unlike with UNIX domain sockets, filesystem permissions (user/group restrictions) cannot be used to protect the TCP socket, which opens the browser automation ecosystem to many attacks where 127.0.0.1 cannot be trusted (untrusted users on a shared host).

I have yet to see a browser automation tool that does not use localhost-bound TCP sockets. Apart from that, most tools do not offer strong authentication -- a browser is spawned, it listens on a socket, and when the controlling application connects to the browser management socket, no authentication is required by default, which creates hidden vulnerabilities.

While browser sessions may only be controlled by knowing their random UUIDs, creating new sessions is usually possible to anyone on 127.0.0.1.

I don't know really, it's quite possible I'm just spreading lies here, please correct me and expand on this topic a bit.

_heimdall
4 replies
16h17m

I have always wanted a browser automation tool that taps directly into the accessibility tree. Plenty support querying based on accessibility features, but unless I'm mistaken, none go directly to the same underlying accessibility tree used by screen readers and the like.

Happy to be wrong here if anyone can correct me. The idea of all tests confirming both functionality and accessibility in one go would be much nicer than testing against hard-coded test IDs and separately writing a few a11y tests if I'm offered the time.
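
The closest thing I've found is Puppeteer's ARIA query handler - a sketch (selector and page are placeholders; as I understand it, this queries Chromium's computed accessibility tree via CDP, not the tree a screen reader ultimately consumes):

    import puppeteer from "puppeteer";

    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    await page.goto("https://example.com");

    // Find an element by accessible name and ARIA role.
    const button = await page.$('aria/Submit[role="button"]');
    await button?.click();

    await browser.close();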

regularfry
1 replies
5h25m

Guidepup looks like it's a decent stab in that direction: https://www.guidepup.dev/

Only Windows and macOS though, which is a problem for build pipelines. I too would very much like the page descriptions and the accessibility inputs to be the primary way of driving a page. It would make accessible access the default, rather than something you have to argue for.

_heimdall
0 replies
4h46m

That's an interesting one, thanks!

Skimming through their getting started, I wonder how translations would be handled. It looks like the tests expect to validate what the actual screen reader says rather than just the tree; for example, their first test shows finding the Guidepup heading in their readme by waiting for the screen reader to say "Guidepup heading level 1".

If you need to test different languages, you'd have to match the phrasing used by each specific screen reader when reading the heading descriptor and text. All your tests are also vulnerable to any phrasing changes made to each screen reader. If VoiceOver changed something, it could break all your test values.

I bet they could hide that behind abstractions though, `expectHeading("Guidepup", 1)` or similar. Ideally it really would just be a check in the tree though, avoiding any particular implementation of a screen reader all together.

jahewson
1 replies
14h49m

It depends on what you're testing. Much of a typical page is visual noise that is invisible to the accessibility tree but is often still something you'll want tests for. It's also not uncommon for accessible UI paths to differ from regular ones via invisible screen-reader-only content, e.g. in a complex dropdown list. So you can end up with a situation where you test that the accessible path works, but not regular clicks!

If you really want gold standard screen reader testing, there’s no substitute for testing with actual screen readers. Each uses the accessibility tree in its own way. Remember also that each browser has its own accessibility tree.

_heimdall
0 replies
14h10m

Yeah those are interesting corner cases for sure.

When UI is only visual noise and has no impact on functionality, I don't see much value in automated testing for it. In my experience these cases are often related to animations and notoriously difficult to automate tests for anyway.

When UX diverges between UI and the accessibility tree, I'd really expect that to be the exception rather than the rule. There would need to be a way to test both in isolation, but when one use case diverges down two separate code paths it's begging for hard to find bugs and regressions.

Totally agree on testing with screen readers directly though. I can't count how many weird differences I've come across between Windows (IE or Edge) and Mac over the years. If I remember right, there was a proposed spec for unifying the accessibility tree and related APIs but I don't think it went anywhere yet.

Nextgrid
1 replies
14h28m

Spawn it in a dedicated network namespace (to contain the TCP socket and make it unreachable from any other namespace) and use `socat` to convert it to a UNIX socket.

jesprenj
0 replies
4h8m

This is not always possible, as some machines don't support network namespaces, but it's a perfectly valid solution. It is Linux-only, though; do BSD-like OSes such as macOS support UID and NET namespaces?

JoelEinbinder
1 replies
17h46m

You can set `pipe` to true in Puppeteer (default false): https://pptr.dev/api/puppeteer.launchoptions

By default, Playwright launches this way, and you have to specifically enable the TCP listening.
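
A sketch of the Puppeteer side:

    import puppeteer from "puppeteer";

    // Talk to the browser over stdio pipes instead of a localhost WebSocket.
    const browser = await puppeteer.launch({ pipe: true });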

jesprenj
0 replies
3h24m

Great, I stand corrected! I still don't know how they convince Firefox/Chromium to use a pipe as a WebSocket transport layer.

notpublic
0 replies
5h3m

run it inside podman/docker

jgraham
0 replies
10h31m

There's an issue open for this on the WebDriver BiDi issue tracker.

We started with WebSockets because that supports more use cases (e.g. automating a remote device such as a mobile browser) and because building on the existing infrastructure makes specification easier.

It's also true that there are reasons to prefer other transports such as unix domain sockets when you have the browser and the client on the same machine. So my guess is that we're quite likely to add support for this to the specification (although of course there may be concerns I haven't considered that get raised during discussions).

bryanrasmussen
0 replies
14h47m

I haven't researched it, but I would be surprised if Sikuli does this: http://sikulix.com/

e12e
8 replies
21h45m

What are the reasons to prefer Puppeteer to Playwright, which supports many browsers?

> Cross-browser. Playwright supports all modern rendering engines including Chromium, WebKit, and Firefox.

https://playwright.dev/

creesch
5 replies
13h35m

Good question, even more so considering they were made by the same people. After the creators of Puppeteer moved to Microsoft and started work on Playwright, I got the impression that Puppeteer was pretty much abandoned. Certainly in the automation circles I find myself in, I barely see anyone using or talking about Puppeteer unless it's a bit of a legacy project.

dataviz1000
3 replies
10h56m

If you open up the Playwright codebase, you will discover that it is literally Puppeteer, with the copyright message header in the base files belonging to Google. It is a fork.

creesch
2 replies
10h18m

That is a huge oversimplification, if I ever saw one. If you look at the early commits, you can see that it isn't just a simple fork. For starters, the initial commit[1] is already using TypeScript. As far as I am aware, Puppeteer is not, and is written in vanilla JavaScript.

The license notice you mention is indeed there [2], but it also isn't surprising that they wouldn't reinvent the wheel for the things they wrote earlier that simply work. Even if they didn't directly use code, Microsoft would be silly not to add it given their previous involvement with Puppeteer.

Even if it was originally a fork, they are such different products at this point that at best you could say that Playwright started out as a fork (which, again, it did not as far as I can tell).

[1] https://github.com/microsoft/playwright/commit/9ba375c063448...

[2] https://github.com/microsoft/playwright/blob/3d2b5e680147577...

dataviz1000
1 replies
7h54m

I'm not convinced. It looks like v0.10.0 contains ~half of Google's Puppeteer code, and even in the latest release[0] the core package references Google's copyright several hundred times. Conceptually the core - the bridge between a Node server and the injected Chrome DevTools Protocol scripts - is the same. It looks like Playwright started as a fork and evolved into a wrapper that eventually included APIs for Python and Java around Puppeteer. At the core there is a ton of code still used from Puppeteer.

[0] https://github.com/microsoft/playwright/tree/48627ad48405583...

creesch
0 replies
6h19m

As I said, even if Playwright started out as a fork, classifying it as just that these days is a pretty big oversimplification.

It isn't just a "wrapper around Puppeteer" either, but a complete test automation framework, bringing you the runner, the assertion library, and a bunch of supporting tools in the surrounding ecosystem.

Puppeteer, meanwhile, is still mainly a library and just that. There is nothing wrong with that in principle, but at this stage of development it does make them distinctly different products.

irjustin
0 replies
11h38m

I also wonder the same. Playwright is so good. I simply don't have flaky tests, even when dealing with features that are Playwright's fault.

I used to have so many issues with Selenium, so I only used it in must-have situations, defaulting to Capybara to run our specs.

bdcravens
0 replies
4h10m

Additionally, Playwright has some nice ergonomics in the API, though Puppeteer has since implemented a lot of it as well. Downloads and video capturing in Playwright are nicer.

Vinnl
0 replies
21h36m

I said this in a subthread:

I think Playwright depends on forking the browsers to support the features they need, so that may be less stable than using a standard explicitly supported by the browsers, and/or less representative of realistic browser use.

(And for Safari/WebKit to support it as well, but I'm not holding my breath for that one.) Though I hope Playwright will adopt BiDi at some point as well, as its testing features and API are really nice.

yoavm
6 replies
21h21m

I know this isn't what the WebDriver BiDi protocol is for, but I feel like it's 90% of the way to being a protocol through which you can create browsers with swappable engines. Gecko has come a long way since Servo, and it's actually quite performant these days. The sad thing is that it's so much easier to create a Chromium-based browser than it is to create a Gecko-based one. But with APIs for navigating, intercepting requests, reading the console, executing JS - why not just embed the thing, remove all the browser chrome around it, and let us create customized browsers?

djbusby
4 replies
21h17m

I have dreamed about a swappable engine.

Like, a wrapper that does my history and tabs and bookmarks - but lets me move between rendering in Chrome or Gecko or Servo or whatever.

sorenjan
2 replies
21h2m

There used to be an extension for Firefox called "IE Tab for Firefox" that used the IE rendering engine inside a Firefox tab, for sites that only worked in IE.

hyzyla
0 replies
20h38m

The same idea as the built-in Internet Explorer in Microsoft Edge, where you can switch to Internet Explorer mode and open websites that only work correctly in Internet Explorer.

joshuaissac
0 replies
10h44m

There are some browsers that support multiple rendering engines out of the box, like Maxthon (Blink + Trident) and Lunascape (Blink + Gecko + Trident).

apatheticonion
0 replies
18h51m

Agreed. Headless browser testing is a great example of a case where an embeddable browser engine "as a lib" would be immensely helpful.

jsdom in the Node.js world offers a peek into what that might look like - though it is lacking a lot of browser functionality, making it impractical for most use cases.

mstijak
5 replies
23h3m

Are there any advantages to using Firefox over Chrome for exporting PDFs with Puppeteer?

lol768
4 replies
22h58m

I've found Firefox to produce better PDFs than Chrome does, for what it's worth. There are some CSS properties that Chrome/Skia doesn't honour properly (e.g. repeating-linear-gradient), or that it turns into PDFs that don't work universally.
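
In case it helps, a minimal sketch of driving that from Puppeteer with the new Firefox support (my understanding is that recent versions select Firefox via the `browser` launch option, and `page.pdf` maps to WebDriver BiDi's printing command):

    import puppeteer from "puppeteer";

    const browser = await puppeteer.launch({ browser: "firefox" });
    const page = await browser.newPage();
    await page.goto("https://example.com");
    await page.pdf({ path: "example.pdf", format: "A4" });
    await browser.close();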

freedomben
3 replies
22h46m

Indeed, Firefox uses PDF.js which I've found to produce really good results.

mook
2 replies
21h9m

Doesn't PDF.js go the other way (convert a PDF into HTML-and-friends for display in a browser, instead of "printing" a page into a PDF)?

I haven't dug into it and am quite possibly incorrect, hence the request for confirmation!

freedomben
0 replies
18h38m

Ah damnit, yes you're correct. Too late to edit my comment though.

ak217
0 replies
16h38m

That is correct, PDF.js is not usable for printing. Chrome uses Skia for printing; not sure what Firefox uses.

whatnotests2
3 replies
21h48m

For an alternative approach, try browserbase.com

* https://browserbase.com/

cebert
2 replies
21h4m

Playwright is such a good experience. I don’t understand why you would need something like browserbase.

nsonha
1 replies
14h32m

have you actually done any web scraping at scale? The problem is never the web automation. It's bypassing IP blacklists, rate limits, captchas, etc., and a hosted service can provide solutions for those:

> Proxies included..., Auto Captcha Solving, Advanced Stealth Mode

Other than that, like everything else, a hosted service is always an option and doesn't contradict you being able to host that service yourself; they're just for different sets of constraints.

bdcravens
0 replies
4h6m

I have, and solved a lot of those problems. Yes, it requires additional plugins and services, but I prefer to own the solution (a must-have for my use case, but for someone where it's lower stakes, perhaps a hosted solution is preferable to the engineering/research).

ed_mercer
1 replies
9h4m

Shouldn't the title be "Firefox support for Puppeteer"?

jgraham
0 replies
5h10m

Well the truth is it's both.

We had to change Firefox so it could be automated with WebDriver BiDi. The Puppeteer team had to change Puppeteer in order to implement a WebDriver BiDi backend, and to enable specific support for downloading and launching Firefox.

As the article says, it was very much a collaborative effort.

But the announcement is specifically about the new release of Puppeteer, which is the first to feature non-experimental support for Firefox. So that's why the title's that way around.

fitsumbelay
0 replies
23h20m

Been waiting for this. This rocks

dwoldrich
0 replies
3h19m

Cheers, I hope this boosts Firefox's marketshare!

burntcaramel
0 replies
19h34m

This is great! I'm curious about the accessibility tree noted in the unsupported-for-now APIs. Accessing the accessibility tree was something that was in Playwright for the big 3 engines but got removed about a year ago. I think it was partly because, as noted, it was a dump of engine-specific internal data structures: "page.accessibility.snapshot returns a dump of the Chromium accessibility tree".

I’d like to advocate for more focus on these accessibility trees. They are a distillation of every semantic element on the page, which makes them fantastic for snapshot “tests” or BDD tests.

My dream would be these accessibility trees one day become standardized across the major browser engines. And perhaps from a web dev point-of-view accessible from the other layers like CSS and DOM.
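
For comparison, Puppeteer's still-available equivalent is a one-call dump of that (Chromium-specific) tree - a sketch, with the URL as a placeholder:

    import puppeteer from "puppeteer";

    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    await page.goto("https://example.com");

    // Returns a plain object ({ role, name, children, ... }) - handy for
    // snapshot-style assertions.
    const tree = await page.accessibility.snapshot();
    console.log(JSON.stringify(tree, null, 2));

    await browser.close();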