PDF Images Remover
A simple web application that removes images from PDF files.
The web app runs at https://www.milanlaslop.dev/app/pdf.
Implementation Details and Challenges
File Loading and Saving
The whole application, including file processing, runs in the browser. There is no server-side code.
The PDF files are not sent to any server. They are:
File Processing in the Browser (Using C++)
The processing is done in C++, mainly with the regular expression support in the C++11 standard library. Regular expressions make it convenient to find the relevant parts of the PDF file. I do not parse and analyze the whole PDF file; I only look for patterns that lead to the image objects to remove.
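The following is a minimal sketch of this pattern-based approach, not the application's actual code. The function name `find_image_object_ids` and both regular expressions are assumptions made for illustration. It assumes each object's dictionary directly follows the `obj` keyword and that `/Subtype /Image` appears before any nested `<< ... >>` dictionary, which does not hold for every PDF.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the object numbers of all objects whose
// dictionary declares "/Subtype /Image". The file is not parsed as a whole;
// only the "N G obj << ... >>" pattern is searched for.
std::vector<int> find_image_object_ids(const std::string& pdf) {
    // Group 1: object number. Group 2: dictionary text up to the first ">>".
    // Assumption: a nested dictionary cuts group 2 short at its closing ">>".
    static const std::regex object_header(
        R"((\d+)\s+\d+\s+obj\s*<<([^>]*)>>)");
    static const std::regex image_subtype(R"(/Subtype\s*/Image\b)");

    std::vector<int> ids;
    for (std::sregex_iterator it(pdf.begin(), pdf.end(), object_header), end;
         it != end; ++it) {
        const std::string dictionary = (*it)[2].str();
        if (std::regex_search(dictionary, image_subtype)) {
            ids.push_back(std::stoi((*it)[1].str()));
        }
    }
    return ids;
}
```

A real implementation would also have to decide what to do with each matched object (for example, replacing its stream) and keep the rest of the file consistent; the sketch covers only the search step.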
Since processing a PDF file can take a long time, I do all this work in a Web Worker. The work then runs in a separate thread, so it does not freeze the page (or, sometimes, the whole browser). Moreover, the processing can be canceled at any time.