Hacker News — hyzyla's comments

The same idea exists with the built-in Internet Explorer mode in Microsoft Edge, where you can switch to IE mode and open websites that only work correctly in Internet Explorer.


Thanks for sharing your story! My goal was to have an MVP as fast as possible; otherwise, I could lose interest in it. That's the biggest reason why I chose to use an existing parser instead of writing my own (I had even initialized an empty Rust project for that).

A few things are on my nice-to-have list but are hard to implement without writing my own parser:

- edit nodes (with XREF table update)
- raw source editor
- show actual position in source
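For context on why the XREF part is the hard one: a classic xref section maps each object number to an absolute byte offset, so changing any object's length invalidates every offset after it. Here is a rough sketch of parsing such a section in plain JavaScript — an illustration of the table's structure, not code from the site:

```javascript
// Minimal sketch: parse a classic (uncompressed) PDF xref section.
// Each entry is a 10-digit byte offset, a 5-digit generation number,
// and a type flag ("n" = in use, "f" = free).
function parseXref(text) {
  const lines = text.trim().split(/\r?\n/);
  if (lines.shift() !== "xref") throw new Error("not an xref section");
  const entries = [];
  while (lines.length) {
    // Subsection header: first object number and entry count.
    const [start, count] = lines.shift().split(" ").map(Number);
    for (let i = 0; i < count; i++) {
      const m = lines.shift().match(/^(\d{10}) (\d{5}) ([nf])/);
      entries.push({
        obj: start + i,
        offset: Number(m[1]),
        gen: Number(m[2]),
        free: m[3] === "f",
      });
    }
  }
  return entries;
}

const entries = parseXref(
`xref
0 3
0000000000 65535 f 
0000000017 00000 n 
0000000081 00000 n `);
// entries[1] → { obj: 1, offset: 17, gen: 0, free: false }
```

Rewriting an object in place means recomputing every `offset` that follows it, which is why "edit nodes" needs real writer support, not just a parser.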


For editing, I was able to make some simple edits (not of individual objects, but things like removing or duplicating pages, or editing crop boxes) using pdf-lib instead of pdf.js: see for example https://shreevatsa.net/pdf-pages/ and https://shreevatsa.net/pdf-unspread/ (just right-click and "view source").

As for seeing the raw source: after using such tools for a bit (e.g. the HTML output generated by https://github.com/desgeeko/pdfsyntax, which is very good), I'm starting to feel it's nice to look at the first few times, or in some cases, but in the long run, and for large PDFs, it may not really be so useful or worth it.


I took inspiration from RUPS by iText [1]. It's maybe the most popular tool for inspecting PDF files.

1. https://github.com/itext/i7j-rups


Thanks for the idea. I added this feature a few minutes ago [1]. I'm trying to convert stream content to UTF-8. Maybe later, I will add a more flexible solution to convert to other encodings.
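For the curious, the decoding idea can be sketched with the standard `TextDecoder` API. This is a minimal illustration, not the site's actual code, and the Latin-1 fallback is my assumption of one reasonable choice when bytes aren't valid UTF-8:

```javascript
// Sketch: try decoding decompressed stream bytes as UTF-8; if they are
// not valid UTF-8, fall back to Latin-1 so something readable is shown.
function decodeStream(bytes) {
  try {
    // fatal: true makes decode() throw on invalid UTF-8 sequences.
    return new TextDecoder("utf-8", { fatal: true }).decode(bytes);
  } catch {
    return new TextDecoder("latin1").decode(bytes);
  }
}

const utf8 = new TextEncoder().encode("Привіт");   // valid UTF-8
const binary = new Uint8Array([0x48, 0x69, 0xff]); // not valid UTF-8
decodeStream(utf8);   // "Привіт"
decodeStream(binary); // falls back to Latin-1: "Hiÿ"
```

Supporting other encodings later would mostly mean passing a different label to `TextDecoder`.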

1. https://imgur.com/a/69rbYMw


This works perfectly now on my test files. I can find the pages that have specific strings I was looking for.

This has saved me from clogging up my PC with the Java runtime and iText RUPS.


My bad, I meant UTF-8. My brain had a relapse to 1988.

And, thank you!


The uploaded PDF is fully processed inside the browser. I use PostHog analytics, so there are definitely requests to the analytics server.


Yeah, it's definitely going on my todo list.


I haven't decided if I want to create an open-source version. In the first place, I made it private to worry less about my code quality and to finish the product faster before I lose interest in it.

It heavily relies on the core part of PDF.js: I've made a fork of the PDF.js project, removed everything not related to the core part, and added an export for low-level primitives [1].
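To give an idea of what exposing low-level primitives buys in the UI: once the parser hands you plain dict/array values, turning them into an inspector tree is straightforward. The node shapes below are hypothetical illustrations, not PDF.js's real internal classes:

```javascript
// Hypothetical sketch: turn low-level PDF primitives (plain dicts,
// arrays, scalars) into { label, children } nodes for a tree view.
function toTree(name, value) {
  if (Array.isArray(value)) {
    return { label: `${name} [Array(${value.length})]`,
             children: value.map((v, i) => toTree(String(i), v)) };
  }
  if (value !== null && typeof value === "object") {
    // PDF dictionary keys are conventionally shown with a leading slash.
    return { label: `${name} <<Dict>>`,
             children: Object.entries(value).map(([k, v]) => toTree(`/${k}`, v)) };
  }
  return { label: `${name}: ${String(value)}`, children: [] };
}

const root = toTree("Trailer", { Size: 3, Root: { Type: "Catalog", Pages: [1, 2] } });
// root.children[0].label → "/Size: 3"
```

The real work in the fork is getting those primitives out of PDF.js in the first place; rendering them is the easy part.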

Also, as inspiration, I used the pdf.js.utils [2] project, which almost does the same but in a different form.

1. https://github.com/hyzyla/pdf.js-core

2. https://github.com/brendandahl/pdf.js.utils


Very nice work.

I wouldn't worry about the quality of the code. You get better by seeing other people's work and alternative solutions to the problems you faced.

Also, as I mentioned in another comment, this could easily be built into a quick trouble-checking app for POD work. Posting it would also let people fork it to make more task-specific apps.


I want to be honest and open here: I did not write the PDF parser on my own. I heavily relied on the PDF.js project from Mozilla. I have a disclosure in the footer, but perhaps I should communicate about it more clearly.


No, it was already clear from the webpage that it uses PDF.js. I've also used it. I just think this is a really great way of visualizing PDFs; I shared it with my team, as we deal with them a lot.


Thanks! I've fixed the typo and also allowed the selection of the node's text on the left panel (it was disabled by default).


It's one of the things that I would like governments to regulate in some way. Sometimes, losing an account amounts to having your job or business destroyed by some random corporation, without any explanation or ability to dispute it.

