Manipulate the URL for a higher resolution: https://codex-atlanticus.ambrosiana....

kragen · 2025-10-28T15:31:29 1761665489

Thanks! On my cellphone not even enough of the UI was working for me to discover those URLs. I suspect a certain amount of error recovery is in order for wgetting all 2238 images. 2000 seems to be the maximum resolution available, which is under 100dpi. A few of the images seem to have been uploaded to https://commons.wikimedia.org/wiki/Category:Codex_Atlanticus.

There are a couple of scans of a 43-page Italian edition published by Ulrico Hoepli on the Archive: https://archive.org/details/codex-atlanticus-leonardo-da-vin... https://archive.org/details/codex-atlanticus-leonardo-da-vin... but they seem to be of very poor quality.

I'm done downloading now (with a sleep of 1 second between pages), and I have 1064125470 bytes of JPEG files, a very reasonably torrentable size. I'll see if I can put together a torrent and upload to the Archive and Commons...
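For anyone repeating this, here is a minimal sketch of that kind of loop in Python, with the one-second pause and some crude error recovery. The URL pattern is the `assets/2000/000R-N.jpg` one posted elsewhere in this thread. The function names, the retry count, and the back-off are my own assumptions, not anything the site documents, and this is untested against the real server.

```python
import time
import urllib.request
from pathlib import Path

BASE = "https://codex-atlanticus.ambrosiana.it/assets/2000"

def page_names(last_page=1119):
    """Yield 000R-1.jpg, 000V-1.jpg, 000R-2.jpg, ... recto before verso."""
    for page in range(1, last_page + 1):
        for side in "RV":
            yield f"000{side}-{page}.jpg"

def fetch(url):
    """Return the body of url as bytes; most network failures raise OSError subclasses."""
    with urllib.request.urlopen(url, timeout=60) as response:
        return response.read()

def download_all(names, dest=".", fetch=fetch, retries=3, pause=1.0, sleep=time.sleep):
    """Download each name into dest; return the names that never succeeded."""
    failed = []
    for name in names:
        target = Path(dest) / name
        if target.exists():              # already fetched on an earlier run
            continue
        for attempt in range(1, retries + 1):
            try:
                target.write_bytes(fetch(f"{BASE}/{name}"))
                break
            except OSError:              # URLError and HTTPError are OSError subclasses
                sleep(pause * attempt)   # wait a bit longer before each retry
        else:
            failed.append(name)          # every attempt failed
        sleep(pause)                     # roughly one request per second
    return failed
```

Calling `download_all(page_names())` would attempt all 2238 files and return the list of names that still failed, so a rerun only needs to fetch those.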

WithinReason · 2025-10-28T10:35:58 1761647758

Or in PowerShell on Windows:

  1..1119 | % { iwr "https://codex-atlanticus.ambrosiana.it/assets/2000/000R-$_.jpg" -OutFile "000R-$_.jpg" }
  1..1119 | % { iwr "https://codex-atlanticus.ambrosiana.it/assets/2000/000V-$_.jpg" -OutFile "000V-$_.jpg" }

embedding-shape · 2025-10-28T13:59:55 1761659995

Some people around me swear PowerShell has a better user experience than Unix shells, but then I keep seeing examples like these. How on earth could people prefer this to `wget https://codex-atlanticus.ambrosiana.it/assets/2000/000V-{1.....`?

kragen · 2025-10-28T15:32:08 1761665528

In this case presumably the main difference is not PowerShell vs. bash but iwr vs. wget? Because I think this is roughly equally bad (untested):

    for page in {1..1119}; do
        iwr "https://codex-atlanticus.ambrosiana.it/assets/2000/000R-$page.jpg" -OutFile "000R-$page.jpg"
        iwr "https://codex-atlanticus.ambrosiana.it/assets/2000/000V-$page.jpg" -OutFile "000V-$page.jpg"
    done

Also until recently bash didn't have {42..53} syntax. You had to use `seq`. There was an alternative name for `seq` in Unix Power Tools, `jot`, because it wasn't standard: https://docstore.mik.ua/orelly/unix/upt/ch45_11.htm. This section was by ORA author and sysadmin Linda Mui (https://www.oreilly.com/pub/au/268), but I don't know if she wrote `jot` or just popularized it.

NoMoreNicksLeft · 2025-10-28T06:54:04 1761634444

Any idea on how to best compile it to an ebook? Just stuffing the jpgs into a pdf rarely works well...

foofoo12 · 2025-10-28T10:21:24 1761646884

I usually do exactly what rarely works well for you, and it works decently for me. You get one page per image, and the image isn't recompressed or touched at all.

  apt install img2pdf
  img2pdf *.jpg -o leonardo-da-book.pdf

nunodonato · 2025-10-28T10:42:06 1761648126

Wouldn't this mess up the order? I think you are supposed to view it like R1, V1, R2, V2, etc.

foofoo12 · 2025-10-28T10:51:27 1761648687

Yes, this was just an example. Wildcard expansion gives you whatever order your current shell sees fit. Bash sorts alphabetically.

kragen · 2025-10-28T16:14:22 1761668062

More like

    echo $(for page in {1..1119}; do for side in R V; do
      echo "000$side-$page.jpg"; done; done)
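The same ordering as a rough Python sketch, in case you would rather build the argument list programmatically and hand it to img2pdf. `reading_order` and `build_pdf` are names I made up for illustration; the only external thing assumed is the `img2pdf ... -o file.pdf` command line shown earlier in the thread, and the JPEGs are assumed to sit in the current directory.

```python
import subprocess

def reading_order(last_page=1119):
    """Return file names recto-then-verso for each page: 000R-1, 000V-1, 000R-2, ..."""
    return [
        f"000{side}-{page}.jpg"
        for page in range(1, last_page + 1)
        for side in "RV"
    ]

def build_pdf(output="leonardo-da-book.pdf", last_page=1119):
    """Run img2pdf on the files in reading order instead of shell glob order."""
    # Equivalent to: img2pdf 000R-1.jpg 000V-1.jpg 000R-2.jpg ... -o leonardo-da-book.pdf
    subprocess.run(["img2pdf", *reading_order(last_page), "-o", output], check=True)
```

A plain `*.jpg` glob would put all the `000R-` files before any `000V-` file and sort `000R-10.jpg` ahead of `000R-2.jpg`, which is why the names are generated explicitly here.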

c0balt · 2025-10-28T07:42:55 1761637375

I haven't done this in some time, but templating some Markdown for pandoc and creating an epub might be a viable avenue.

kragen · 2025-10-28T16:16:41 1761668201

Maybe what rarely works well for NoMoreNicksLeft is having a gigabyte of JPEGs in a single HTML chapter inside the epub? In that case you could do something like divide the files into 373 "chapters" of 6 pages each?

One of the fragmentary editions I linked on the Archive uses the .cbr Comic Book Reader format; perhaps that is a better format than .epub for high-resolution scans of every page?

NoMoreNicksLeft · 2025-10-28T13:17:10 1761657430

Oooh... I have even less luck with epub, when the pages are an image-per-page.

ticulatedspline · 2025-10-28T18:23:42 1761675822

Easy way would be to just drop them in a zip and label it .cbz. Most readers handle CBR/CBZ just fine.

kragen · 2025-10-28T19:05:21 1761678321

Oh, is .cbz that simple? Does it use the file order of the zipfile members or some other order? (https://acbf.fandom.com/wiki/ACBF_Editor_-_Creating_Metadata says it uses alphabetical order, which is the wrong order in this case.)

It may be useful to use zip -Z store. JPEG data isn't going to get much benefit from another layer of LZ77.
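If readers really do sort members alphabetically, one workaround is to rename the files inside the archive so that alphabetical order and reading order coincide. A sketch using Python's standard `zipfile` module, storing the JPEGs uncompressed: the `0001R.jpg` naming scheme and the `make_cbz` helper are my own invention for illustration, not part of any CBZ specification.

```python
import zipfile
from pathlib import Path

def make_cbz(src_dir, out_path, last_page=1119):
    """Pack the scans into out_path uncompressed; return how many files were added."""
    src = Path(src_dir)
    added = 0
    with zipfile.ZipFile(out_path, "w", compression=zipfile.ZIP_STORED) as archive:
        for page in range(1, last_page + 1):
            for side in "RV":            # "R" sorts before "V", so recto comes first
                original = src / f"000{side}-{page}.jpg"
                if not original.exists():    # tolerate gaps from failed downloads
                    continue
                # Zero-padded page number makes alphabetical order match reading order.
                archive.write(original, arcname=f"{page:04d}{side}.jpg")
                added += 1
    return added
```

Something like `make_cbz(".", "codex-atlanticus.cbz")` would then produce members named `0001R.jpg`, `0001V.jpg`, `0002R.jpg`, and so on, which sort correctly whether the reader uses archive order or name order.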

atoav · 2025-10-28T12:32:24 1761654744

Calibre comes with an ebook-convert command; that one might work.

eMPee584 · 2025-10-28T09:38:54 1761644334

ocrmypdf (rocks!)