The `get_roms` method is used during scanning and to generate feeds.
Sorting by filename is not perfect (e.g. prefixes like "The" or "A"),
but should be good enough for users to better visualize how the scanning
process is going, and how close it's to finish.
At the moment, 7zip files are generating memory issues and even OOM
errors on user installations. This is because the current stable release
of `py7zr` does not support decompression streaming, and RomM needs to
decompress the each 7zip file in the library into memory to be able to
calculate hashes.
This change introduces a `py7zr` fork I created to have a stable commit
SHA to refer to in case upstream gets any forced pushes. It includes the
contents of the pull request the `py7zr` creator is working on to
support decompression streaming [1].
The way decompression streaming is implemented in `py7zr` is different
than the other compression utilities. Instead of being able to provide a
`bytes` iterator, we need to provide a `Py7zIO` implementation that
will call a callback on each read and write operation.
[1] https://github.com/miurahr/py7zr/pull/620
The `tar` decompression function was failing for some users, with error
message:
```
'NoneType' object does not support the context manager protocol
```
As explained in the official documentation [1], the `extractfile` method
returns `None` if the member is not a regular file or a link. This
change skips any member that is not a regular file.
[1] https://docs.python.org/3/library/tarfile.html#tarfile.TarFile.extractfile
The previous implementation was calling `resize_cover_to_small` within
the context manager that was writing the image to the filesystem. This
was causing `PIL` to raise an error because it could not identify the
open and temporarily created file as a valid image.
Instead of saving the original image to the filesystem and then resizing
it, we now open the image in memory, resize it, and then save it to the
filesystem. We also avoid reading the `BytesIO` object twice by saving
small and big images from the same initial `Image` object.
Fixes#1191.
This change improves memory usage, by only keeping a single archive's
member file in memory at a time during 7zip decompression.
The `py7zr` library does not support streaming decompression yet, so
this change is the best we can do for now.
Potential fix for #1211, but it won't improve memory usage for
single-file 7zip archives.
This change installs and configures the `mod_zip` nginx module [1],
which allows nginx to stream ZIP files directly.
It includes a workaround needed to correctly calculate CRC-32 values for
included files, by including a new `server` section listening at port
8081, only used for the file requests to be upstream subrequests that
correctly trigger the CRC-32 calculation logic.
Also, to be able to provide a `m3u` file generated on the fly, we add a
`/decode` endpoint fully implemented in nginx using NJS, which receives
a `value` URL param, and decodes it using base64. The decoded value is
returned as the response.
That way, the contents of the `m3u` file is base64-encoded, and set as
part of the response, for `mod_zip` to include it in the ZIP file.
[1] https://github.com/evanmiller/mod_zip
Convert `IGDBBaseHandler` methods to be asynchronous, and use an `httpx`
async client, instead of `requests` sync client.
This change also removes the direct dependency with `requests`, as the
project no longer uses it, preferring `httpx` instead.
For filesystem resource handler, `requests` calls have been replaced
with `httpx`, and file I/O has been replaced with `anyio` utils.
The existing approach to save covers and screenshots, by calling
`shutil.copyfileobj` with the raw response is no longer needed. `httpx`
does not provide a file-like object when streaming [1], so there's no
easy drop-in replacement.
However, the applied solution correctly builds the file iteratively, by
consuming the response in chunks.
[1] https://github.com/encode/httpx/discussions/2296
`os.walk` is a generator that can iteratively navigate from the
specified path, top-bottom. However, most of the calls to `os.walk` in
the project cast the call to `list()`, which makes it traverse the path
and recursively find all nested directories.
This is commonly not needed, as we end up just using a `[0]` index to
only access the root path.
This change adds a few utils that simplifies listing files/directories,
and by default does it non-recursively. Performance gains shouldn't be
noticeable in systems with high-speed storage, but we can avoid the edge
cases of users having too many nested directories, by avoiding unneeded
I/O.