The `get_roms` method is used during scanning and to generate feeds.
Sorting by filename is not perfect (e.g. prefixes like "The" or "A"),
but should be good enough for users to better visualize how the scanning
process is going, and how close it's to finish.
At the moment, 7zip files are generating memory issues and even OOM
errors on user installations. This is because the current stable release
of `py7zr` does not support decompression streaming, and RomM needs to
decompress the each 7zip file in the library into memory to be able to
calculate hashes.
This change introduces a `py7zr` fork I created to have a stable commit
SHA to refer to in case upstream gets any forced pushes. It includes the
contents of the pull request the `py7zr` creator is working on to
support decompression streaming [1].
The way decompression streaming is implemented in `py7zr` is different
than the other compression utilities. Instead of being able to provide a
`bytes` iterator, we need to provide a `Py7zIO` implementation that
will call a callback on each read and write operation.
[1] https://github.com/miurahr/py7zr/pull/620
The `tar` decompression function was failing for some users, with error
message:
```
'NoneType' object does not support the context manager protocol
```
As explained in the official documentation [1], the `extractfile` method
returns `None` if the member is not a regular file or a link. This
change skips any member that is not a regular file.
[1] https://docs.python.org/3/library/tarfile.html#tarfile.TarFile.extractfile
This change improves memory usage, by only keeping a single archive's
member file in memory at a time during 7zip decompression.
The `py7zr` library does not support streaming decompression yet, so
this change is the best we can do for now.
Potential fix for #1211, but it won't improve memory usage for
single-file 7zip archives.
This change installs and configures the `mod_zip` nginx module [1],
which allows nginx to stream ZIP files directly.
It includes a workaround needed to correctly calculate CRC-32 values for
included files, by including a new `server` section listening at port
8081, only used for the file requests to be upstream subrequests that
correctly trigger the CRC-32 calculation logic.
Also, to be able to provide a `m3u` file generated on the fly, we add a
`/decode` endpoint fully implemented in nginx using NJS, which receives
a `value` URL param, and decodes it using base64. The decoded value is
returned as the response.
That way, the contents of the `m3u` file is base64-encoded, and set as
part of the response, for `mod_zip` to include it in the ZIP file.
[1] https://github.com/evanmiller/mod_zip
`os.walk` is a generator that can iteratively navigate from the
specified path, top-bottom. However, most of the calls to `os.walk` in
the project cast the call to `list()`, which makes it traverse the path
and recursively find all nested directories.
This is commonly not needed, as we end up just using a `[0]` index to
only access the root path.
This change adds a few utils that simplifies listing files/directories,
and by default does it non-recursively. Performance gains shouldn't be
noticeable in systems with high-speed storage, but we can avoid the edge
cases of users having too many nested directories, by avoiding unneeded
I/O.