Commit Graph

28 Commits

Author SHA1 Message Date
Georges-Antoine Assi
3fcce6606c complete updating the endpoints and models 2024-12-20 22:41:56 -05:00
Michael Manganiello
bcaecbd311 misc: Sort roms in get_roms method
The `get_roms` method is used during scanning and to generate feeds.
Sorting by filename is not perfect (e.g. prefixes like "The" or "A"),
but should be good enough for users to better visualize how the scanning
process is going, and how close it is to finishing.
2024-12-13 10:01:49 -03:00
Michael Manganiello
477d9b1744 feat: Add streaming support for 7zip hashing
At the moment, 7zip files are generating memory issues and even OOM
errors on user installations. This is because the current stable release
of `py7zr` does not support decompression streaming, and RomM needs to
decompress each 7zip file in the library into memory to be able to
calculate hashes.

This change introduces a `py7zr` fork I created to have a stable commit
SHA to refer to in case upstream gets any forced pushes. It includes the
contents of the pull request the `py7zr` creator is working on to
support decompression streaming [1].

The way decompression streaming is implemented in `py7zr` differs from
that of the other compression utilities. Instead of being able to provide a
`bytes` iterator, we need to provide a `Py7zIO` implementation that
will call a callback on each read and write operation.

[1] https://github.com/miurahr/py7zr/pull/620
2024-11-08 21:31:11 -03:00
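The callback-style approach described in this commit can be sketched as follows. This is a minimal illustration, not the code from the commit and not the `py7zr` `Py7zIO` interface: `HashingSink` and `feed_chunks` are hypothetical names, and the choice of CRC-32 and SHA-1 is an assumption. The point is that the decompressor pushes each chunk into `write()`, so the hashes are updated incrementally and no full member is held in memory.

```python
import hashlib
import zlib


class HashingSink:
    """Hypothetical write-only sink: hashes every chunk it receives, stores nothing."""

    def __init__(self) -> None:
        self.crc32 = 0
        self.sha1 = hashlib.sha1()
        self.size = 0

    def write(self, data: bytes) -> int:
        # Called once per decompressed chunk; update running hashes and drop the data.
        self.crc32 = zlib.crc32(data, self.crc32)
        self.sha1.update(data)
        self.size += len(data)
        return len(data)


def feed_chunks(sink: HashingSink, chunks) -> None:
    """Stand-in for a decompressor that pushes output into the sink (assumption)."""
    for chunk in chunks:
        sink.write(chunk)


sink = HashingSink()
feed_chunks(sink, [b"ab", b"c"])
print(f"{sink.crc32:08x}", sink.sha1.hexdigest(), sink.size)
```

With a `bytes` iterator the caller pulls data; here the library pushes it, which is why the hashing logic has to live inside the object handed to the decompressor.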
Michael Manganiello
8fd680ab84 fix: Make tar decompression only consider regular files
The `tar` decompression function was failing for some users, with error
message:

```
'NoneType' object does not support the context manager protocol
```

As explained in the official documentation [1], the `extractfile` method
returns `None` if the member is not a regular file or a link. This
change skips any member that is not a regular file.

[1] https://docs.python.org/3/library/tarfile.html#tarfile.TarFile.extractfile
2024-10-26 01:07:27 -03:00
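A minimal sketch of the fix described in this commit, assuming hashing is done per member in fixed-size chunks. The function name `hash_regular_members`, the sample archive, and the use of SHA-1 are illustrative assumptions, not taken from the commit. Directories and other non-regular members are skipped before `extractfile` is called, so it never returns `None` inside the `with` statement.

```python
import hashlib
import io
import tarfile


def hash_regular_members(tar_bytes: bytes, chunk_size: int = 8192) -> dict[str, str]:
    """Return {member name: SHA-1 hex digest} for regular files only."""
    digests: dict[str, str] = {}
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:*") as archive:
        for member in archive:
            if not member.isfile():
                # Directories, devices, links, etc. are skipped.
                continue
            sha1 = hashlib.sha1()
            with archive.extractfile(member) as fh:
                while chunk := fh.read(chunk_size):
                    sha1.update(chunk)
            digests[member.name] = sha1.hexdigest()
    return digests


def build_sample_tar() -> bytes:
    """Build an in-memory tar with one directory and one regular file."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as archive:
        directory = tarfile.TarInfo("roms")
        directory.type = tarfile.DIRTYPE
        archive.addfile(directory)
        data = b"hello"
        info = tarfile.TarInfo("roms/game.bin")
        info.size = len(data)
        archive.addfile(info, io.BytesIO(data))
    return buf.getvalue()


result = hash_regular_members(build_sample_tar())
print(result)
```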
Michael Manganiello
149098fb31 fix: Improve memory usage during 7zip decompression
This change improves memory usage, by only keeping a single archive's
member file in memory at a time during 7zip decompression.

The `py7zr` library does not support streaming decompression yet, so
this change is the best we can do for now.

Potential fix for #1211, but it won't improve memory usage for
single-file 7zip archives.
2024-10-06 20:18:49 -03:00
Georges-Antoine Assi
00c8771e22 [ROMM0-1155] Patch zipfile + catch more 7zip errors 2024-09-01 21:58:22 -04:00
Michael Manganiello
0fad8ac282 feat: Use nginx mod_zip to generate multi-file zip downloads
This change installs and configures the `mod_zip` nginx module [1],
which allows nginx to stream ZIP files directly.

It includes a workaround needed to correctly calculate CRC-32 values for
included files, by including a new `server` section listening at port
8081, only used for the file requests to be upstream subrequests that
correctly trigger the CRC-32 calculation logic.

Also, to be able to provide a `m3u` file generated on the fly, we add a
`/decode` endpoint fully implemented in nginx using NJS, which receives
a `value` URL param, and decodes it using base64. The decoded value is
returned as the response.

That way, the contents of the `m3u` file are base64-encoded, and set as
part of the response, for `mod_zip` to include them in the ZIP file.

[1] https://github.com/evanmiller/mod_zip
2024-08-20 22:39:33 -03:00
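The response body that `mod_zip` consumes could be assembled roughly as below. This sketch assumes the manifest format documented in the `mod_zip` README (one `<crc32> <size> <location> <name>` line per file, with `-` for an unknown CRC-32, sent together with an `X-Archive-Files: zip` response header). The `/library/...` paths, the helper names, and the exact contents of the `m3u` file are hypothetical; only the `/decode` endpoint and its `value` parameter come from the commit message.

```python
import base64
import zlib
from urllib.parse import quote


def manifest_line(crc32: str, size: int, location: str, name: str) -> str:
    """One mod_zip manifest entry: CRC-32 (or '-'), size, location, file name."""
    return f"{crc32} {size} {location} {name}"


def build_manifest(files: list[tuple[str, int, str, str]], m3u_name: str) -> str:
    lines = [manifest_line(*entry) for entry in files]
    # Assumption: the m3u playlist simply lists the file names, one per line.
    m3u_bytes = "\n".join(name for _, _, _, name in files).encode("utf-8")
    encoded = base64.b64encode(m3u_bytes).decode("ascii")
    lines.append(
        manifest_line(
            f"{zlib.crc32(m3u_bytes):08x}",
            len(m3u_bytes),  # size of the decoded content nginx will serve
            f"/decode?value={quote(encoded, safe='')}",
            m3u_name,
        )
    )
    return "\n".join(lines) + "\n"


manifest = build_manifest(
    [
        ("-", 5, "/library/disc1.bin", "disc1.bin"),
        ("-", 7, "/library/disc2.bin", "disc2.bin"),
    ],
    "game.m3u",
)
print(manifest)
```

Percent-encoding the base64 value keeps `+`, `/`, and `=` from being misread in the query string; whether the real implementation encodes it this way is not stated in the commit.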
Georges-Antoine Assi
49e493802f Skip compressed files if they're invalid 2024-08-18 14:14:38 -04:00
Michael Manganiello
0fdbbe4625 misc: Upgrade Python to v3.12 and Alpine to v3.20
Included upgrades:
* Python: v3.12
* Alpine: v3.20 (which uses Python 3.12)
* nginx: v1.27.1
2024-08-15 20:14:32 -03:00
Georges-Antoine Assi
bc08e05a19 changes from self review 2024-08-11 23:09:58 -04:00
Georges-Antoine Assi
1ea1b326d3 move hashes to rom model 2024-08-11 22:38:22 -04:00
Georges-Antoine Assi
56037070fb only calc hashes explicitly 2024-08-11 19:36:52 -04:00
Georges-Antoine Assi
7e086cec67 fixes from code review 2024-08-11 19:06:16 -04:00
Georges-Antoine Assi
195b86b573 even more cleanup 2024-08-09 11:46:07 -04:00
Georges-Antoine Assi
f01f5ce5b5 trunk fixes 2024-07-27 16:38:57 -04:00
Georges-Antoine Assi
9386ca9e4a changes from self-review 2024-07-27 16:31:32 -04:00
Georges-Antoine Assi
3a9cef24e0 get it all working 2024-07-27 13:30:52 -04:00
Georges-Antoine Assi
db1787dff4 fix trunk issues 2024-07-18 22:03:08 -04:00
Georges-Antoine Assi
77afe55625 add support for extracting content in compressed files 2024-07-18 21:50:24 -04:00
Georges-Antoine Assi
591e41a4d5 start checking for specific file exts 2024-07-18 20:51:48 -04:00
Georges-Antoine Assi
2e91c440e3 Read into mem in chunks 2024-07-16 17:42:25 -04:00
Georges-Antoine Assi
2b2ff875ee Calculate hashes for rom files 2024-07-15 18:27:29 -04:00
Michael Manganiello
f20a9ffe34 fix: Avoid recursive os.walk calls
`os.walk` is a generator that iteratively navigates from the
specified path, top-down. However, most of the calls to `os.walk` in
the project wrap the call in `list()`, which makes it traverse the path
and recursively find all nested directories.

This is commonly not needed, as we end up just using a `[0]` index to
only access the root path.

This change adds a few utils that simplify listing files/directories,
and by default do so non-recursively. Performance gains shouldn't be
noticeable in systems with high-speed storage, but we can avoid the edge
cases of users having too many nested directories, by avoiding unneeded
I/O.
2024-07-13 15:30:04 -03:00
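A minimal sketch of non-recursive listing helpers in the spirit of this commit. The names `list_directories` and `list_files` are hypothetical and may not match the utils actually added. Taking only the first item from the `os.walk` generator scans the top-level directory and stops, whereas `list(os.walk(path))[0]` visits every nested directory before the `[0]` index is applied.

```python
import os
import tempfile


def list_directories(path: str) -> list[str]:
    """Immediate subdirectories of `path`, without descending into them."""
    _root, dirs, _files = next(os.walk(path), (path, [], []))
    return sorted(dirs)


def list_files(path: str) -> list[str]:
    """Files directly inside `path`, without descending into subdirectories."""
    _root, _dirs, files = next(os.walk(path), (path, [], []))
    return sorted(files)


with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "a", "b", "c"))
    with open(os.path.join(tmp, "x.txt"), "w") as fh:
        fh.write("rom")
    top_dirs = list_directories(tmp)
    top_files = list_files(tmp)
    missing = list_files(os.path.join(tmp, "does-not-exist"))

print(top_dirs, top_files, missing)
```

The default passed to `next()` covers paths that do not exist, for which `os.walk` yields nothing.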
Michael Manganiello
fc53d77a58 misc: Compile constant regular expressions
Improve efficiency on reusable regular expressions, by compiling them.
2024-06-23 00:36:22 -03:00
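The pattern this commit describes looks roughly like the sketch below. The regular expression and the names `TAG_REGEX` and `extract_tags` are hypothetical, chosen only to illustrate the idea. Python's `re` module keeps an internal cache of recently compiled patterns, so the gain from compiling a constant pattern once at import time is likely modest; it mainly avoids the cache lookup on each call and gives the pattern a name.

```python
import re

# Compiled once at import time and reused by every call (hypothetical pattern).
TAG_REGEX = re.compile(r"\(([^)]*)\)")


def extract_tags(filename: str) -> list[str]:
    """Return the parenthesised tags found in a file name."""
    return TAG_REGEX.findall(filename)


tags = extract_tags("Game (USA) (Rev 1).zip")
print(tags)
```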
Georges-Antoine Assi
b2085f87a8 bunch of fixes for trunk 2024-05-21 17:10:11 -04:00
Georges-Antoine Assi
a7cf0d389a run trunk format on all files 2024-05-21 10:18:13 -04:00
Georges-Antoine Assi
3e42f3ab56 also ignore firmware files on scan that match 2024-05-16 23:12:27 -04:00
Georges-Antoine Assi
dc33054ba1 more name refactoring 2024-05-05 16:45:58 -04:00