Commit Graph

69 Commits

Author SHA1 Message Date
Michael Eischer 50ec408302 index: move to repository package 2024-05-25 13:13:03 +02:00
Michael Eischer 462b82a060 index: reduce size of compressed indexes
use the same index size for compressed and uncompressed indexes.
Otherwise, decoding the index of a compressed repository requires
significantly more memory.
2024-05-24 22:18:14 +02:00
Michael Eischer 2033c02b09 index: replace CountedBlobSet with AssociatedSet 2024-05-24 22:18:14 +02:00
Michael Eischer e52033a8bd index: slightly reduce Rewrite concurrency
The index operations are likely CPU-bounded. Thus, reduce the
concurrency accordingly.
2024-05-24 21:33:17 +02:00
Michael Eischer 57d69aa640 index: cleanup SaveIndex method 2024-05-24 21:33:17 +02:00
Michael Eischer 2ca1c37721 index: additional tests for new index save methods 2024-05-24 21:33:17 +02:00
Michael Eischer 5f7b48e65f index: replace Save() method with Rewrite and SaveFallback
Rewrite implements a streaming rewrite of the index that excludes the
given packs. For this it loads all index files from the repository and
only modifies those that require changes. This will reduce the index
churn when running prune. Rewrite does not require the in-memory index
and thus can drop it to significantly reduce the memory usage.

However, `prune --unsafe-recovery` cannot use this strategy and requires
a separate method to save the whole in-memory index. This is now handled
using SaveFallback.
2024-05-24 21:33:17 +02:00
Michael Eischer 72482ce5bd index: misc cleanups 2024-05-24 21:33:17 +02:00
Michael Eischer 9aa0c90fb2 index: remove supersedes field
Using the field with its current semantics is nearly impossible to get
right. Remove it as it will be replaced anyways in repository format 3.
2024-05-24 21:33:17 +02:00
Michael Eischer 550d1eeac3 repository: remove SaveIndex from interface
The method is now only indirectly accessible via Prune or RepairIndex.
2024-05-24 21:33:17 +02:00
Michael Eischer fb59e00614 index: rewrite MasterIndex load/save test to be independent of repository 2024-05-24 21:33:17 +02:00
Michael Eischer 447b486c20 index: deduplicate index loading of check and repository 2024-05-24 21:33:17 +02:00
Michael Eischer 4df887406f repository: inline MasterIndex interface into Repository interface 2024-05-24 21:33:17 +02:00
Michael Eischer 223aa22cb0 replace some uses of restic.Repository with finegrained interfaces 2024-05-18 21:42:51 +02:00
Michael Eischer 8a425c2f0a remove usages of repo.Backend() from tests 2024-05-18 21:42:51 +02:00
Michael Eischer 6328b7e1f5 replace "too small" with "too short" in error messages 2024-05-18 19:59:26 +02:00
Michael Eischer 940a3159b5 let index.Each() and pack.Size() return error on canceled context
This forces a caller to actually check that the function did complete.
2024-04-22 22:39:32 +02:00
Michael Eischer 31624aeffd Improve command shutdown on context cancellation 2024-04-22 22:31:38 +02:00
Michael Eischer 10355c3fb6 repository: Better error message if blob is larger than 4GB 2024-04-19 22:00:35 +02:00
Michael Eischer dc441c57a7 repository: unify repository initialization in tests
Tests should use a helper from internal/repository/testing.go to
construct a Repository object.
2024-03-28 23:17:02 +01:00
Michael Eischer f8852f0eb6 repair index: fix deletion of legacy indexes 2024-03-09 18:21:22 +01:00
Michael Eischer 1a8bf358f1 index: deprecate legacy index format 2024-03-09 18:21:14 +01:00
Alexander Neumann c0514dd8ba Fix linter errors (except for tests) 2024-02-10 22:58:10 +01:00
Michael Eischer bfb56b78e1 replace some usages of restic.Repository with more specific interface
This should eventually make it easier to test the code.
2024-01-27 13:02:02 +01:00
Michael Eischer cb50832d50 index: let MasterIndex.Save also delete obsolete indexes 2024-01-27 12:51:08 +01:00
Andrea Gelmini 241916d55b
Fix typos 2023-12-06 13:11:55 +01:00
Michael Eischer c7b770eb1f convert MemorizeList to be repository based
Ideally, code that uses a repository shouldn't directly interact with
the underlying backend. Thus, move MemorizeList one layer up.
2023-10-25 23:01:35 +02:00
Michael Eischer 1b8a67fe76 move Backend interface to backend package 2023-10-25 23:00:18 +02:00
Michael Eischer 91aef00df3 check: add index loading progress bar 2023-10-01 19:55:29 +02:00
Michael Eischer 3fd0ad7448 repository: list index files only once 2023-10-01 19:53:26 +02:00
arjunajesh ed65a7dbca implement progress bar for index loading 2023-10-01 19:52:59 +02:00
Michael Eischer 2bec99dc6f master_index: fix inconsistent length blob length in test
Two blobs with the same hash must always have the same content length.
2023-08-19 20:04:25 +02:00
Michael Eischer 090f9d6237 restic: Cleanup and simplify TestCreateSnapshot 2023-07-22 19:55:57 +02:00
Michael Eischer b2ed42cec4 index: add basic hat test 2023-06-16 23:12:30 +02:00
Michael Eischer 55c21846b1 Revert "index: remove redundant storage of indexmap size"
This reverts commit f1c388c623.

For an uninitialized indexmap the returned size was `-1` which is
unexpected and could cause problems.
2023-06-08 18:08:46 +02:00
Michael Eischer ac1dfc99bb index: fix blocklist size 2023-06-02 19:39:12 +02:00
Michael Eischer 9a7056a479 index: implement indexmap.grow() without random access 2023-05-30 20:13:33 +02:00
Michael Eischer fc05e35a08 index: let indexmap.Each iterate in allocation order
Iterating through the indexmap according to the bucket order has the
problem that all indexEntries are accessed in random order which is
rather cache inefficient.

As we already keep a list of all allocated blocks, just iterate through
it. This allows iterating through a batch of indexEntries without random
memory accesses. In addition, the packID will likely remain similar
across multiple blobs as all blobs of a pack file are added as a single
batch.
2023-05-30 20:12:36 +02:00
Michael Eischer f1c388c623 index: remove redundant storage of indexmap size 2023-05-30 20:11:53 +02:00
Michael Eischer 12141afbad index: Allow inlining of HAT 2023-05-30 20:11:14 +02:00
Michael Eischer fed33295c3 index: store indexEntries in hashed array tree
This data structure reduces the wasted memory to O(sqrt(n)). The
top-layer of the hashed array tree (HAT) also has a size of O(sqrt(n)),
which makes it cache efficient. The top-layer should be small enough to
easily fit into the CPU cache and thus only adds little overhead
compared to directly accessing an index entry via a pointer.
2023-05-29 00:24:15 +02:00
Michael Eischer b217f38ee7 index: Remove pointers from within indexentrys
The indexEntry objects are now allocated in a separate array. References
to an indexEntry are now stored as array indices. This has the benefit
of allowing the garbage collector to ignore the indexEntry objects as
these do not contain pointers and are part of a single large allocation.
2023-05-29 00:24:15 +02:00
Michael Eischer 0c1240360d index: add garbage collection benchmark
Allocates an index and repeatedly triggers the GC.
2023-05-29 00:23:04 +02:00
Michael Eischer 0058745881 test: use parameter instead of hardcoded constant 2023-05-18 21:17:53 +02:00
greatroar d129baba7a repository: Reuse buffers in Repository.LoadUnpacked
This method had a buffer argument, but that was nil at all call sites.
That's removed, and instead LoadUnpacked now reuses whatever it
allocates inside its retry loop.
2023-01-30 22:01:01 +01:00
greatroar 99755c634b index: Optimize generatePackList
name                 old time/op    new time/op    delta
EncodeIndex/100-8      1.56ms ± 2%    1.48ms ± 3%   -5.37%  (p=0.000 n=10+10)
EncodeIndex/1000-8     14.5ms ± 2%    13.1ms ± 2%   -9.49%  (p=0.000 n=9+10)
EncodeIndex/10000-8     120ms ± 2%     116ms ± 2%   -3.58%  (p=0.000 n=10+10)

name                 old alloc/op   new alloc/op   delta
EncodeIndex/100-8       306kB ± 1%     275kB ± 1%  -10.28%  (p=0.000 n=10+10)
EncodeIndex/1000-8     3.69MB ±11%    2.88MB ± 5%  -22.07%  (p=0.000 n=10+9)
EncodeIndex/10000-8    35.9MB ±11%    31.9MB ±10%  -11.13%  (p=0.005 n=10+10)

name                 old allocs/op  new allocs/op  delta
EncodeIndex/100-8       3.39k ± 0%     2.39k ± 0%  -29.61%  (p=0.000 n=10+10)
EncodeIndex/1000-8      32.6k ± 0%     22.9k ± 0%  -29.63%  (p=0.000 n=10+9)
EncodeIndex/10000-8      326k ± 0%      229k ± 0%  -29.71%  (p=0.000 n=10+10)

The bulk of the allocation rate improvement comes from just removing the
debug.Log calls: every one of those copied a restic.ID to the heap.
2023-01-14 20:41:07 +01:00
greatroar c0b5ec55ab repository: Remove empty cleanup functions in tests
TestRepository and its variants always returned no-op cleanup functions.
If they ever do need to do cleanup, using testing.T.Cleanup is easier
than passing these functions around.
2022-12-11 11:06:25 +01:00
greatroar 0e8893dae9 index: Compact data structure for Index.EachByPack 2022-10-29 23:09:17 +02:00
Michael Eischer ae45f3b04f restic: Unify code to load Index/Lock/Snapshot 2022-10-21 21:25:11 +02:00
Michael Eischer 2e3f1c08c5 repository: split index into a separate package 2022-10-08 21:15:34 +02:00