7 changes: 7 additions & 0 deletions MANIFEST
@@ -441,6 +441,7 @@ include/inn/secrets.h Header file for the secrets struct
include/inn/sequence.h Header file for sequence space arithmetic
include/inn/storage.h Header file for storage API
include/inn/timer.h Header file for generic timers
include/inn/tombstone.h Header file for cancel tombstone log
include/inn/tst.h Header file for ternary search tries
include/inn/utility.h Header file for utility functions
include/inn/vector.h Header file for vectors of strings
@@ -805,6 +806,7 @@ storage/timehash timehash storage method (Directory)
storage/timehash/method.config buildconfig definition
storage/timehash/timehash.c timehash storage routines
storage/timehash/timehash.h Header for timehash
storage/tombstone.c Cancel tombstone log helpers
storage/tradindexed tradindexed overview method (Directory)
storage/tradindexed/ovmethod.config buildconfig definition
storage/tradindexed/ovmethod.mk Make rules for tradindexed overview
@@ -931,6 +933,10 @@ tests/data/upgrade/readers.conf.ok Fixed readers.conf file
tests/data/upgrade/sasl.conf Obsolete sasl.conf config file
tests/docs Test suite for documentation (Directory)
tests/docs/pod.t.in Tests for POD formatting
tests/expire Test suite for expire (Directory)
tests/expire/tombstone-e2e.t End-to-end tests for tombstone log
tests/expire/tombstone-hisexpire-t.c HISexpire integration test for tombstone
tests/expire/tombstone-t.c Tests for tombstone library
tests/innd Test suite for innd (Directory)
tests/innd/artparse-t.c Tests for ARTparse in innd
tests/innd/chan-t.c Tests for CHAN functions in innd
@@ -997,6 +1003,7 @@ tests/perl/minimum-version.t.in Tests for not too-new features of Perl
tests/runtests.c The test suite driver program
tests/storage Test suite for storage (Directory)
tests/storage/archive.t Tests for backends/archive
tests/storage/cancel-tombstone-t.c Tests for SMcanceltombstone
tests/storage/makehistory.t Tests for expire/makehistory
tests/storage/sm.t Tests for frontends/sm
tests/tap Helper scripts for TAP (Directory)
12 changes: 12 additions & 0 deletions doc/pod/expire.pod
@@ -175,6 +175,18 @@ is specified, the file I<pathetc>/expire.ctl is read.

=back

=head1 TOMBSTONE LOG

When I<expiretombstone> is enabled in F<inn.conf>, B<expire> consumes
the per-cycle deletion logs produced by B<expireover>, B<innd>, and
B<sm -r> (F<I<pathdb>/expireover.tombstone> and
F<I<pathdb>/cancels.tombstone>) so it can drop history entries for
those articles without issuing a C<SMretrieve(RETR_STAT)> call per
article. An empty tombstone is treated as "no cancels this cycle"
and the slow scan is skipped entirely. See inn.conf(5) under
I<expiretombstone> for the file lifecycle, locking model, and
recovery behavior.

=head1 HISTORY

Written by Rich $alz <rsalz@uunet.uu.net> for InterNetNews. Converted to
8 changes: 8 additions & 0 deletions doc/pod/expireover.pod
@@ -21,6 +21,14 @@ F<expire.ctl>. Otherwise it only removes overview entries for articles
that have already been removed by some other process, and B<-e>, B<-k>,
B<-N>, B<-p>, B<-q>, B<-w>, and B<-z> are all ignored.

When I<expiretombstone> is enabled in F<inn.conf>, B<expireover>
appends each cancelled token to F<I<pathdb>/expireover.tombstone.NEW>
under an exclusive POSIX lock and atomically renames the file into
place on a clean run. In delayrm mode (B<-z>), the rename is
performed by B<expirerm> after B<fastrm> succeeds. The next
B<expire> run consumes this log to skip per-article storage existence
checks. See inn.conf(5) under I<expiretombstone>.
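
The write-then-rename pattern described above can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the code in this change: the helper name C<publish_tombstone>, the one-token-per-line format, and the C<fsync> before the rename are all invented for the example.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not an INN function): append tokens to PATH.NEW
 * under an exclusive fcntl lock, then rename the file to PATH.  Returns 0
 * on success and -1 on failure; PATH is left untouched on failure. */
static int
publish_tombstone(const char *path, const char *const *tokens, size_t count)
{
    char tmp[1024];
    struct flock fl;
    size_t i;
    int fd, n;

    n = snprintf(tmp, sizeof(tmp), "%s.NEW", path);
    if (n < 0 || (size_t) n >= sizeof(tmp))
        return -1;
    fd = open(tmp, O_WRONLY | O_CREAT | O_APPEND, 0664);
    if (fd < 0)
        return -1;

    memset(&fl, 0, sizeof(fl));
    fl.l_type = F_WRLCK;    /* exclusive lock */
    fl.l_whence = SEEK_SET; /* l_start = 0 and l_len = 0: the whole file */
    if (fcntl(fd, F_SETLKW, &fl) < 0)
        goto fail;

    for (i = 0; i < count; i++) {
        size_t len = strlen(tokens[i]);

        if (write(fd, tokens[i], len) != (ssize_t) len
            || write(fd, "\n", 1) != 1)
            goto fail;
    }

    /* Assumption: flush the data before publishing so that a crash cannot
     * expose a renamed but incomplete file. */
    if (fsync(fd) < 0)
        goto fail;
    if (close(fd) < 0)
        return -1;

    /* rename(2) replaces PATH atomically: a reader sees either the old
     * file or the new one, never a partial one. */
    return rename(tmp, path);

fail:
    close(fd);
    return -1;
}
```

On any failure the F<.NEW> file is left behind and the published name is not replaced, which mirrors the "renamed only on a clean run" behavior described above.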

When I<groupbaseexpiry> is set, the default behavior of B<expireover> is
to remove the article from the spool once it expires out of all of the
newsgroups to which it was crossposted. The article is, however, removed
124 changes: 120 additions & 4 deletions doc/pod/inn.conf.pod
@@ -632,6 +632,82 @@ in F<news.err> at startup and the unrecognized fields will be discarded.
Moreover, the deprecated C<Bytes> and C<Lines> header fields, already present
in the standard overview fields as metadata items, cannot be added.

=item I<expiretombstone>

Whether INN tools record cancellation tombstones so a subsequent
B<expire> run can skip per-article storage existence checks. When
enabled (and I<groupbaseexpiry> is also true), two log files in
I<pathdb> capture every cancellation:

=over 4

=item F<I<pathdb>/expireover.tombstone>

Written by B<expireover> after each successful B<SMcancel> in
group-based expiry, atomically renamed into place on a clean run.
When B<expireover> runs with C<-z> (delayed removal), the
B<SMcancel> calls are deferred to B<fastrm> via B<expirerm>;
B<expireover> writes the entries up front and B<expirerm> performs
the atomic rename after B<fastrm> succeeds, so the same speedup
applies to delayrm setups. After B<expire> consumes this file it
is unlinked and re-seeded as a header-only successor, matching
the on-disk presence of F<cancels.tombstone>; the next
B<expireover> run overwrites the seeded file with its full
content.

=item F<I<pathdb>/cancels.tombstone>

Appended continuously by B<innd> when it processes cancel control
messages, and by B<sm -r> for manual cancellations. Appenders take
a shared fcntl POSIX lock; B<expire> snapshots the file by atomic
rename to F<cancels.tombstone.processing> under an exclusive fcntl
lock, unlinks the snapshot after a successful consume, then
recreates F<cancels.tombstone> as a header-only file under an
exclusive lock so B<nnrpd>'s per-connection fast path stays
active through quiet periods between cancels. If an appender
creates a new live file and writes a cancel to it between the rename
and the recreate, that content is preserved verbatim below the
restored header. Append atomicity relies on POSIX guaranteeing that
each C<write(2)> to a regular file opened with C<O_APPEND> is atomic
with respect to other writers; this holds for a single C<write> of
any size on local filesystems. Cross-client atomicity over NFS is
not guaranteed: if I<pathdb> is on NFS, lines from concurrent writers
on different clients can in theory interleave. In practice INN's
cancel sources (B<innd> on a single host plus occasional B<sm -r> on
the same host) write from one client.

=back
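
The appender side of F<cancels.tombstone> can be sketched as below. This is a minimal illustration, not the helper added by this change: the function name C<tombstone_append> and the bare one-token-per-line format are assumptions, and the header line mentioned above is not modelled.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not an INN function): append one token line to a
 * tombstone log.  Returns 0 on success and -1 on failure. */
static int
tombstone_append(const char *path, const char *token)
{
    char line[128];
    struct flock fl;
    int fd, len, ok;

    len = snprintf(line, sizeof(line), "%s\n", token);
    if (len < 0 || (size_t) len >= sizeof(line))
        return -1;

    /* O_RDWR rather than O_WRONLY: an fcntl read (shared) lock requires a
     * descriptor that is open for reading. */
    fd = open(path, O_RDWR | O_APPEND | O_CREAT, 0664);
    if (fd < 0)
        return -1;

    memset(&fl, 0, sizeof(fl));
    fl.l_type = F_RDLCK;    /* shared: many appenders may hold it at once */
    fl.l_whence = SEEK_SET; /* l_start = 0 and l_len = 0: the whole file */
    if (fcntl(fd, F_SETLKW, &fl) < 0) {
        close(fd);
        return -1;
    }

    /* Emit the whole line with one write() so that O_APPEND positions and
     * writes it at end-of-file in a single step. */
    ok = (write(fd, line, (size_t) len) == (ssize_t) len);

    close(fd); /* closing the descriptor also releases the fcntl lock */
    return ok ? 0 : -1;
}
```

The shared lock lets appenders run concurrently while still blocking against the exclusive lock a consumer takes around its rename, which is the division of roles the text describes.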

The next B<expire> invocation loads both files into a single hashset
and treats every article in either log as already gone, avoiding an
C<SMretrieve(RETR_STAT)> call per history entry. For storage methods
where the stat is a file-system call (tradspool, timehash) this
replaces one C<access(2)> call per history entry with one in-memory
hash lookup; for storage methods that self-expire (CNFS) the stat
call remains because articles can vanish through wrap-around without
going through B<SMcancel>.

In normal operation every cancellation path participates in
tombstone tracking, so all articles removed from the spool are
recorded. Residual orphans can only accumulate from events outside
the tracked paths: a process crash in the narrow window between
B<SMcancel> and the tombstone append, manual filesystem-level
deletes that bypass B<sm -r>, or filesystem corruption. When such
orphans do appear they are harmless (B<nnrpd> returns "no such
article" to readers that hit them) and exist only as small history
entries. No regular reconciliation cadence is needed. If an
operator suspects orphan accumulation after admin intervention or a
storage incident, B<expire> can be re-run with this option disabled
in F<inn.conf> to perform an exhaustive C<SMretrieve(RETR_STAT)>
scan; this is an exceptional operation, not a scheduled one.

Footprint: roughly 38 bytes per entry on disk and roughly 50 bytes per
entry in B<expire>'s hash table, so one million cancels per run
produce a tombstone of about 38 MB and a hash table of about 50 MB.
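
The footprint arithmetic above can be written out as a small capacity-planning sketch. The helper and macro names are hypothetical and not part of INN; the per-entry constants are the approximate figures quoted in the text.

```c
#include <stdint.h>

/* Approximate per-entry costs quoted in the text above. */
#define TOMBSTONE_DISK_BYTES_PER_ENTRY 38u
#define TOMBSTONE_HASH_BYTES_PER_ENTRY 50u

/* Hypothetical helper: estimated on-disk size of a tombstone log. */
static uint64_t
tombstone_disk_bytes(uint64_t cancels)
{
    return cancels * TOMBSTONE_DISK_BYTES_PER_ENTRY;
}

/* Hypothetical helper: estimated size of the in-memory hash table. */
static uint64_t
tombstone_hash_bytes(uint64_t cancels)
{
    return cancels * TOMBSTONE_HASH_BYTES_PER_ENTRY;
}
```

For one million cancels these return 38,000,000 and 50,000,000 bytes respectively, matching the figures given above.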

This is a boolean value and the default is false; sites should opt in
after validating the option's behavior against their workload.

=item I<groupbaseexpiry>

Whether to enable newsgroup-based expiry. If set to false, article expiry
@@ -890,10 +966,50 @@ Whether B<nnrpd> should check the existence of an article before listing it
as present in response to an NNTP command (HDR, LISTGROUP, NEWNEWS, OVER,
XPAT). The primary use of this setting is to prevent B<nnrpd> from returning
information about articles which are no longer present on the server but which
still have overview data available. Checking the existence of articles before
returning overview information slows down the overview commands, but reduces
the number of "article is missing" errors seen by the client. This is a
boolean value and the default is true.
still have overview data available. Checking existence with an unconditional
C<SMretrieve> slows down the overview commands; with I<expiretombstone> also
enabled (see below) the check uses an in-memory hash lookup instead, which on
tradspool/timehash/timecaf backends costs far less than the storage call.
Either way, enabling this reduces the number of "article is missing" errors
seen by the client. This is a boolean value and the default is true.

When I<expiretombstone> is also enabled, B<nnrpd> consults the
F<I<pathdb>/cancels.tombstone> log on the article-existence check
path: a token recorded as cancelled is reported as gone without an
C<SMretrieve> call, and a token absent from the tombstone is
trusted to still exist (skipping the syscall) for storage methods
that do not self-expire. Self-expiring backends (CNFS) still go
through C<SMretrieve> because cyclic-buffer wrap-around bypasses
the tombstone. The tombstone is loaded lazily on first use per
connection and checked with C<stat> on each call; the parsed
hashset is rebuilt only when the file's mtime or size changes.
Repeated C<stat> calls on the same path are served from the kernel's
dentry cache and are far cheaper than the per-article syscalls the
fast path elides, so cancellations recorded by other processes become
visible to long-lived connections on the next existence check.
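
The stat-based refresh described above might look like the following sketch. The structure and function names are hypothetical, not taken from B<nnrpd>, and the actual parsing of the file into a hashset is left to the caller.

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical per-connection cache state; not an INN structure. */
struct tombstone_cache {
    bool loaded;  /* true once the file has been parsed at least once */
    time_t mtime; /* st_mtime seen at the last parse */
    off_t size;   /* st_size seen at the last parse */
};

/* Returns 1 when the caller should (re)parse PATH, 0 when the cached
 * hashset is still current, and -1 when PATH cannot be stat'ed (the
 * caller would then fall back to the slow path). */
static int
tombstone_refresh_needed(struct tombstone_cache *cache, const char *path)
{
    struct stat st;

    if (stat(path, &st) < 0) {
        cache->loaded = false;
        return -1;
    }
    if (cache->loaded && st.st_mtime == cache->mtime
        && st.st_size == cache->size)
        return 0;

    /* Record what is about to be parsed so the next call can compare. */
    cache->loaded = true;
    cache->mtime = st.st_mtime;
    cache->size = st.st_size;
    return 1;
}
```

Comparing the size as well as the mtime matters in this sketch because C<st_mtime> has one-second resolution: an append within the same second as the previous parse still changes the size and triggers a reload.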

Each B<nnrpd> connection holds an independent copy of the parsed
hashset, costing roughly 50 bytes per cancel; sites with very large
cancel volumes between B<expire> runs and many concurrent readers
should size memory accordingly. The only accuracy lost relative to
the unconditional C<SMretrieve> path comes from out-of-band events
the tombstone cannot see (manual filesystem deletes that bypass
B<sm -r>, filesystem corruption); admin-initiated B<sm -r> and
B<innd> cancels are tracked.

Important: on non-self-expiring backends the fast path treats a
tombstone-miss as proof that the article is still on disk, without
verifying. Admins must remove articles via B<sm -r> (which records
the cancel in the tombstone), not by direct C<rm> on the spool, or
readers will be told a deleted article still exists until the next
B<expireover> reconciles overview.

Note that the fast path is gated on both this setting and
I<expiretombstone>; both must be true for it to take effect. The
C<artcheck> counter logged when I<nnrpdoverstats> is enabled reflects
only the slow C<SMretrieve> path, so the syslog field will appear
smaller when the fast path is doing most of the work.
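
The decision rules in the preceding paragraphs can be summarized in a small sketch. The enum and function names are hypothetical and do not come from B<nnrpd>; the sketch assumes the existence check itself is enabled and that a tombstone hit is reported as gone on every backend, which is one reading of the text above.

```c
#include <stdbool.h>

/* Hypothetical outcome of an article-existence check. */
enum art_verdict {
    ART_GONE,          /* report the article as missing */
    ART_PRESENT,       /* trust that the article still exists */
    ART_NEEDS_RETRIEVE /* fall back to an SMretrieve existence check */
};

/* fast_path_enabled: both the existence check and expiretombstone are on.
 * in_tombstone:      the token was found in the cancel tombstone.
 * self_expiring:     the backend can drop articles on its own (CNFS). */
static enum art_verdict
art_check_verdict(bool fast_path_enabled, bool in_tombstone,
                  bool self_expiring)
{
    if (!fast_path_enabled)
        return ART_NEEDS_RETRIEVE;
    if (in_tombstone)
        return ART_GONE;
    if (self_expiring)
        return ART_NEEDS_RETRIEVE; /* wrap-around bypasses the tombstone */
    return ART_PRESENT;            /* tombstone miss is trusted */
}
```

Only the C<ART_NEEDS_RETRIEVE> outcome would reach the slow path, which is why a counter that tracks that path alone shrinks as the fast path takes over.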

You may also want to see the I<groupexactcount> parameter in readers.conf(5)
which controls the computing of the estimated article count returned in NNTP
7 changes: 7 additions & 0 deletions doc/pod/sm.pod
@@ -51,6 +51,13 @@ will delete the article out of the news spool and it will not subsequently
be retrievable by any part of INN. It's equivalent to C<ctlinnd cancel>
except it takes a storage API token instead of a message-ID.

When the I<expiretombstone> setting in F<inn.conf> is true, B<sm -r>
also appends the cancelled token to F<I<pathdb>/cancels.tombstone>
so a later B<expire> run can drop the corresponding history entry
without a per-article storage check. Append failures are logged
but do not affect the cancellation itself. See inn.conf(5) under
I<expiretombstone> for the full mechanism.

=item B<-H>

Retrieve only the headers of the article rather than the entire article.