INNFEED(1)                                                          INNFEED(1)




NAME

       innfeed - multi-host, multi-connection, streaming NNTP feeder.


SYNOPSIS

       innfeed  [  -a spool-dir ] [ -b directory ] [ -C ] [ -c filename ] [ -d
       num ] [ -e bytes ] [ -h ] [ -l filename ] [ -m ] [ -M ] [ -o bytes ]  [
       -p file ] [ -S file ] [ -x ] [ -y ] [ -z ] [ -v ] [ file ]


DESCRIPTION

       Innfeed implements the NNTP protocol for transferring news between com-
       puters.  It  handles  the  standard  IHAVE  protocol  as  well  as  the
       CHECK/TAKETHIS  streaming  extension.  Innfeed  can  feed any number of
       remote hosts at once and will open multiple connections to each host if
       configured  to  do  so. The only limitations are the process limits for
       open file descriptors and memory.

       As an alternative to using NNTP, INN may also feed an IMAP server.
       This is done by using an executable called imapfeed, which is identical
       to innfeed except for the delivery process.  The new  version  has  two
       types  of  connections:  an LMTP connection to deliver regular messages
       and an IMAP connection to handle  control  messages.  The  startinnfeed
       process  can  then  be told to start imapfeed instead of innfeed.  (See
       the INSTALL file for how to do this.)


MODES

       Innfeed has three modes of operation: channel, funnel-file and batch.

       Channel mode is used when no filename is given on the command line, the
       ‘‘input-file’’  keyword is not given in the config file, and the ‘‘-x’’
       option is not given.  In channel mode innfeed runs with stdin connected
       via  a pipe to innd. Whenever innd closes this pipe (and it has several
       reasons during normal processing to do so), innfeed will exit. It first
       will  try to finish sending all articles it was in the middle of trans-
       mitting, before issuing a QUIT command. This means innfeed may  take  a
       while  to  exit  depending  on how slow your peers are. It never (well,
       almost never) just drops the connection.

       Funnel-file mode is used when a filename is given as an argument or the
       ‘‘input-file’’  keyword  is  given  in the config file.  In funnel file
       mode it reads the specified file for the same formatted information  as
       innd  would  give in channel mode. It is expected that innd is continu-
       ally writing to this file, so when innfeed reaches the end of the  file
       it  will  check periodically for new information. To prevent the funnel
       file from growing without bounds, you will need  to  periodically  move
       the  file  to  the  side  (or simply remove it) and have innd flush the
       file. Then, after the file is flushed by innd, you can send  innfeed  a
       SIGALRM,  and  it too will close the file and open the new file created
       by innd. Something like:

              innfeed -p /var/run/news/innfeed.pid my-funnel-file &
              while true; do
                   sleep 43200
                   rm -f my-funnel-file
                   ctlinnd flush funnel-file-site
                   kill -ALRM `cat /var/run/news/innfeed.pid`
              done

       Batch mode is used when the ‘‘-x’’ flag is used.  In batch mode innfeed
       will  ignore  stdin,  and  will simply process any backlog created by a
       previously running innfeed. This mode is not normally needed as innfeed
       will take care of backlog processing.


CONFIGURATION

       Innfeed  expects  a  couple  of  things  to be able to run correctly: a
       directory where it can store backlog files and a configuration file  to
       describe which peers it should handle.

       The  configuration  file  is  described in innfeed.conf(5).  The ‘‘-c’’
       option can be used to specify a different file.

       For each peer (say, ‘‘foo’’), innfeed manages up  to  4  files  in  the
       backlog  directory: a ‘‘foo.lock’’ file, which prevents other instances
       of innfeed from interfering with this one; a ‘‘foo.input’’  file  which
       has  old  article  information  innfeed is reading for re-processing; a
       ‘‘foo.output’’ file where innfeed is writing  information  on  articles
       that  couldn’t  be  processed (normally due to a slow or blocked peer);
       and a ‘‘foo’’ file.

       This last file (‘‘foo’’) is never created by innfeed,  but  if  innfeed
       notices  it, it will rename it to ‘‘foo.input’’ at the next opportunity
       and will start reading from it. This lets you create a batch  file  and
       put  it  in  a place where innfeed will find it. You should never alter
       the .input or .output files of a running innfeed.

       The format of these last three files is:

              /path/to/article <message-id>

       This is the same as the first two fields of the  lines  innd  feeds  to
       innfeed, and the same as the first two fields of the lines of the batch
       file innd will write if innfeed is unavailable for  some  reason.  When
       innfeed  processes  its own batch files it ignores everything after the
       first two whitespace separated fields, so moving the innd-created batch
       file  to  the  appropriate  spot  will work, even though the lines have
       extra fields.

       The first field can also be a storage API  token.   The  two  types  of
       lines  can  be  intermingled;  innfeed  will use the storage manager if
       appropriate and otherwise treat the first field as a filename  to  read
       directly.
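
       The hand-off described above can be sketched as a small shell
       function. The name make_backlog, its argument order and the temporary
       file name are hypothetical and not part of innfeed. Trimming to two
       fields is optional, since innfeed ignores the extra fields anyway; the
       point of the sketch is to write the file elsewhere first and rename it
       into place, so that innfeed never picks up a half-written file.

```shell
# make_backlog BATCHFILE BACKLOGDIR PEER
# Hypothetical helper: copy the first two whitespace-separated fields
# (article path or storage token, then message-id) of every line in
# BATCHFILE to BACKLOGDIR/PEER, where innfeed is expected to find it.
make_backlog() {
    batch=$1
    dir=$2
    peer=$3
    tmp="$dir/.$peer.tmp.$$"   # assumed not to clash with innfeed's own files

    # Do not clobber a batch file that innfeed has not yet renamed.
    [ -e "$dir/$peer" ] && return 1

    # Keep only lines that carry both fields.
    awk 'NF >= 2 { print $1, $2 }' "$batch" > "$tmp" || return 1

    # Rename within the same directory so the file appears in one step.
    mv "$tmp" "$dir/$peer"
}
```

       For a peer called ‘‘foo’’ this would be called as, for example,
       make_backlog some-batch-file /path/to/backlog-dir foo, where both
       paths are placeholders.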

       Innfeed  writes  its  current status to the file ‘‘innfeed.status’’ (or
       the file given by the ‘‘-S’’ option). This file contains details on the
       process as a whole, and on each peer this instance of innfeed is manag-
       ing.

       If innfeed is told to send an article to a host  it  is  not  managing,
       then  the article information will be put into a file matching the pat-
       tern ‘‘innfeed-dropped.*’’, with part of the file name matching the pid
       of the innfeed process that is writing to it.  Innfeed will not process
       this file except to write to it. If nothing is written to the file then
       it will be removed if innfeed exits normally.


SIGNALS

       Upon  receipt of a SIGALRM innfeed will close the funnel-file specified
       on the command line, and will reopen it (see  funnel  file  description
       above).

       Innfeed will catch SIGINT and will write a large debugging snapshot of
       the state of the running system.

       Innfeed will catch SIGHUP and will reload the config  file.   See  inn-
       feed.conf(5) for more details.

       Innfeed will catch SIGCHLD and will close and reopen all backlog files.

       Innfeed will catch SIGTERM and will do an orderly shutdown.

       Upon receipt of a SIGUSR1 innfeed will increment the debugging level by
       one; receipt of a SIGUSR2 will decrement it by one. The debugging level
       starts at zero unless the ‘‘-d’’ option is used. At level zero no
       debugging information is emitted. A larger value for the level means
       more debugging information. Numbers up to 5 are currently useful.
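
       As a sketch of how these signals might be sent from a script, the
       following function reads the pid file and signals the process. The
       function name innfeed_signal is hypothetical, and the default pid file
       path is the one used in the funnel-file example above; adjust it to
       match your ‘‘-p’’ setting.

```shell
# innfeed_signal SIGNAL [PIDFILE]
# Hypothetical helper: send SIGNAL (a name without the SIG prefix, such
# as USR1, USR2, HUP or ALRM) to the process whose pid is in PIDFILE.
innfeed_signal() {
    sig=$1
    pidfile=${2:-/var/run/news/innfeed.pid}   # assumed default location

    # Fail early if the pid file is missing or unreadable.
    [ -r "$pidfile" ] || { echo "cannot read $pidfile" >&2; return 1; }

    pid=$(cat "$pidfile")
    kill -s "$sig" "$pid"
}
```

       Calling innfeed_signal USR1 twice would raise the debugging level by
       two, and innfeed_signal USR2 would lower it by one again.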


SYSLOG ENTRIES

       There are 3 different categories  of  syslog  entries  for  statistics:
       Host, Connection and Global.

       The Host statistics are generated for a given peer at regular intervals
       after the first connection is made (or, if the remote  is  unreachable,
       after  spooling  starts). The Host statistics give totals over all Con-
       nections that have been active during the given time frame. For example
       (broken here to fit the page, with ‘‘vixie’’ being the peer):

         May 23 12:49:08 data innfeed[16015]: vixie checkpoint
                 seconds 1381 offered 2744 accepted 1286
                 refused 1021 rejected 437 missing 0 spooled 990
                 on_close 0 unspooled 240 deferred 10 requeued 25
                 queue 42.1/100:14,35,13,4,24,10

       The meanings of these fields are:

       seconds   The  time  since  innfeed  connected to the host or since the
                 statistics were reset by a ‘‘final’’ log entry.

       offered   If the host is not in streaming mode, the number of IHAVE
                 commands sent to it.  In streaming mode, the number of
                 TAKETHIS commands sent while no-CHECK mode was in effect plus
                 the number of CHECK commands sent while no-CHECK mode was not
                 in effect.

       accepted  The number of articles which were sent to the remote host and
                 accepted by it.

       refused   The number of articles offered to the host that it indicated
                 it didn’t want because it had already seen the Message-ID.
                 The remote host indicates this by sending a 435 response to
                 an IHAVE command or a 438 response to a CHECK command.

       rejected  The number of articles transferred to the host  that  it  did
                 not  accept  because it determined either that it already had
                 the article or it did not want it because  of  the  article’s
                 Newsgroups:  or  Distribution: headers, etc.  The remote host
                 indicates that it is rejecting the article by sending  a  437
                 or 439 response after innfeed sent the entire article.

       missing   The number of articles which innfeed was told to offer to the
                 host but which were not present in the article spool.   These
                 articles  were  probably  cancelled or expired before innfeed
                 was able to offer them to the host.

       spooled   The number of article entries that were written to the .output
                 backlog file because the articles could be neither sent to
                 the host nor refused by it.  Articles are generally
                 spooled either because new articles are arriving more quickly
                 than they can be offered to  the  host,  or  because  innfeed
                 closed  all  the  connections  to the host and pushed all the
                 articles currently in progress to the .output backlog file.

       on_close  The number of articles that were spooled when innfeed  closed
                 all the connections to the host.

       unspooled The  number of article entries that were read from the .input
                 backlog file.

       deferred  The number of articles that the host told  innfeed  to  retry
                 later  by sending a 431 or 436 response.  Innfeed immediately
                 puts these articles back on the tail of the queue.

       requeued  The number of articles that were in progress on connections
                 when innfeed dropped those connections and put the articles
                 back on the queue.  These connections may have been broken by
                 a network problem, or may have become unresponsive, causing
                 innfeed to time them out.

       queue     The first number is the average (mean) queue size during  the
                 previous  logging interval.  The second number is the maximum
                 allowable queue size.  The third number is the percentage  of
                 the  time  that the queue was empty.  The fourth through sev-
                 enth numbers are the percentages of the time that  the  queue
                 was  >0%  to  25% full, 25% to 50% full, 50% to 75% full, and
                 75% to <100% full.  The last number is the percentage of  the
                 time that the queue was totally full.
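
       The field layout above lends itself to simple scripting. The sketch
       below pulls one named value out of a checkpoint entry; the function
       name checkpoint_field is hypothetical, and it assumes the entry
       arrives as a single line, as syslog writes it, rather than broken
       across lines as in the example above.

```shell
# checkpoint_field NAME
# Reads one checkpoint line on stdin and prints the value that follows
# the word NAME (for example "offered", "accepted" or "queue").
checkpoint_field() {
    awk -v name="$1" '{
        for (i = 1; i < NF; i++)
            if ($i == name) { print $(i + 1); exit }
    }'
}

# The example entry from the text, joined into one line.
line='May 23 12:49:08 data innfeed[16015]: vixie checkpoint seconds 1381 offered 2744 accepted 1286 refused 1021 rejected 437 missing 0 spooled 990 on_close 0 unspooled 240 deferred 10 requeued 25 queue 42.1/100:14,35,13,4,24,10'

offered=$(printf '%s\n' "$line" | checkpoint_field offered)
accepted=$(printf '%s\n' "$line" | checkpoint_field accepted)

# Share of offered articles that the peer accepted, as a whole percent.
echo "accepted $(( 100 * accepted / offered ))% of offered"
```

       With the example entry this reports that 46% of the offered articles
       were accepted (1286 out of 2744, rounded down).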

       If  the ‘‘-z’’ option is used (see below), then when the peer stats are
       generated, each Connection will log its stats  too.  For  example,  for
       connection number zero (from a set of five):

         May 23 12:49:08 data innfeed[16015]: vixie:0 checkpoint
                 seconds 1381 offered 596 accepted 274
                 refused 225 rejected 97

       If you only open a maximum of one Connection to a remote, then there
       will be a close correlation between Connection numbers and Host
       numbers, but in general you can’t tie the two sets of numbers together
       in any easy or very meaningful way.  When a Connection closes it will
       always log its stats.

       If  all  Connections for a Host get closed together, then the Host logs
       its stats as ‘‘final’’ and resets its counters. If the feed is so  busy
       that  there’s  always  at  least  one Connection open and running, then
       after some amount of time (set via the config file), the Host stats are
       logged  as  final  and  reset.  This is to make generating higher level
       stats from log files, by other programs, easier.
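
       As an illustration of the kind of higher level summary this enables,
       the sketch below totals one counter per peer across all ‘‘final’’
       entries in a log. It assumes that a final entry has the same layout
       as the checkpoint entry shown earlier, with the word ‘‘final’’ in
       place of ‘‘checkpoint’’, and that each entry is on a single line. The
       function name sum_final is hypothetical.

```shell
# sum_final FIELD
# Reads syslog lines on stdin and prints, for each peer, the total of
# FIELD (for example "accepted") across that peer's "final" entries.
# Assumes the peer name is the word immediately before "final".
sum_final() {
    awk -v name="$1" '{
        for (i = 2; i < NF; i++) {
            if ($i == "final") {
                peer = $(i - 1)
                # Scan the counters that follow the keyword.
                for (j = i + 1; j < NF; j++)
                    if ($j == name) total[peer] += $(j + 1)
                break
            }
        }
    }
    END { for (p in total) print p, total[p] }' | sort
}
```

       Because counters are reset after each final entry, adding them up in
       this way should not count any article twice. Per-Connection entries
       (such as ‘‘vixie:0’’) would be reported under their own names.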

       There is one log entry that is emitted for a Host just after  its  last
       Connection closes and innfeed is preparing to exit. This entry contains
       counts over the entire life of the process. The  ‘‘seconds’’  field  is
       from  the  first time a Connection was successfully built, or the first
       time spooling started. If a Host has been completely idle, it will have
       no such log entry.

         May 23 12:49:08 data innfeed[16015]: decwrl global
                 seconds 1381 offered 34 accepted 22
                 refused 3 rejected 7 missing 0

       The  final log entry is emitted immediately before exiting. It contains
       a summary of the statistics over the entire life of the process.

         Feb 13 14:43:41 data innfeed-0.9.4[22344]: ME global
                       seconds 15742 offered 273441 accepted 45750
                       refused 222008 rejected 3334 missing 217



OPTIONS

       -a     The ‘‘-a’’ flag is used to specify the top of the article  spool
              tree.  Innfeed  does  a chdir(2) to this directory, so it should
              probably  be  an  absolute  path.  The  default  is   <patharti-
              cles in inn.conf>.

       -b     The ‘‘-b’’ flag may be used to specify a different directory for
              backlog file storage and retrieval. If the path is relative then
              it is relative to <pathspool in inn.conf>. The default is ‘‘inn-
              feed’’.

       -c     The ‘‘-c’’ flag may be used to specify a different  config  file
              from the default value. If the path is relative then it is rela-
              tive to <pathetc in inn.conf>. The default is  ‘‘innfeed.conf’’.

       -C     The  ‘‘-C’’ flag is used to have innfeed simply check the config
              file, report on any errors and then exit.

       -d     The ‘‘-d’’ flag may be  used  to  specify  the  initial  logging
              level.  All  debugging  messages  go to stderr (which may not be
              what you want, see the ‘‘-l’’ flag below).

       -e     The ‘‘-e’’ flag may be used to specify the size limit (in bytes)
              for the .output backlog files innfeed creates. If an output file
              grows more than 10% beyond the given limit, innfeed will replace
              it with the tail of the original version. The default value is
              0, which means there is no limit.

       -h     Use the ‘‘-h’’ flag to print the usage message.

       -l     The ‘‘-l’’ flag may be used to specify a different log file
              from stderr. As innd starts innfeed with stderr attached to
              /dev/null, using this option can be useful in catching any
              abnormal error messages, or any debugging messages (all
              ‘‘normal’’ error messages go to syslog).

       -M     If innfeed has been built with mmap  support,  then  the  ‘‘-M’’
              flag turns OFF the use of mmap(); otherwise it has no effect.

       -m     The  ‘‘-m’’ flag is used to turn on logging of all missing arti-
              cles. Normally if an article is missing, innfeed keeps a  count,
              but logs no further information. When this flag is used, details
              about message-id and expected pathname are logged.

       -o     The ‘‘-o’’ flag sets a value of the maximum number of  bytes  of
              article data innfeed is supposed to keep in memory. This doesn’t
              work properly yet.

       -p     The ‘‘-p’’ flag is used to specify the filename to write the pid
              of   the   process   into.   A  relative  path  is  relative  to
              <pathrun in inn.conf>. The default is ‘‘innfeed.pid’’.

       -S     The ‘‘-S’’ flag specifies the name of the file to write the
              periodic status to. If the path is relative it is considered
              relative to <pathlog in inn.conf>. The default is
              ‘‘innfeed.status’’.

       -v     When the ‘‘-v’’ flag is given, version information is printed to
              stderr and then innfeed exits.

       -x     The ‘‘-x’’ flag is used to tell innfeed not to expect any  arti-
              cle  information from innd but just to process any backlog files
              that exist and then exit.

       -y     The ‘‘-y’’ flag is used to allow dynamic peer binding. If this
              flag is used and article information is received from innd that
              specifies an unknown peer, then the peer name is taken to be the
              IP name too, and an association with it is created. Using this
              it is possible to have only the global defaults in the
              innfeed.conf file, provided the peer name as used by innd is the
              same as the IP name. Note that running innfeed with ‘‘-y’’ and
              no peer in innfeed.conf causes innfeed to drop the first
              article.

       -z     The ‘‘-z’’ flag is used to cause each connection, in a  parallel
              feed configuration, to report statistics when the controller for
              the connections prints its statistics.
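
       For the ‘‘-e’’ option above, the point at which an .output file gets
       trimmed can be worked out with a line of shell arithmetic. This is
       only a sketch: the function name trim_threshold is hypothetical, and
       it reads ‘‘10% more’’ as the limit plus one tenth of the limit using
       integer division, which may not match innfeed’s internal rounding.

```shell
# trim_threshold LIMIT
# Prints the size in bytes above which an .output file would be trimmed
# for a given -e LIMIT, or "unlimited" when LIMIT is 0 (the default).
trim_threshold() {
    limit=$1
    if [ "$limit" -eq 0 ]; then
        echo unlimited
    else
        # LIMIT plus 10% of LIMIT (rounding here is an assumption).
        echo $(( limit + limit / 10 ))
    fi
}
```

       For example, with a limit of 1000000 bytes the file would be allowed
       to reach 1100000 bytes before being cut back.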


BUGS

       When using the ‘‘-x’’ option, the config file entry’s
       ‘‘initial-connections’’ field will be the total number of connections
       created and used, no matter how big the batch file is, and no matter
       what the ‘‘max-connections’’ field specifies. Thus a value of 0 for
       ‘‘initial-connections’’ means nothing will happen in ‘‘-x’’ mode.

       Innfeed does not automatically grab the  file  out  of  out.going--this
       needs to be prepared for it by external means.

       Probably too many other bugs to count.


FILES

       innfeed.conf   config file.
       innfeed        directory for backlog files.


HISTORY

       Written  by  James Brister <brister@vix.com> for InterNetNews.  This is
       revision 1.9, dated 2002/12/03.


SEE ALSO

       innfeed.conf(5)


