
# AtOM: Anything to Ogg and Mp3

URL: https://forge.riquer.fr/p/AtOM/
Author: Vincent Riquer <vincent+prog.atom@riquer.fr>
Copyright/left: 2012-2013,2015,2025 Vincent Riquer - GPLv3 (see doc/GPL-3)
	except: transogg: WTFPL 2.0

## Dependencies
### Required:
* bash (>= 4.0)
  http://www.gnu.org/software/bash/bash.html
* SoX
  http://sox.sourceforge.net/
* SQLite
  http://www.sqlite.org/

### Optional:
* vorbis-tools
  http://www.vorbis.com/
    * `ogginfo` (Ogg Vorbis metadata)
    * `oggenc` (Ogg Vorbis encoding)
* opus-tools
  http://opus-codec.org/
    * `opusinfo` (Opus metadata)
    * `opusenc` (Opus encoding)
    * `opusdec` (Opus decoding)
* LAME MP3 Encoder
  http://lame.sourceforge.net/
    * `lame` (MP3 encoding)
* FLAC
  http://flac.sourceforge.net/
    * `metaflac` (FLAC metadata)
* Musepack
  http://www.musepack.net/
    * `mpcdec` (Musepack decoding)
* FFmpeg
  http://ffmpeg.org/
    * `ffprobe` (ID3v2, Musepack, Windows Media and video metadata)
    * `ffmpeg` (Windows Media and video decoding)

## Using the software

### Configuration:
On first run, AtOM will ask a set of questions to help you create a
configuration file.
You can run `atom -S` at any time to re-run the setup. It will be prefilled with
your current configuration.

If, however, you still want to make changes manually, please read doc/config.
There are a lot of comments in the generated config file too.

### Preparing data:
Nothing specific needs to be done. You can edit your tags, rename files, and
move them around as you see fit. However, make sure you set up your tag editor
so that it *does* update the files' timestamps: although it was initially
planned to make this optional by using checksums or tags, the idea was
abandoned due to the huge amount of IO required.
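
If a tag editor preserved a file's timestamp, refreshing the modification time
by hand should be enough for the next run to notice the change. A minimal
sketch, where `mark_changed` is a hypothetical helper and not part of AtOM:

```shell
# mark_changed FILE...
# Hypothetical helper (not shipped with AtOM): sets each file's modification
# time to "now" so that a timestamp-based scan treats it as changed.
mark_changed() {
    touch -- "$@"
}

# Example (path is illustrative):
# mark_changed "Artist/Album/01 - Track.flac"
```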

### Running:
Make sure your configuration is correct by running
```
$ atom -C
```
This will produce a human-readable dump of your current configuration.
If all settings are correct, simply run atom with no argument. Go get a beer.
Meet some friends. Go to bed. Depending on the size of your collection, the
first run can take hours, even days.
After adding/tagging/renaming/deleting files, just re-run atom. It should be
much faster this time, as only changed data will be processed.

If, for whatever reason, you need to force the regeneration of a destination,
after changing the quality settings for example, run
```
$ atom -F <destination name>
```

### Running as a cronjob:
If you want to run AtOM as a cronjob, `atom -q` will give you a cleaner output,
more suitable for mail or logfile output. You may also want to limit the size of
each batch with `-B <batch size>`. AtOM will not create or update more than
`<batch size>` destination files.

For example:
```
#m  h  dom mon dow  command
0   5   *   *   *   atom -B 1000 -q
```

## Technical details
### I. Source scan
After reading its configuration file, AtOM uses find to get a list of all files
in the source directory.
Each file is checked against the database. If it's already there, and its last
modification time is unchanged, the last_seen field is updated, and that's all.
If its mtime has changed, a mime-type scan is attempted, and the result is
stored in the database along with the updated last_seen.
If the file is new, its mime-type is scanned, and it is added to the database.
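
The three cases above can be sketched as a small decision function. This is an
illustration only: `classify_file` is a hypothetical name, and the stored mtime
is passed in as an argument instead of being read from AtOM's actual database.

```shell
# classify_file STORED_MTIME CURRENT_MTIME
# Hypothetical helper: prints what the scan would do for one file.
# STORED_MTIME is empty when the file is not in the database yet.
classify_file() {
    stored=$1
    current=$2
    if [ -z "$stored" ]; then
        echo new          # unknown file: scan mime-type, add to database
    elif [ "$stored" != "$current" ]; then
        echo changed      # mtime differs: rescan mime-type, update last_seen
    else
        echo unchanged    # only last_seen is refreshed
    fi
}

classify_file "" 1700000000            # prints "new"
classify_file 1600000000 1700000000    # prints "changed"
classify_file 1700000000 1700000000    # prints "unchanged"
```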

### II. Obsolete files
Using the last_seen field, AtOM removes from the destinations every file that is
no longer present in the source directory. AtOM never touches files not
present in its database (unless there is a filename conflict, in which case your
file *WILL* be overwritten). If you wish to clear unknown files from your
destinations, have a look at toys/cleandestinations.
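
One way to picture the selection, assuming last_seen holds a timestamp that is
refreshed on every scan: any entry whose last_seen is older than the start of
the current scan was not found this time. `list_obsolete` and the tab-separated
input format are assumptions made for this sketch, not AtOM's real schema.

```shell
# list_obsolete SCAN_START
# Hypothetical helper: reads "last_seen<TAB>path" records on stdin and prints
# the paths whose last_seen predates the start of the current scan.
list_obsolete() {
    awk -F '\t' -v start="$1" '$1 + 0 < start + 0 { print $2 }'
}

# Two records, scan started at t=150: only the first one is stale.
printf '100\told/track.ogg\n200\tkept/track.ogg\n' | list_obsolete 150
# prints "old/track.ogg"
```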

### III. Reading metadata
AtOM then tries to read metadata from each new or changed file. It also re-reads
metadata from files scanned with an older version of AtOM, if the parser for
that format has changed. The actual data read depends on the format, but at the
very least, AtOM should identify the sampling rate, bitrate and number of
channels. Unknown file types are scanned with `ffprobe`, so you may still have
some luck, depending on your FFmpeg setup.

### IV. Task creation
For every destination file whose last change field differs from that of its
corresponding source file entry, we create one or more tasks to generate or
update (overwrite) the destination file. AtOM tries to generate as few tasks as
possible, by reusing intermediate files wherever feasible. E.g. if you define
two destinations, only differing by the encoding format, we will create only one
decoding task, and two encoding tasks. On average, the number of generated tasks
should stay below 2*n (where n is destinations*file count), unless each
destination uses different sampling rate/normalization parameters.

Files matching the format, sampling-rate, channel count and bitrate constraints
are copied (symlinked where possible) during this stage ("immediate copies").
See also the higher-than setting.

The steps required for each file depend on the format and destination parameters
(resampling, and so on). Basically, one destination file requires between 1 task
(if reusing a decoded/resampled file) and 3 tasks (if the format can't be decoded
using sox and resampling/normalization is required).
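
The arithmetic can be sketched as follows, under the simplifying assumption
that each source file needs some number of shared preparation steps (decoding,
resampling/normalization) plus one encoding task per destination. `task_count`
is a hypothetical helper, not a function from AtOM.

```shell
# task_count DESTINATIONS SHARED_STEPS
# Hypothetical helper: tasks needed for one source file when SHARED_STEPS
# preparation tasks are reused by DESTINATIONS encoding tasks.
task_count() {
    echo $(( $1 + $2 ))
}

task_count 2 1   # one shared decode + two encodes: prints 3
task_count 1 2   # external decode + sox pass + one encode: prints 3
task_count 1 0   # decoded/resampled file already available: prints 1
```

In the first case, two destinations cost 3 tasks instead of the 4 that two
independent decode-and-encode chains would need.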

### V. Running tasks

#### V.1 Running tasks
Progress display:
```
L:<current load>/<max-load> W:<active workers>/<concurrency> T:<last task>/<task count> (F:<failed tasks>) <pct>% <remaining time> (A:<average task duration>s/task) ETA:<estimated time of arrival>
```
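
To make the template easier to read, here is a sketch that fills it with
made-up values. `progress_line` is a hypothetical function, the percentage is
assumed to be last task / task count, and the time formats are illustrative;
AtOM's real output may differ in those details.

```shell
# progress_line LOAD MAXLOAD WORKERS CONCURRENCY LAST TOTAL FAILED REMAINING AVG ETA
# Hypothetical helper: renders one progress line from already-computed values.
# TOTAL must be greater than zero.
progress_line() {
    pct=$(( $5 * 100 / $6 ))
    printf 'L:%s/%s W:%s/%s T:%s/%s (F:%s) %s%% %s (A:%ss/task) ETA:%s\n' \
        "$1" "$2" "$3" "$4" "$5" "$6" "$7" "$pct" "$8" "$9" "${10}"
}

progress_line 3.2 4 4 4 250 1000 2 01:15:00 6 18:42
# prints: L:3.2/4 W:4/4 T:250/1000 (F:2) 25% 01:15:00 (A:6s/task) ETA:18:42
```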

#### V.2 Renaming files
If the rename pattern (or the FAT32 compatibility setting) for one or more
destinations has changed, files already transcoded will be renamed. Otherwise,
this step is skipped.
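
As an illustration of why FAT32 compatibility can change file names: FAT
file systems reject the characters `" * : < > ? \ |` in names. The sketch below
replaces them with underscores in a single path component. `fat32_safe` and the
choice of `_` are assumptions for this example; AtOM's actual substitution
rules may be different.

```shell
# fat32_safe NAME
# Hypothetical helper: replaces characters that FAT file systems do not allow
# in a file name with "_". Works on one path component, so "/" is not handled.
fat32_safe() {
    printf '%s\n' "$1" | sed 's/[":*<>?\\|]/_/g'
}

fat32_safe 'What? Live: "A|B".ogg'
# prints: What_ Live_ _A_B_.ogg
```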

### VI. Copies
During this stage, files whose mime-types match copy_mime-type directives are
copied (symlinked where possible) to the destination.
When a rename pattern is defined and affects the path, files that are not in the
same directories as files which have been successfully transcoded are ignored,
as AtOM cannot guess what the destination path should be. Otherwise, files are
copied with their name and path unchanged.
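
The "symlinked where possible" behaviour can be pictured as a link attempt with
a copy fallback. This is only a sketch of the general idea: `link_or_copy` is a
hypothetical name, and how AtOM actually decides between linking and copying is
not shown here.

```shell
# link_or_copy SRC DST
# Hypothetical helper: tries to create a symlink at DST pointing to SRC and
# falls back to a regular copy (preserving timestamps) if linking fails,
# for example on a file system without symlink support.
# SRC should be an absolute path so the link resolves from any directory.
link_or_copy() {
    ln -s "$1" "$2" 2>/dev/null || cp -p "$1" "$2"
}

# Example (paths are illustrative):
# link_or_copy /music/src/Album/cover.jpg /music/dest/Album/cover.jpg
```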

### VII. Obsolete files 2
Whenever a file is transcoded, if it was already present in the database but its
name changed, following a rename pattern change, the old file is removed during
that stage.

## Toys
AtOM requires a database to function. Now that we have a database containing
various information about our media files, why not use it?
AtOM comes with a small set of tools in the toys/ directory. These are
documented in `toys/README`.

# Shameless Self Promotion
I am the author of free (Creative Commons CC-By-SA) music which you can stream
for free, or buy to get high quality and bonuses from
[Bandcamp](http://djblackred.bandcamp.com). If you like electronic music taking
its inspiration from Trance, Drum & Bass, Ambient and (rarely) Free Jazz, please
check it out!
Downloads are available in FLAC, Ogg, MP3, and more, and include the "source
code" (sequencer files and the like) for most tracks.
I receive 80% of the money you spend, so you won't be feeding some
greedy BigCorp producer or distributor.
And if you don't like it, you can still spread the word to friends who may like it.
You can see this as a way to thank me for this piece of code.

# Legal
Some of the format and/or tool names cited above are trademarks belonging to
their rightful owners. AtOM and its authors are not linked in any way to
those companies or individuals. Said companies neither endorse nor support
AtOM in any way.