# LaTeX-Indexer

The LaTeX-Indexer is a free, open-source, platform-independent tool that automates the generation of indexes for LaTeX documents. It extracts words from `.tex` files, generates frequency distributions using PGFplots, lets users select and tag terms (including variants and sub-variants), and compiles the indexed document with MakeIndex. It is released under GPL-3 and, by the authors' estimate, saves approximately 80% of indexing time.

## Table of Contents
- [Installation](#installation)
- [Usage](#usage)
- [Commands](#commands)
- [Tips](#tips)
- [Limitations](#limitations)
- [Future work](#future-work)
- [Contributing](#contributing)
- [License](#license)
- [Version](#version)
- [Contact](#contact)
- [Authors](#authors)

## Installation

On macOS, install the required dependencies and the LaTeX-Indexer as follows:

```bash
brew install pandoc
curl -O https://ch.mirrors.cicku.me/ctan/indexing/latex-indexer.zip
unzip latex-indexer.zip
```

Navigate to the directory and compile the application:

```bash
cd path/to/latex-indexer
mvn clean package
```

Finally, navigate to the `target` directory to run the application. Alternatively, move the generated `.jar` file elsewhere and run it from there.

Ensure you have the following prerequisites:
- An up-to-date LaTeX installation
- Pandoc
- Java version 21 or higher
- Apache Maven (used by the `mvn clean package` build step above)

## Usage

Run the LaTeX-Indexer with:

```bash
java -jar indexer.jar /path/to/your/file
```

## Commands

The LaTeX-Indexer supports the following commands, entered at the prompt:

- **h, help**: Displays a list of all available commands with brief descriptions.
- **p, parse**: Re-parses the `.tex` document to update the word list. This runs automatically at startup but can be rerun to refresh the list.
- **l, list**: Lists parsed words with optional parameters:
  - `-n <number>` (number of words to display, default 20)
  - `-c <a|f>` (sort alphabetically or by frequency, default frequency)
  - `-p <prefix>` (filter words by prefix)
  - `-r <true|false>` (reverse order, default false)
  - `-h` for detailed help
- **g, generate**: Creates a `.tex` file with a frequency plot using PGFplots, rendered with pdfLaTeX. Supports the same parameters as `list`, plus:
  - `-f <filename>` for a custom plot file name
  - `-h` for details
- **s, subvariant**: Defines words as subvariants of a specified word, indexing them under the main word. Enter as `s <word1> <word2> ...`, then provide subvariant words when prompted. Use `-h` for help.
- **v, variation**: Defines words as variations of a specified word, indexing their occurrences under the main word. Enter as `v <word1> <word2> ...`, then provide variation words. Use `-h` for help.
- **a, add**: Automatically adds specified words to the index. Enter as `a <word1> <word2> ...`. The tool checks if words exist in the document before adding them. Use `-h` for help.
- **i, interactive**: Interactively adds a single word to the index, prompting the user to confirm each occurrence. Enter as `i <word>`. For each occurrence, the tool shows the line and context, allowing the user to choose `[Y]es`, `[N]o`, or `[A]bort`. Use `-h` for help.
- **q, quit**: Exits the program.
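
To make the `list` options more concrete, the following sketch shows one way `-n`, `-c`, `-p`, and `-r` could combine. It is not taken from the LaTeX-Indexer source: the class `ListOptionsSketch`, the method `select`, and the tie-breaking rule (equal frequencies sorted alphabetically) are assumptions made for illustration only.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; the real tool may order and filter words differently.
final class ListOptionsSketch {

    /**
     * @param counts    word -> number of occurrences in the document
     * @param limit     maximum number of words to return (like -n)
     * @param criterion 'a' for alphabetical, anything else for frequency (like -c)
     * @param prefix    only words starting with this prefix are kept (like -p)
     * @param reverse   reverses the chosen order (like -r)
     */
    static List<String> select(Map<String, Integer> counts, int limit,
                               char criterion, String prefix, boolean reverse) {
        Comparator<Map.Entry<String, Integer>> alphabetical = Map.Entry.comparingByKey();
        // Assumption: "by frequency" means most frequent first, ties broken alphabetically.
        Comparator<Map.Entry<String, Integer>> byFrequency =
                Map.Entry.<String, Integer>comparingByValue().reversed().thenComparing(alphabetical);

        Comparator<Map.Entry<String, Integer>> order = criterion == 'a' ? alphabetical : byFrequency;
        if (reverse) {
            order = order.reversed();
        }

        return counts.entrySet().stream()
                .filter(entry -> entry.getKey().startsWith(prefix))
                .sorted(order)
                .limit(limit)
                .map(Map.Entry::getKey)
                .toList();
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = Map.of("index", 7, "indent", 2, "latex", 7, "macro", 4);
        // Defaults: 20 words, by frequency, no prefix, not reversed.
        System.out.println(select(counts, 20, 'f', "", false)); // [index, latex, macro, indent]
        // Two words starting with "in", alphabetical order reversed.
        System.out.println(select(counts, 2, 'a', "in", true)); // [index, indent]
    }
}
```

Note that the filter is applied before the limit, so `-n` counts only the words that match the prefix.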

## Tips

While indexing an entire book at once is possible, the authors recommend processing individual chapter files for better manageability.

## Limitations

The LaTeX-Indexer relies on Pandoc to parse documents. Pandoc is versatile and supports a large number of markup formats, but it does not recognize every LaTeX package. When it encounters an unknown package, it ignores the code written for that package, i.e. the package's environments. In that case, Pandoc prints an extensive warning to the command line when the program starts, to let the user know.
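
A program that calls Pandoc can detect this situation by inspecting Pandoc's standard error output, where warnings appear as lines starting with `[WARNING]`. The sketch below is an assumption about how such a check could look, not the LaTeX-Indexer's actual code: the class `PandocWarningSketch`, its methods, and the choice of `-t plain` as the output format are hypothetical, and `pandoc` is assumed to be on the `PATH`.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical sketch; the tool's real Pandoc invocation is not documented here.
final class PandocWarningSketch {

    /** Plain-text output of Pandoc plus any warning lines it printed. */
    record Result(String text, List<String> warnings) {}

    /** Keeps only the lines of stderr that Pandoc marks as warnings. */
    static List<String> extractWarnings(String stderr) {
        return stderr.lines()
                .filter(line -> line.startsWith("[WARNING]"))
                .toList();
    }

    /** Runs `pandoc -f latex -t plain <file>` and collects stdout and warnings. */
    static Result convert(Path texFile) throws IOException, InterruptedException {
        // stderr goes to a temporary file so that reading stdout cannot block on it.
        Path errorLog = Files.createTempFile("pandoc", ".err");
        try {
            Process process = new ProcessBuilder(
                    "pandoc", "-f", "latex", "-t", "plain", texFile.toString())
                    .redirectError(errorLog.toFile())
                    .start();
            String text = new String(
                    process.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
            process.waitFor();
            return new Result(text, extractWarnings(Files.readString(errorLog)));
        } finally {
            Files.deleteIfExists(errorLog);
        }
    }

    public static void main(String[] args) throws Exception {
        Result result = convert(Path.of(args[0]));
        result.warnings().forEach(warning -> System.err.println("Pandoc reported: " + warning));
        System.out.println(result.text().length() + " characters of plain text extracted");
    }
}
```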

## Future Work

A possible future workaround for the limitation described above is to catch such a warning, run latexmk on the specified file, and then parse the content of the resulting PDF (Pandoc itself cannot read PDF input, so a separate text-extraction step would be needed). The `.tex` files would then have to be searched with a nested loop to find the occurrences of the selected words when adding the `\index{}` macro, because the PDF carries no information about the words' locations in the source file.
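
The source search described above could look roughly like the following sketch. It is only an illustration under stated assumptions: the class `IndexInsertionSketch` and its methods are hypothetical, matching is case-sensitive and limited to whole words, and the inner loop over words is replaced by a single regular expression with one alternative per word. It does not skip comments, math, or existing `\index{}` arguments, which a real implementation would have to handle.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical sketch of tagging word occurrences directly in the .tex source.
final class IndexInsertionSketch {

    /** Builds one pattern that matches any of the words as a whole word. */
    static Pattern wholeWords(List<String> words) {
        if (words.isEmpty()) {
            throw new IllegalArgumentException("at least one word is required");
        }
        String alternatives = words.stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        // Not preceded by a backslash or letter (so macro names such as \tree are skipped)
        // and not followed by a letter (so "trees" does not match "tree").
        return Pattern.compile("(?<![\\\\\\p{L}])(?:" + alternatives + ")(?!\\p{L})");
    }

    /** Appends \index{word} after every match in a single source line. */
    static String tagLine(String line, Pattern pattern) {
        return pattern.matcher(line).replaceAll(match ->
                Matcher.quoteReplacement(match.group() + "\\index{" + match.group() + "}"));
    }

    public static void main(String[] args) {
        Pattern pattern = wholeWords(List.of("tree", "graph"));
        List<String> source = List.of(
                "A tree is a connected graph.",
                "Macros such as \\tree and longer words such as trees stay untouched.");
        // Outer loop over the source lines; the pattern covers all words at once.
        for (String line : source) {
            System.out.println(tagLine(line, pattern));
        }
        // A tree\index{tree} is a connected graph\index{graph}.
        // Macros such as \tree and longer words such as trees stay untouched.
    }
}
```

Because every line is rewritten in a single pass, the text inserted by one replacement is never scanned again, which avoids tagging the word inside a freshly added `\index{}` argument.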

## Contributing

Contributions are welcome! To contribute:
1. Fork the repository.
2. Create a new branch (`git checkout -b feature/your-feature`).
3. Commit your changes (`git commit -m 'Add your feature'`).
4. Push to the branch (`git push origin feature/your-feature`).
5. Open a pull request.

## License

This project is licensed under the GPL-3 License. See the [LICENSE](LICENSE) file for details.

## Version

1.0.1

## Contact

For questions or feedback, you are welcome to open an issue [here](https://gitlab.ti.bfh.ch/texnicians/latex-indexer)!

## Authors

David Degenhardt and Frederik Leyvraz, 2025
