# Black Star Information Extraction

The Black Star Information Extraction (BSIE) package provides a pipeline that extracts metadata and content-derived features from files and stores that information in a BSFS storage.

## Installation

You can install BSIE via pip. BSIE supports various file formats, and this support depends on many external packages. BSIE lets you control which of these you want to install. If you choose not to install support for some file types, BSIE will show a warning and skip them. All other formats will be processed normally.

To install only the minimally required software, use:

    $ pip install --extra-index-url https://pip.bsfs.io bsie

To install all dependencies, use the following shortcut:

    $ pip install --extra-index-url https://pip.bsfs.io bsie[all]

To install a subset of all dependencies, modify the extras part (``[image,preview]``) of the following command to your liking:

    $ pip install --extra-index-url https://pip.bsfs.io bsie[image,preview]

Currently, BSIE provides the following extra flags:

* image: Read data from image files. Note that you may also have to install ``exiftool`` through your system's package manager (e.g. ``sudo apt install exiftool``).
* preview: Create previews from a variety of files. Note that support for various file formats also depends on which system packages you have installed. You should at least install ``imagemagick`` through your system's package manager (e.g. ``sudo apt install imagemagick``). See [Preview Generator](https://github.com/algoo/preview-generator) for more detailed instructions.
* features: Extract feature vectors from images.

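
The *image* and *preview* extras rely on system tools that pip does not install. The following is a minimal sketch, not part of BSIE, of how you could check whether such tools are on your ``PATH`` before running the pipeline. The helper name ``missing_tools`` is hypothetical, and ``convert`` is only an assumption about the command name your ImageMagick package provides (newer releases may ship ``magick`` instead).

```python
import shutil


def missing_tools(tools):
    """Return the names from `tools` that cannot be found on PATH.

    Hypothetical helper for illustration; BSIE itself does not provide it.
    """
    return [tool for tool in tools if shutil.which(tool) is None]


# Tool names are assumptions based on the notes above:
# "exiftool" for the image extra, "convert" (ImageMagick) for previews.
REQUIRED_TOOLS = ["exiftool", "convert"]
```

Calling ``missing_tools(REQUIRED_TOOLS)`` returns an empty list if every tool was found. Otherwise it lists the tools you may still need to install through your system's package manager.
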
## Development

Set up a virtual environment:

    $ virtualenv env
    $ source env/bin/activate

Install BSIE as editable from the git repository:

    $ git clone https://git.bsfs.io/bsie.git
    $ cd bsie
    $ pip install -e .[all]

If you want to develop (*dev*), run the tests (*test*), edit the documentation (*doc*), or build a distributable (*build*), install BSIE with the respective extras (in addition to the file format extras):

    $ pip install -e .[dev,doc,build,test]

Alternatively, you can manually install the following packages alongside BSIE:

    $ pip install coverage mypy pylint
    $ pip install rdflib requests types-PyYAML
    $ pip install sphinx sphinx-copybutton furo
    $ pip install build

To ensure code style discipline, run the following commands:

    $ coverage run ; coverage html ; xdg-open .htmlcov/index.html
    $ pylint bsie
    $ mypy

To build the package, run:

    $ python -m build

To run only the tests (without coverage), run the following command from the **test folder**:

    $ python -m unittest

To build the documentation, run the following commands from the **doc folder**:

    $ sphinx-apidoc -f -o source/api ../bsie/ --module-first -d 1 --separate
    $ make html
    $ xdg-open build/html/index.html
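
If you prefer to run the code style commands above in one go, a small wrapper script can help. The following is only a sketch under the assumption that ``coverage``, ``pylint``, and ``mypy`` are installed and that you run it from the repository root. The names ``CHECKS`` and ``run_checks`` are hypothetical and not part of BSIE.

```python
import subprocess

# Commands copied from the code style section above
# (opening the HTML report in a browser is left out).
CHECKS = [
    ["coverage", "run"],
    ["coverage", "html"],
    ["pylint", "bsie"],
    ["mypy"],
]


def run_checks(commands):
    """Run each command in order and return those that failed.

    A command counts as failed if it exits with a non-zero status
    or if its executable cannot be found.
    """
    failed = []
    for command in commands:
        try:
            result = subprocess.run(command, check=False)
        except FileNotFoundError:
            # The tool is not installed or not on PATH.
            failed.append(command)
            continue
        if result.returncode != 0:
            failed.append(command)
    return failed
```

Calling ``run_checks(CHECKS)`` runs every check even if an earlier one fails, which mirrors the ``;``-separated commands above, and returns the list of commands that need attention.
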