aboutsummaryrefslogtreecommitdiffstats
path: root/doc/source/concepts.rst
diff options
context:
space:
mode:
Diffstat (limited to 'doc/source/concepts.rst')
-rw-r--r--doc/source/concepts.rst98
1 files changed, 98 insertions, 0 deletions
diff --git a/doc/source/concepts.rst b/doc/source/concepts.rst
new file mode 100644
index 0000000..9c2ed43
--- /dev/null
+++ b/doc/source/concepts.rst
@@ -0,0 +1,98 @@
+
+Core concepts
+=============
+
+In the following, we present a few core concepts that should help in understanding the BSFS operations and codebase.
+
+
+Graph storage
+-------------
+
+`RDF`_ describes a network or graph like the file graph as a set of
+*(subject, predicate, object)* triples.
+*Subject* is the identifier of the source node,
+*object* is the identifier of the target node (or a literal value),
+and *predicate* is the type of relation between the source node and the target.
+As suggested by `RDF`_, we use URIs to identify nodes and predicates.
+For example, a triple that assigns me as the author of a file could look like this::
+
+ <http://example.com/file#1234> <https://bsfs.io/schema/Entity#author> <http://example.com/me>
+
+Note that alternatively, the *object* could also be a literal value ("me")::
+
+ <http://example.com/file#1234> <https://bsfs.io/schema/Entity#author> "me"
+
+There are a number of graph databases that support this or an analoguous paradigm,
+such as `RDFLib`_, `Blazegraph`_, `TypeDB`_, `Titan`_,
+and `many more <https://en.wikipedia.org/wiki/Graph_database#List_of_graph_databases>`_.
+BSFS uses such a third-party graph database to store its file graph.
+
+As usual in database systems,
+we have to distinguish schema data (that coverns the structure of the storage)
+from instance data (the actual database content).
+Similar to relational database systems,
+both kinds of data can be represented as triples,
+and subsequently stored within the same graph storage
+(although one might need to separate them logically).
+In BSFS, we employ an explicit schema (see next section) that is managed alongside the data.
+
+
+
+Schema
+------
+
+BSFS ensures consistency across multiple distributed client applications
+by maintaining an explicit schema that governs node types and predicates.
+Furthermore, exposing the schema allows client to run a number of compatibility and validity checks
+locally, and a graph database may use the schema to optimize its storage or operations.
+
+In BSFS, the schema is initially provided by the system administrator
+(usually in the `Turtle`_ format)
+and subsequently stored by the backend.
+The default schema defines three root types
+(``bsfs:Node``, ``bsfs:Predicate``, and ``bsfs:Literal``),
+and BSFS expects any node, literal, or predicate to be derived from these roots.
+
+For example, a new predicate can be defined like so::
+
+ # define some abbreviations
+ prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
+ prefix bsfs: <http://schema.bsfs.io/>
+ prefix bse: <http://schema.bsfs.io/Entity#>
+
+ # define a node type
+ bsfs:Entity rdfs:subClassOf bsfs:Node .
+
+ # define a literal type
+ xsd:string rdfs:subClassOf bsfs:Literal .
+
+ # define a predicate ("author of a node")
+ bse:author rdfs:subClassOf bsfs:Predicate ;
+ rdfs:domain bsfs:Entity ;
+ rdfs:range xsd:string .
+
+BSFS checks all requests and rejects queries or operations that violate the schema.
+
+
+Querying
+--------
+
+BSFS at its core is not much more than a translator from a user query into a graph database query.
+It operates directly on three abstract syntax trees (AST),
+to run fetch, search, or sort, queries respectively.
+By not using an existing query language,
+we avoid an unnecessary and possibly expensive parsing step.
+Some routines create an AST internally (e.g., :func:`bsfs.graph.graph.Graph.all`),
+others accept an user-defined AST (e.g., :func:`bsfs.graph.graph.Graph.get`).
+One way or another, the AST is validated against the schema,
+and access control conditions are added.
+
+
+.. _RDF: https://www.w3.org/RDF/
+.. _RDFLib: https://rdflib.readthedocs.io/en/stable/index.html
+.. _Blazegraph: https://blazegraph.com/
+.. _Titan: http://titan.thinkaurelius.com/
+.. _TypeDB: https://vaticle.com/
+.. _Turtle: https://www.w3.org/TR/turtle/
+
+