Notation3 (N3) is a language for natively building and reasoning over semantic Knowledge Graphs. Since N3 is a superset of the Resource Description Framework (RDF), it directly operates in the realm of semantic Knowledge Graphs. Hence, there is no impedance mismatch between N3 and Knowledge Graphs, as is the case with imperative programming languages such as JavaScript/TypeScript, Java, or Python.
Building semantic Knowledge Graphs. N3 provides a concise syntax and constructs such as graph terms, which allow attaching metadata to statements (e.g., provenance); and lists, as first-class citizens to describe ordered collections of things.
Machine reasoning on semantic Knowledge Graphs. N3 supports the creation of N3 rules, loosely comparable to If-Then statements, which infer new symbolic knowledge from semantic Knowledge Graphs. This supports a wide range of scenarios, including simply expanding the available knowledge, mapping between different vocabularies and transforming instance data, consistency checking, and implementing decision support systems.
Cross-platform, open-source standard. N3 is an open W3C specification with freely available, open-source, industrial-strength tooling for JavaScript/TypeScript, Java, Python, and Prolog, as well as command-line tools.
N3 editors provide syntax checking, linting, and machine reasoning (using N3 reasoners):
An N3 document represents an N3 graph in a textual form, which is a series of N3 statements. These are written as triples consisting of a subject, predicate, and object resource. An N3 resource can be represented by any N3 term: an RDF IRI, literal or blank node; or an N3 graph term, list, logical implication, variable or N3 builtin. Comments are indicated using a separate '#' and continue until the end of the line.
The simplest N3 statement, or triple, is a sequence of a subject, predicate, and object resource, separated by whitespace and terminated by a '.'.
In the example below, three N3 triples highlight the enmity between Spiderman and the Green Goblin and list their human-readable names:
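A minimal sketch of three such triples, written with absolute IRIs. The entity IRIs under `http://example.org/` and the `enemyOf` and `name` property IRIs are assumptions chosen for illustration:

```n3
# Enmity between Spiderman and the Green Goblin (illustrative IRIs)
<http://example.org/Spiderman> <http://example.org/enemyOf> <http://example.org/GreenGoblin> .
# Human-readable names, given as string literals
<http://example.org/Spiderman> <http://example.org/name> "Spiderman" .
<http://example.org/GreenGoblin> <http://example.org/name> "Green Goblin" .
```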
For now, we will assume that a resource is represented either by an IRI or a literal. In general, one uses an IRI to identify an identifiable entity such as a person, place, or thing, whereas a literal is used for a textual or numerical (i.e., datatype) value, such as a name, date, height, and so on.
As shown in the above example, the same subject (here, Spiderman) will often be described by several N3 statements. To make these N3 statements less cumbersome to write, one can put a semicolon (";") at the end of an N3 statement to describe the same subject in the subsequent statement:
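As a sketch, two statements about the same subject can be grouped as follows (the IRIs are illustrative assumptions):

```n3
# One subject, two predicate-object pairs separated by ';'
<http://example.org/Spiderman>
    <http://example.org/enemyOf> <http://example.org/GreenGoblin> ;
    <http://example.org/name> "Spiderman" .
```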
Similarly, a predicate (e.g., name) can often list multiple objects for the same subject (e.g., Spiderman). This can be written by listing the object values separated by a comma (","):
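A short sketch of an object list, again with illustrative IRIs and name values:

```n3
# One subject and predicate, two objects separated by ','
<http://example.org/Spiderman> <http://example.org/name> "Spiderman", "Spidey" .
```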
An N3 resource in an N3 statement can be represented by any N3 term: an RDF IRI, literal or blank node; or an N3 graph term, list, logical implication, variable or N3 builtin.
An IRI is used to represent an identifiable entity — such as a person, place, or thing.
Until now, we have been writing absolute IRIs (RFC3987), e.g., `http://example.org/Spiderman`, which include both the namespace (e.g., `http://example.org/`) and the local name (e.g., `Spiderman`). It is often much easier to write an IRI as a prefixed name, e.g., `ex:Spiderman`, which includes a prefix label (e.g., `ex`) as a shorthand for the namespace, and the local name (e.g., `Spiderman`), separated by a colon (":").
The `@prefix` directive associates the prefix label with a namespace IRI. A prefixed name is turned into an absolute IRI by concatenating the namespace IRI with the local name. The following example is equivalent to the original example:
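A sketch using the `ex` prefix label bound to `http://example.org/`; the property names `enemyOf` and `name` are assumptions for illustration:

```n3
@prefix ex: <http://example.org/> .

# ex:Spiderman expands to <http://example.org/Spiderman>
ex:Spiderman ex:enemyOf ex:GreenGoblin ;
    ex:name "Spiderman" .
ex:GreenGoblin ex:name "Green Goblin" .
```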
N3 also supports the `PREFIX` and `BASE` directives, as does Turtle, to align the syntax with SPARQL (see grammar). These do not have a trailing '.'.
To further simplify prefixed names, one can leave the prefix label empty (e.g., for an often-used namespace):
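For example, assuming `http://example.org/` is the often-used namespace and `enemyOf` is an illustrative property:

```n3
# The empty prefix label is declared like any other prefix
@prefix : <http://example.org/> .

:Spiderman :enemyOf :GreenGoblin .
```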
One can also write relative IRI references, e.g., `<#Spiderman>`. A relative IRI reference is resolved against the base IRI, i.e., turned into an absolute IRI; in the simplest case, this amounts to appending the relative reference (e.g., `#Spiderman`) to the base IRI. A base IRI is defined using the `@base` directive. For instance, the following is equivalent to the prior example:
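A sketch of relative IRI resolution, assuming the base IRI `http://example.org/` and illustrative names. Note that a reference with a leading `#` resolves to a fragment of the base IRI, while a reference without it resolves to a path:

```n3
@base <http://example.org/> .

# <#Spiderman> resolves to <http://example.org/#Spiderman>
<#Spiderman> <#enemyOf> <#GreenGoblin> .

# <Spiderman> resolves to <http://example.org/Spiderman>
<Spiderman> <enemyOf> <GreenGoblin> .
```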
It is common practice to put `@prefix` / `PREFIX` and `@base` / `BASE` declarations at the top of an N3 document. This is not mandatory, however, and they can technically be put anywhere before the prefixed name or relative IRI that relies on the declaration. Subsequent `@prefix` / `PREFIX` directives may "re-map" the same prefix label to another namespace IRI.
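Local names in prefixed names may start with digits (e.g., `leg:3032571` or `isbn13:9780136019701`), contain non-leading colons (e.g., `og:video:height`), and include escaped reserved characters (e.g., `wgs:lat\-long`).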
In many cases, an IRI occurs only once as an object, and is then further described as a subject in other statements (e.g., see a prior example). The example below illustrates how this may result in difficult-to-read code: to find descriptions of IRI objects such as `tobey-maguire` or `willem-dafoe`, one must scan the full N3 graph, as they are only described near the end.
Using the IRI property list syntax, descriptions of these object IRIs can be directly "embedded" within the object position:
Using basic statements: | Using IRI property lists: |
---|---|
IRI property lists build on the predicate object list syntax to group statements about IRIs. One can use object lists in this syntax as well.
Literals are used to represent a textual or numerical (i.e., datatype) value, such as a name, date, height, and so on. Numbers (integers, decimals, and doubles) are simply represented using their numerical value, and booleans are represented using `true` or `false`:
For details on numerical syntaxes (e.g., decimals, doubles), we refer to the RDF 1.1: Turtle (Section 2.5.2) specification.
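A small sketch of numeric and boolean literals; the subject and property names are hypothetical:

```n3
@prefix : <http://example.org/> .

:Spiderman :age 28 ;            # integer
    :heightInMeters 1.78 ;      # decimal
    :massInKg 7.6e1 ;           # double
    :isSuperHero true .         # boolean
```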
Other literals, such as strings, but also dates, binary, octal or hex code, XML or JSON code, or other types of numbers (e.g., shorts), need to be written as datatyped literals. These are represented as a string, followed by the `^^` symbol and the corresponding datatype IRI. (`xsd:string` is the datatype IRI for strings; this is the default and can be omitted). For instance:
The literal's lexical form consists of the characters between the delimiter quotes (e.g., `2001-08-10`).
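As an illustration, the following sketch shows a few datatyped literals. The `xsd` namespace is the standard XML Schema one; the subject and property names are assumptions:

```n3
@prefix : <http://example.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

:event :date "2001-08-10"^^xsd:date ;          # lexical form: 2001-08-10
    :attendees "8"^^xsd:short ;                # a number type without shorthand syntax
    :label "Opening night" ;                   # xsd:string is assumed
    :altLabel "Opening night"^^xsd:string .    # same datatype, written explicitly
```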
The language of the string literal can be indicated using the `@` symbol and the corresponding language tag (as defined in BCP47; find the registry in LNG-TAG). For instance:
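A sketch with two language-tagged names (the property name is an assumption):

```n3
@prefix : <http://example.org/> .

# The same name in English and Russian
:Spiderman :name "Spiderman"@en, "Человек-паук"@ru .
```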
The reading direction can be described together with the BCP47 language tag, using the `i18n` namespace:
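If neither a datatype IRI nor a language tag is given, the datatype `xsd:string` will be assumed. In case a language tag is given, the datatype `rdf:langString` will be assumed. Note that it is not possible to specify both a datatype IRI and a language tag.
Numbers and booleans can also be written as datatyped literals (e.g., `"8"^^xsd:integer` or `"true"^^xsd:boolean`).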
In case the string literal itself contains the string delimiter (e.g., double quotes), or includes newlines, other string delimiters can be used, i.e., single quotes or the compound delimiters `"""` or `'''`. Alternatively, one can use a '\' to escape the delimiter each time it occurs within a string literal. For instance:
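The options can be sketched as follows; the subjects and the `text` property are hypothetical:

```n3
@prefix : <http://example.org/> .

# Single quotes as delimiter, so double quotes need no escaping
:quote1 :text 'She said "hello".' .

# Double quotes as delimiter, with the inner quotes escaped
:quote2 :text "She said \"hello\"." .

# Compound delimiter, allowing both newlines and unescaped quotes
:quote3 :text """A string
spanning two lines, with "quotes" inside.""" .
```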
The escape symbol '\' (U+005C) may only appear in a string literal as part of an escape sequence. Other restrictions within string literals depend on the delimiter:
- Literals delimited by `'` may not contain the characters `'`, LF (U+000A), or CR (U+000D).
- Literals delimited by `"` may not contain the characters `"`, LF, or CR.
- Literals delimited by `'''` may not contain the sequence of characters `'''`.
- Literals delimited by `"""` may not contain the sequence of characters `"""`.
When describing resources in RDF, you can run into the following situations:
Instead, you can use blank nodes to talk about resources. They are existential variables; that is, they state the existence of a thing without identifying it. Blank nodes can be represented in several ways, as described below. For details, please refer to RDF11-CONCEPTS.
A blank node can be represented by a blank node identifier, which is unique within the N3 graph, expressed as `_:someLabel`. We then use this identifier within our N3 graph to describe the corresponding resource, same as with an IRI.
For instance, this example shows that the Mona Lisa has an unidentified tree in its background. We don't want to concretely identify this tree, but we do want to describe it — such as the painting it is in, and the type of tree:
In this example, we minted a blank node identifier `_:someTree` to represent the resource that we want to describe. Note that this identifier is only usable within our local N3 graph; if you want other N3 graphs to describe it, you should represent it using an IRI.
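A minimal sketch of the Mona Lisa case, using the `_:someTree` identifier; the class and property names are assumptions:

```n3
@prefix : <http://example.org/> .

# The painting has some tree in its background
:mona-lisa :hasInBackground _:someTree .

# Describe the tree without minting an IRI for it
_:someTree a :CypressTree ;
    :depictedIn :mona-lisa .
```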
Using a blank node identifier requires introducing a new identifier for each blank node. In many cases, however, a blank node occurs only once as an object, and is then further described as a subject in other statements, such as in the prior example. In those cases, a more convenient syntax can be used.
The example below is more elaborate: the identifiers `_:a` and `_:t` are each used as object in only one statement, and are then described as subject in other statements. Using the blank node property list syntax, these cases can be represented as follows:
Using blank node identifiers: | Using blank node property lists: |
---|---|
As mentioned, this syntax is useful for cases where a blank node occurs only once as an object. In case the blank node occurs several times as an object, such as describing multiple people with the same address, one should use blank node identifiers. For instance:
We can still use the blank node property list syntax for describing the town, as this blank node still only occurs once as object.
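A sketch of this mixed case, with hypothetical people, properties, and address values. The shared address needs an identifier because it occurs twice as an object; the town occurs once and can use the property list syntax:

```n3
@prefix : <http://example.org/> .

# Two people share one address, so it needs a blank node identifier
:alice :hasAddress _:addr .
:bob :hasAddress _:addr .

# The town occurs only once as object, so a property list suffices
_:addr :street "1 Main Street" ;
    :town [ :name "Metropolis" ] .
```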
Below, we summarize typical use cases where blank nodes are used to describe resources.
Unknown resources: we might want to state that the Mona Lisa painting has in its background an unidentified tree, which we know to be a cypress tree. We could mint an IRI such as "mona-lisa-cypress-tree", but we feel that would be redundant: we simply want to describe the tree, such as the painting it is in, and the type of tree. We're not particularly interested in allowing other N3 graphs to refer to the tree. Moreover, there may already exist an IRI for that particular tree, and we don't want to mint another IRI to refer to the same tree (nor do we want to look up this existing IRI).
Composite information: when describing composite pieces of information, such as street addresses, telephone numbers and dates, it is often unlikely that anyone outside this N3 graph would need to refer to this address or its pieces. Hence, it would be redundant to mint an IRI just for the purpose of structuring this information. Instead, one can use a blank node to connect the "composed" pieces of information, e.g., the street address, to its composite values, e.g., street name, number, and city, as shown in this example.
N-ary relations: blank nodes are a convenient way to represent n-ary relations in N3.
We often need to describe ordered collections of things, e.g., the authors of a book, the students in a course, or the software modules within a package. N3 provides a succinct list syntax to represent ordered collections of resources, enclosed by `(` and `)`. The contained resources are called members. For instance:
This states that the value of the author property is the list resource.
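A sketch of such a statement, with hypothetical book and author names:

```n3
@prefix : <http://example.org/> .

# The object is a single list resource with three ordered members
:book :author ( :alice :bob :carol ) .
```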
In N3, lists may occur as subjects, predicates or objects in a statement. For instance:
Or even:
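The following sketch shows a list in the subject position and in the predicate position; all names are assumptions for illustration:

```n3
@prefix : <http://example.org/> .

# List as subject
( :alice :bob :carol ) :coauthored :book .

# List as predicate (and as object)
:alice ( :firstName :lastName ) ( "Alice" "Smith" ) .
```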
It is often useful to attach metadata to groups of triples — to give the provenance, context, or version of the information, our opinion on the matter, and so on. We can use graph terms to quote RDF graphs, and then describe the graph term using N3 statements. For instance:
Essentially, a graph term represents the occurrence of an RDF graph, i.e., a quoting or citing of the graph. Importantly, a graph term does not assert the contents of the RDF graph as being true (e.g., `:cervantes dc:wrote :moby_dick`). In fact, the graph term is interpreted as a resource on its own.
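A sketch of a quoted statement with attached metadata. The `:opinion` property, the `:lie` value, and the `dc` namespace binding are assumptions:

```n3
@prefix : <http://example.org/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .

# The quoted triple is described, not asserted
{ :cervantes dc:wrote :moby_dick } :opinion :lie .
```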
As with lists, graph terms can be used in any position in an N3 statement.
As they represent a quoting of RDF graphs, graph terms are not "referentially transparent". For instance:
This N3 statement states that Lois Lane believes that Superman can fly. Even if it is known that `:Superman` is the same as `:ClarkKent`, one cannot infer from this that Lois Lane believes that `:ClarkKent` can fly. Indeed, this is an accurate depiction of Lois Lane's statement at the time: as she did not know that Superman is Clark Kent at that point, she would certainly not be saying that Clark Kent can fly.
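The situation can be sketched as follows; the `:believes` and `:can` properties are hypothetical, and `owl:sameAs` is used here as one possible way to state the identity:

```n3
@prefix : <http://example.org/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .

# Lois Lane's belief, quoted as a graph term
:LoisLane :believes { :Superman :can :fly } .

# Known outside the quoted graph
:Superman owl:sameAs :ClarkKent .

# Not entailed, since graph terms are not referentially transparent:
# :LoisLane :believes { :ClarkKent :can :fly } .
```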
N3 supports declarative programming by allowing one to make statements about the world, including logical implications, or N3 rules, which can be loosely compared to If-Then statements. Based on the N3 semantics, inferences can be drawn from these statements. An N3 reasoner can draw such inferences automatically, to support problem solving, decision making, or simply enriching your Knowledge Graph.
For instance, the following is an N3 rule:
A logical implication is a logical construct, just like conjunction (AND) and disjunction (OR). Stating such a logical implication means that, in case the rule premise is true (it is raining), the rule conclusion must be true as well (it is cloudy). If this were not the case, i.e., raining did not imply that it is cloudy, the implication would be false. However, this is not possible, since all statements in N3 (and RDF) are true by default.
Hence, when the premise holds, then we can infer the conclusion.
This is also referred to as firing the rule.
For the example above, if we additionally state `:weather a :Raining`, then we can safely infer `:weather a :Cloudy`:
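A sketch of the rule together with the additional fact, assuming the empty prefix is bound to `http://example.org/`:

```n3
@prefix : <http://example.org/> .

# Rule: if it is raining, then it is cloudy
{ :weather a :Raining } => { :weather a :Cloudy } .

# Fact that satisfies the premise
:weather a :Raining .

# A reasoner can now infer:
# :weather a :Cloudy .
```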
A logical implication is expressed using the predicate `log:implies`, with the symbol `=>` as syntactic sugar.
N3 rules are more useful when they include variables. For instance, the following includes a universal variable (prefixed by `?`):
This states that the N3 rule is true for each value of variable `?x`. So, when the premise is true for a particular value for `?x` (being a super hero), then the conclusion must be true for that value as well (being imaginary). For the example above, the premise is true for value `:spiderman`, meaning we can infer the conclusion for that value, i.e., `:spiderman a :Imaginary`.
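A sketch of such a rule and a matching fact, assuming the empty prefix is bound to `http://example.org/`:

```n3
@prefix : <http://example.org/> .

# Rule with a universal variable ?x
{ ?x a :SuperHero } => { ?x a :Imaginary } .

# Fact matching the premise with ?x = :spiderman
:spiderman a :SuperHero .

# A reasoner can now infer:
# :spiderman a :Imaginary .
```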
Triples in a rule that include variables (e.g., `?x a :SuperHero`) can be seen as triple patterns. In order to fire the rule, these triple patterns are matched to concrete triples in the N3 document. If a match is successful, concrete resources from the matching triple (e.g., `:spiderman`) are bound to the triple pattern's variables (e.g., `?x`). These bound variable values are then used in the inferred conclusion (e.g., `:spiderman a :Imaginary`).
N3 rules can be "chained": a rule can depend on the results of other rules. For example:
The third N3 rule relies on the rule conclusion (`:locomotion`) of the first two rules: if one of those rules fires, i.e., inferring a flying locomotion for a resource, then the third rule would fire as well, i.e., inferring that the resource would be a suitable observer for a street with heavy traffic. This allows for a modularization of N3 code: the first two rules separately decide when something supports a flying locomotion; the third rule determines what the general effects of flying locomotion are.
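A sketch of three chained rules along these lines. Only `:locomotion` and the flying value come from the description; the premises of the first two rules and the conclusion's property names are assumptions:

```n3
@prefix : <http://example.org/> .

# Rules 1 and 2: two separate ways to infer a flying locomotion
{ ?x a :Bird } => { ?x :locomotion :flying } .
{ ?x a :SuperHero ; :ability :flight } => { ?x :locomotion :flying } .

# Rule 3: depends on the conclusion of rules 1 and 2
{ ?x :locomotion :flying } => { ?x :suitableObserverFor :heavy-traffic-street } .

:superman a :SuperHero ; :ability :flight .
```

With these statements, rule 2 fires for `:superman`, after which rule 3 can fire on the inferred `:locomotion` triple.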
As with general coding practices, it is typically a good idea to give variables a meaningful name, rather than a generic one such as `?x`.
An N3 reasoner, which can draw inferences from N3 rules, can operate in forward chaining, backward chaining, or some hybrid mode. An N3 rule can also indicate which mode should be used, by using the appropriate predicate. Up until now, we have been using `=>`, which is syntactic sugar for `log:implies` and directs the reasoner to operate in forward chaining mode. The `<=` predicate, which is syntactic sugar for `log:impliedBy`, indicates a backward chaining mode.
For instance, the following is equivalent to the prior example and combines forward and backward chaining:
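A sketch that mixes both modes, reusing `:locomotion`; the remaining class and property names are assumptions. Note that with `<=`, the conclusion is written first and the premise second:

```n3
@prefix : <http://example.org/> .

# Backward rules: conclusion <= premise
{ ?x :locomotion :flying } <= { ?x a :Bird } .
{ ?x :locomotion :flying } <= { ?x a :SuperHero ; :ability :flight } .

# Forward rule: premise => conclusion
{ ?x :locomotion :flying } => { ?x :suitableObserverFor :heavy-traffic-street } .
```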
In a nutshell, forward reasoning presents a bottom-up approach: starting from a set of initial and inferred statements, a reasoner will fire any N3 rule where the premise holds, each time adding the inferences to the set, until no more N3 rules can be fired.
Backward chaining is a top-down approach: given a query, such as `?x :locomotion :flying`, the reasoner will search for any rules with conclusions that may satisfy the query (e.g., the first and second rule). Then, it will check whether the premises of those rules hold, which may, in itself, require searching for rules with matching conclusions.
Regarding general expressivity, these two reasoning modes can be considered equivalent. A detailed discussion of these two reasoning modes, their subtle differences and impacts on performance, is beyond the scope of this primer.
An N3 builtin is used within an N3 rule to implement an arbitrary operation on an N3 resource or even an entire N3 document. Builtins include mathematical, string, time and cryptography operators, operations on lists and graphs, and logical operations in general.
You can find a list of builtins here: N3 Builtin Functions.
Below, we give a general overview of the different types of builtins.
As a simple example, the rule below uses the `math:quotient` builtin to calculate a person's height in meters:
An N3 builtin is used as the predicate in an N3 builtin statement, where the subject and object act as input or output arguments. In the example above, the second rule triple constitutes a builtin statement: the subject is the input, as `math:quotient` calculates the quotient of the subject list members (i.e., the person's height in cm and 100); the object constitutes the output, as the result is then bound to the object variable `?m`.
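A sketch of such a rule, with the builtin statement as the second triple of the premise. The `math` namespace is the standard SWAP one; the `:heightInCm` and `:heightInM` properties and the sample person are assumptions:

```n3
@prefix : <http://example.org/> .
@prefix math: <http://www.w3.org/2000/10/swap/math#> .

:peter :heightInCm 178 .

{ ?person :heightInCm ?cm .
  # Input: list of dividend and divisor; output: bound to ?m
  ( ?cm 100 ) math:quotient ?m
} => { ?person :heightInM ?m } .
```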
N3 builtins can generate multiple results. Below, we use the `list:iterate` builtin to iterate over list members, and subsequently construct a string for each element:
The `list:iterate` builtin iterates over all elements in the subject list (`?finalists`). For each element, it binds its index, and the element itself, to the variables in the object list (`?i` and `?finalist`, respectively). Hence, the `list:iterate` builtin generates multiple sets of bindings for the `?i` and `?finalist` variables:
( ?i = 1, ?finalist = :flash )
( ?i = 2, ?finalist = :superman )
( ?i = 3, ?finalist = :spiderman )
For each set of bindings, the `string:concatenation` builtin concatenates the resources in the subject list, and binds the result to the object variable (`?entry`). This will lead to the following object values:
"1. :flash" , "2. :superman" , and "3. :spiderman"
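In other words, the `list:iterate` builtin statement holds for each of the following instantiations:
( :flash :superman :spiderman ) list:iterate ( 1 :flash ) .
( :flash :superman :spiderman ) list:iterate ( 2 :superman ) .
( :flash :superman :spiderman ) list:iterate ( 3 :spiderman ) .
For each of them, the `string:concatenation` builtin statement is evaluated.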
Finally, for each consistent set of variable bindings, the rule conclusion is instantiated and inferred.
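The rule discussed above might look like the following sketch. The `list` and `string` namespaces are the standard SWAP ones; the `:contest`, `:finalists`, and `:entry` names are assumptions:

```n3
@prefix : <http://example.org/> .
@prefix list: <http://www.w3.org/2000/10/swap/list#> .
@prefix string: <http://www.w3.org/2000/10/swap/string#> .

:contest :finalists ( :flash :superman :spiderman ) .

{ :contest :finalists ?finalists .
  # One set of bindings for ?i and ?finalist per list member
  ?finalists list:iterate ( ?i ?finalist ) .
  # Evaluated once for each set of bindings
  ( ?i ". " ?finalist ) string:concatenation ?entry
} => { :contest :entry ?entry } .
```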
N3 builtins can also operate on a local graph term or even an entire N3 document. The `log:collectAllIn`, `log:forAllIn`, and `log:notIncludes` builtins are logical operators that support scoped negation as failure (SNAF), where the scope is given by the object of the builtin statement: the object can be a graph term, or a blank node, in which case the current document is the scope.
For instance, the rule below uses `log:collectAllIn` to collect all of spiderman's defeated villains found in the current document:
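The `log:collectAllIn` builtin accepts as subject a list that includes:
- a variable whose values are to be collected (`?enemy`),
- a where clause, given as a graph term, and
- a variable to which the resulting list is bound (`?enemies`).

The where clause is evaluated on the given scope, in this case, the current N3 document; all potential results beyond this scope are considered false (SNAF).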
This leads to the following result:
:spiderman :defeatedEnemies ( :green-goblin :doctor-octopus )
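A sketch of a document that would produce such a result. The `log` namespace is the standard SWAP one; the `:enemy` and `:defeatedBy` properties are assumptions:

```n3
@prefix : <http://example.org/> .
@prefix log: <http://www.w3.org/2000/10/swap/log#> .

:spiderman :enemy :green-goblin, :doctor-octopus, :sandman .
:green-goblin :defeatedBy :spiderman .
:doctor-octopus :defeatedBy :spiderman .

{ ( ?enemy                                      # variable to collect
    { :spiderman :enemy ?enemy .
      ?enemy :defeatedBy :spiderman }           # where clause
    ?enemies )                                  # resulting list
  log:collectAllIn _:t                          # blank node: current document as scope
} => { :spiderman :defeatedEnemies ?enemies } .
```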
The example below uses `log:forAllIn` to determine whether a super-hero's identity is safe:
The `log:forAllIn` builtin accepts as subject a list with two clauses of a logical implication. If, for all cases where the first clause holds, the second clause holds as well, then the logical implication is true. In this example, the implication is true when, for each person who knows the identity of spiderman, this person can keep a secret. In this case the rule will fire, as both `mary-jane-watson` and `aunt-may` can keep secrets.
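A sketch of such a rule and supporting facts. The `:knowsIdentityOf`, `:canKeep`, and `:identity` names are assumptions:

```n3
@prefix : <http://example.org/> .
@prefix log: <http://www.w3.org/2000/10/swap/log#> .

:mary-jane-watson :knowsIdentityOf :spiderman ; :canKeep :secrets .
:aunt-may :knowsIdentityOf :spiderman ; :canKeep :secrets .

{ ( { ?person :knowsIdentityOf :spiderman }    # first clause
    { ?person :canKeep :secrets } )            # second clause
  log:forAllIn _:t                             # current document as scope
} => { :spiderman :identity :safe } .
```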
A blank node object such as `_:t` indicates the current N3 document as scope. Alternatively, a graph term can be used as the scope as well. This is illustrated in the `log:notIncludes` example below.
The `log:notIncludes` builtin checks whether the given scope does not include a given clause. The example below uses a graph term as scope, instead of the current N3 document, as was the case for the prior two examples. The rule finds any of spiderman's enemies who have not been defeated, at least, as reported by the Daily Bugle:
For each of spiderman's enemies (`?enemy`), the rule checks whether the graph term `?graph`, as reported by the `:daily_bugle`, does not include any statement where the enemy has been defeated. If so, we infer that this enemy is undefeated, in this case, `:sandman`.
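A sketch of this scenario, with the reported graph term as scope. The `:enemy`, `:reports`, and `:defeatedBy` properties and the `:Undefeated` class are assumptions:

```n3
@prefix : <http://example.org/> .
@prefix log: <http://www.w3.org/2000/10/swap/log#> .

:spiderman :enemy :green-goblin, :doctor-octopus, :sandman .

# What the Daily Bugle reports, quoted as a graph term
:daily_bugle :reports {
    :green-goblin :defeatedBy :spiderman .
    :doctor-octopus :defeatedBy :spiderman
} .

{ :spiderman :enemy ?enemy .
  :daily_bugle :reports ?graph .
  # Holds when the reported graph has no matching statement
  ?graph log:notIncludes { ?enemy :defeatedBy :spiderman }
} => { ?enemy a :Undefeated } .
```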
The counterpart of `log:notIncludes` is `log:includes`, which checks whether the given scope includes a given clause.
To dynamically retrieve online knowledge within an N3 rule, the `log:semantics` and `log:conclusion` builtins allow pulling in, parsing, and reasoning over logical expressions from online sources. In the rule below, the `log:semantics` builtin retrieves an online source and subsequently parses it as a graph term:
Subsequently, `log:collectAllIn` is used to collect all persons found within the online source.
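A sketch of such a rule. The document URL `http://example.org/people.n3`, the `:Person` class, and the `:listsPersons` property are hypothetical placeholders:

```n3
@prefix : <http://example.org/> .
@prefix log: <http://www.w3.org/2000/10/swap/log#> .

{ # Retrieve and parse the online source into a graph term
  <http://example.org/people.n3> log:semantics ?graph .
  # Collect all persons, using the retrieved graph term as scope
  ( ?person { ?person a :Person } ?persons ) log:collectAllIn ?graph
} => { <http://example.org/people.n3> :listsPersons ?persons } .
```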
The `log:conclusion` builtin allows generating the closure of a given graph term, i.e., extending the graph with all applicable rule inferences. This includes graph terms retrieved and parsed from online sources. For example:
Instead of simply printing the inferences, one could, e.g., use the `log:forAllIn` builtin to check whether all persons listed in the online source are indeed inferred to be animals as well.
Similar to SPARQL property paths, N3 resource paths concisely express paths between resources, when intermediary resources on the path are not relevant. In practice, resource paths are most useful in N3 rules.
A resource path starts from a subject resource, followed by one or more predicates; each predicate is preceded by a directional indicator to follow the predicate either forward (`!`) or in reverse (`^`). For example, the following describes the city of Joe's mother's office's address:
In this example, the intermediary resources — Joe's mother, her office, and its address — are not described or referenced further, so the more concise resource path syntax can be used.
The expansion of this shorthand syntax uses blank nodes to express the path between the two resources. The example above is equivalent to the following (using blank node identifiers for clarity):
In other words, each predicate in a resource path is expanded into a statement, with as subject either the starting resource, or the prior blank node object; and as object a newly minted blank node.
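A sketch of a forward path and its expansion; the property names and the blank node identifiers are assumptions for illustration:

```n3
@prefix : <http://example.org/> .

# Resource path: the city of Joe's mother's office's address
:joe!:hasMother!:hasOffice!:hasAddress :hasCity "Metropolis" .

# Equivalent expansion, one statement per predicate in the path
:joe :hasMother _:b1 .
_:b1 :hasOffice _:b2 .
_:b2 :hasAddress _:b3 .
_:b3 :hasCity "Metropolis" .
```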
Relations can also be followed in reverse using the reverse (`^`) indicator. The following example, starting from `joe`, follows the `hasMother` predicate to Joe's mother, and then follows the `hasMother` predicate in reverse (thus pointing to someone who has the same mother):
This could equally well be represented by the more verbose:
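A sketch of the reverse path and its expanded form; the `:name` property, its value, and the blank node identifiers are assumptions:

```n3
@prefix : <http://example.org/> .

# Someone with the same mother as Joe is named "Jack"
:joe!:hasMother^:hasMother :name "Jack" .

# Equivalent expansion: the reverse step swaps subject and object
:joe :hasMother _:mother .
_:sibling :hasMother _:mother .
_:sibling :name "Jack" .
```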
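Note that a resource path in the predicate position is expanded in the same way. For instance, `:joe :hasAddress!:hasCity "Metropolis" .` is equivalent to:
:joe _:bn_1 "Metropolis" .
:hasAddress :hasCity _:bn_1 .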
Finally, the `<-` syntax inverts a single property in an N3 statement, resulting in an inverted property. For example, the following represents `:joe :hasMother :mary`:
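A sketch, assuming the `<-` symbol is written directly before the predicate to be inverted:

```n3
@prefix : <http://example.org/> .

# Read right to left: :joe :hasMother :mary
:mary <-:hasMother :joe .
```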
In this section, we present common patterns to solve often-occurring problems, for instance regarding data modeling, in N3.
Until now, we only considered binary relations between entities and/or values. However, many types of relations are ternary, quaternary, or, in general, n-ary in nature, i.e., they have an arbitrary number of participants. Typical examples are purchase, employment, or membership relations.
In other cases, we want to describe properties of relations, such as the provenance of a piece of information, or the probability of a diagnosis. In essence, however, this is the same problem as representing n-ary relations.
There are several ways of representing n-ary relations in RDF — these are described in swbp-n-aryRelations.
Below, we illustrate options for representing n-ary relations in N3 in particular.
In general, it is possible to convert any n-ary relation into an equivalent set of binary relations. This is a convenient solution, since we already know how to represent binary relations.
First, we create a resource that represents the n-ary relation, and then use a set of binary relations to link each participant to this newly minted resource. Each binary relation is hereby given a meaningful name that represents the role of the participant in the n-ary relation.
For instance, say we want to describe the Purchase relation between a buyer called "John", a purchased book called "Lenny the Lion", the amount paid for the book, and the seller:
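A sketch of the Purchase relation as a resource with one binary relation per participant. All names and the amount are illustrative assumptions:

```n3
@prefix : <http://example.org/> .

# The n-ary relation itself becomes a resource
:purchase1 a :Purchase ;
    :hasBuyer :John ;            # role: buyer
    :hasObject :LennyTheLion ;   # role: purchased book
    :hasAmount 15 ;              # role: amount paid
    :hasSeller :bookstore .      # role: seller
```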
In other cases, things are more naturally described as properties of relations, rather than n-ary relations — for instance, the provenance of a piece of information, the trend of someone's body temperature and when the temperature was taken. Nevertheless, these can be represented in the same way as n-ary relation participants.
We start from the same solution above, i.e., introducing a resource to represent the (in this case, binary) relation, and then linking the two participants to this resource. Subsequently, we use a set of binary relations to attach each descriptive property (e.g., diagnosis probability; temperature trend) to the relation resource.
For instance, when describing someone's (e.g., Christine) current temperature, you may want to indicate the absolute value (e.g., 40 degrees), a description of that value (e.g., elevated), the trend compared to the prior value (e.g., rising), and the time the temperature was taken:
This is possible since we know that the relation resource (e.g., `_:to1`) represents the n-ary relation. Hence, any descriptive properties of the relation, in addition to participants in the relation, can simply be attached to the entity.
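A sketch of the temperature case, using `_:to1` as the relation resource. The property and class names and the timestamp are hypothetical:

```n3
@prefix : <http://example.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

# Christine is linked to the relation resource
:Christine :hasTemperature _:to1 .

# Participants and descriptive properties attach to the same resource
_:to1 a :TemperatureObservation ;
    :value 40 ;
    :description :Elevated ;
    :trend :Rising ;
    :takenAt "2024-01-15T08:30:00Z"^^xsd:dateTime .
```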
Note that we chose to represent the relation using a statement with the participant (`:Christine`) as subject, and the relation resource (`_:to1`) as object. An alternative would have been to add `:Christine` as just another element of the n-ary relation, e.g., using a property `temperatureOf`. Our modeling choice here aimed to indicate that Christine is somehow the "owner" of the relationship.
An alternative solution is to use a list to keep all the participants of the n-ary relation. For instance:
A clear advantage of this approach is that it is easier and much less verbose to write down. However, the role each participant plays in the n-ary relation is no longer made explicit. This is not a problem when the participants do not have different roles, as in the example above, where they all play the same role of group member.
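A sketch of the list-based pattern for a membership relation; the group, property, and member names are assumptions:

```n3
@prefix : <http://example.org/> .

# All participants share the same role, so a list suffices
:reading-group :hasMembers ( :alice :bob :carol ) .
```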
This solution is inspired by a separate discussion within the RDF community on Language Tagged Strings. The essence of the discussion is to separate the string, as simple data, from its various characterizations, such as reading direction and language. This design pattern uses the `rdf:CompoundLiteral` class together with the `rdf:language`, `rdf:direction`, and `rdf:value` properties to respectively describe the language, base direction, and string value of the subject. For more, see the "text direction" discussions in the RDF-star working group.
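A sketch of this pattern using a blank node property list. The `:book` subject, the `:title` property, and the string value are assumptions:

```n3
@prefix : <http://example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

:book :title [
    a rdf:CompoundLiteral ;
    rdf:value "مرحبا" ;        # the string itself
    rdf:language "ar" ;        # language tag
    rdf:direction "rtl"        # base direction
] .
```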
Graph terms allow attaching metadata to groups of triples, such as the provenance, context, version, opinion, probability, etc. See the example below:
Importantly, as discussed for graph terms in general, a graph term does not assert the contents of the RDF graph as being true. This allows expressing examples such as above, where the statement within the graph term should not be asserted.