RNAlib-2.2.5
The RNAlib can parse and apply data from constraint definition text files, where each constraint is given as a line of whitespace-delimited commands. The syntax extends the one used in mfold / UNAfold, where each line begins with a command character followed by a set of positions.
Additionally, we introduce several new commands and allow for an optional loop type context specifier in the form of a sequence of characters, as well as an orientation flag that forces a nucleotide to pair upstream or downstream.
The following set of commands is recognized:
F    Force
P    Prohibit
C    Conflicts/Context dependency
A    Allow (for non-canonical pairs)
E    Soft constraints for unpaired position(s), or base pair(s)

The optional loop type context specifier [WHERE] may be a combination of the following:
E    Exterior loop
H    Hairpin loop
I    Interior loop (enclosing pair)
i    Interior loop (enclosed pair)
M    Multibranch loop (enclosing pair)
m    Multibranch loop (enclosed pair)
A    All loops

If no [WHERE] flags are set, all contexts are considered (equivalent to A).
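As an illustration of how a [WHERE] specifier could be interpreted, the following minimal Python sketch maps a flag string to a set of loop contexts. It is not part of the RNAlib API: the function name `parse_where` and the context labels are hypothetical, chosen only for this example.

```python
# Loop type context flags as listed above. The labels on the right are
# hypothetical names used only in this sketch.
LOOP_CONTEXTS = {
    "E": "exterior",
    "H": "hairpin",
    "I": "interior_enclosing",
    "i": "interior_enclosed",
    "M": "multibranch_enclosing",
    "m": "multibranch_enclosed",
}


def parse_where(spec=""):
    """Return the set of loop contexts selected by a [WHERE] string.

    An empty specifier, or one containing 'A', selects all contexts.
    """
    if not spec or "A" in spec:
        return set(LOOP_CONTEXTS.values())
    contexts = set()
    for flag in spec:
        if flag not in LOOP_CONTEXTS:
            raise ValueError(f"unknown loop type flag: {flag!r}")
        contexts.add(LOOP_CONTEXTS[flag])
    return contexts
```

Note that the flags are case sensitive: `I` and `i` (and `M` and `m`) distinguish the enclosing pair from the enclosed pair, so the specifier must not be case-folded before it is interpreted.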
For particular nucleotides that are forced to pair, the following [ORIENTATION] flags may be used:
U    Upstream
D    Downstream

If no [ORIENTATION] flag is set, both directions are considered.
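The effect of the orientation flag can be sketched as a filter on candidate pairing partners. The Python snippet below assumes that "upstream" means a partner at a smaller sequence position and "downstream" a partner at a larger one; the function `allowed_partners` is hypothetical, and the sketch ignores base pair compatibility and minimum loop size.

```python
def allowed_partners(i, n, orientation=None):
    """List the 1-based positions that nucleotide i may pair with.

    n is the sequence length. orientation is 'U' (upstream), 'D'
    (downstream), or None (both directions).
    Assumption of this sketch: upstream partners have positions < i,
    downstream partners have positions > i.
    """
    if orientation == "U":
        return list(range(1, i))
    if orientation == "D":
        return list(range(i + 1, n + 1))
    if orientation is None:
        return [j for j in range(1, n + 1) if j != i]
    raise ValueError(f"unknown orientation flag: {orientation!r}")
```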
Sequence positions of nucleotides/base pairs are 1-based and consist of three positions i, j, and k. Alternatively, four positions may be provided as a pair of two position ranges [i:j] and [k:l], using the '-' sign as delimiter within each range, i.e. i-j and k-l.
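The two coordinate forms can be told apart by the number and shape of the position tokens. The following Python sketch shows one possible way to parse them; `parse_positions` and its return format are assumptions made for this example and do not reflect how RNAlib parses the file internally.

```python
def parse_positions(tokens):
    """Parse the position tokens of a constraint line.

    Returns ('triple', i, j, k) for three positions, or
    ('ranges', (i, j), (k, l)) for two '-'-delimited ranges.
    Positions are 1-based; j may be 0 in the three-position form.
    """
    if len(tokens) == 2 and all("-" in t for t in tokens):
        first, second = (tuple(int(x) for x in t.split("-")) for t in tokens)
        if len(first) != 2 or len(second) != 2:
            raise ValueError("each range must have the form start-end")
        if min(first + second) < 1:
            raise ValueError("range positions are 1-based")
        return ("ranges", first, second)
    if len(tokens) == 3:
        i, j, k = (int(t) for t in tokens)
        if i < 1 or j < 0:
            raise ValueError("positions are 1-based (j may be 0)")
        return ("triple", i, j, k)
    raise ValueError("expected 'i j k' or 'i-j k-l'")
```

For example, `parse_positions("10 0 3".split())` yields the three-position form, while `parse_positions(["3-7", "20-25"])` yields the two-range form.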
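Below are the resulting general cases that are considered valid constraints:

Syntax: F i 0 k [WHERE] [ORIENTATION]
Description: Enforces the set of k consecutive nucleotides starting at position i to be paired. The optional loop type specifier [WHERE] can be used to force them to appear as closing/enclosed pairs of certain types of loops.

Syntax: F i j k [WHERE]
Description: Enforces the base pairs (i,j), ..., (i+(k-1), j-(k-1)) to form. The optional loop type specifier [WHERE] specifies in which loop context the base pair must appear.

Syntax: P i 0 k [WHERE]
Description: Prohibits a set of k consecutive nucleotides from participating in base pairing, i.e. makes these positions unpaired. The optional loop type specifier [WHERE] can be used to force the nucleotides to appear within loops of specific types.

Syntax: P i j k [WHERE]
Description: Prohibits the base pairs (i,j), ..., (i+(k-1), j-(k-1)) from forming. The optional loop type specifier [WHERE] specifies the type of loop they are disallowed to be the closing or an enclosed pair of.

Syntax: P i-j k-l [WHERE]
Description: Prohibits any nucleotide p in [i:j] from pairing with any other nucleotide q in [k:l]. The optional loop type specifier [WHERE] specifies the type of loop they are disallowed to be the closing or an enclosed pair of.

Syntax: C i 0 k [WHERE]
Description: Enforces the k consecutive nucleotides starting at position i to be unpaired, similar to prohibiting nucleotides from pairing as described above. In addition, the [WHERE] flag can be used to enforce specific loop types the nucleotides must appear in.

Syntax: C i j k
Description: Removes all base pairs that conflict with the set of consecutive base pairs (i,j), ..., (i+(k-1), j-(k-1)). Two base pairs (i,j) and (p,q) conflict with each other if i < p < j < q, or p < i < q < j.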
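
Syntax: A i j k [WHERE]
Description: Enables the formation of the consecutive base pairs (i,j), ..., (i+(k-1), j-(k-1)), no matter if they are canonical or non-canonical. In contrast to the above F and W commands, which remove conflicting base pairs, the A command does not. Therefore, it may be used to allow non-canonical base pair interactions. Since the RNAlib does not contain free energy contributions E_{ij} for non-canonical base pairs (i,j), they are scored as the maximum of similar, known contributions. In terms of a Nussinov-like scoring function, the free energy of non-canonical base pairs is therefore estimated as

\[ E_{ij} = \min\left[ \max_{(i,k) \in \{GC, CG, AU, UA, GU, UG\}} E_{ik},\; \max_{(k,j) \in \{GC, CG, AU, UA, GU, UG\}} E_{kj} \right]. \]

The optional loop type specifier [WHERE] specifies in which loop context the base pair may appear.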
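
Syntax: E i 0 k e
Description: Applies a pseudo free energy of e to the set of k consecutive nucleotides, starting at position i. The pseudo free energy is applied only if these nucleotides are considered unpaired in the recursions, or evaluations, and is expected to be given in kcal/mol.

Syntax: E i j k e
Description: Applies a pseudo free energy of e to the set of base pairs (i,j), ..., (i+(k-1), j-(k-1)). Energies are expected to be given in kcal/mol.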
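To make the position semantics of the cases above concrete, the following Python sketch expands a three-position command into the nucleotides or base pairs it addresses, and implements the conflict test for two base pairs. All function names are hypothetical and the example lines are made up for illustration; the sketch covers only the `i j k` form (not `i-j k-l`) and does not interpret [WHERE], [ORIENTATION], or energy values.

```python
def helix_pairs(i, j, k):
    """Base pairs (i, j), ..., (i+(k-1), j-(k-1)) addressed by 'i j k'."""
    return [(i + t, j - t) for t in range(k)]


def nucleotides(i, k):
    """The k consecutive positions starting at i, addressed by 'i 0 k'."""
    return list(range(i, i + k))


def conflict(pair_a, pair_b):
    """True if (i, j) and (p, q) cross: i < p < j < q or p < i < q < j."""
    (i, j), (p, q) = pair_a, pair_b
    return i < p < j < q or p < i < q < j


def expand(line):
    """Expand one 'X i j k ...' line into (command, targets, extras).

    targets is a list of positions if j == 0, else a list of base pairs.
    extras holds any remaining tokens (flags or energy) uninterpreted.
    """
    tokens = line.split()
    command = tokens[0]
    if command not in "FPCAE" or len(command) != 1:
        raise ValueError(f"unknown command: {command!r}")
    i, j, k = (int(t) for t in tokens[1:4])
    targets = nucleotides(i, k) if j == 0 else helix_pairs(i, j, k)
    return command, targets, tokens[4:]


# Made-up example lines, one constraint per line.
example = """\
F 3 20 4
P 10 0 3
E 1 0 2 -1.5
"""

for line in example.splitlines():
    print(expand(line))
```

Here `F 3 20 4` addresses a stack of four base pairs growing inward from (3,20), whereas `P 10 0 3` addresses the three positions 10 to 12. A pair such as (10,25) conflicts with (3,20) because 3 < 10 < 20 < 25, while the nested pair (5,18) does not.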