Skip to content

MARK Entities

MARK rules refer to entities - abstract objects that wrap the real object classes of the analyzed programming language (in case of object-oriented languages) and group methods with similar semantics in so-called op​s. A MARK entity defines three parts:

  1. A name
  2. A set of op​s (operations)
  3. A set of MARK variables

Writing MARK rules for a library requires a good understanding of the library API and its class hierarchy. We recommend the following approach to writing MARK entities.

  1. Model relevant classes as MARK entities
  2. Define op​s and variables
  3. (Optionally) blacklist some op​s

Model relevant classes as MARK entities

It is certainly not necessary to model all classes of the software library as MARK entities. Rather, you need to identify those classes which hold relevant data or provide functions. Although in many cases, several classes of the programming language can be summarized in one abstract MARK entity, it might be easier to directly map classes to entities in the first iteration.

The name of an entity can be freely chosen. If it refers to a specific class in the programming language, though, it might make sense to name them accordingly.

Define Ops and variables

The next step is to define op​s. An op is a group of semantically equal or similar functions, methods, or constructors, given as fully qualified signatures. Especially overloaded functions with the same name but different parameters are candidates for being grouped in an op. For cryptographic libraries, typical op​s are:

  • instantiate - a group of functions for instantiating a class or creating an object of a class
  • initialize - a group of functions that initialize a cryptographic algorithms, e.g. by setting a key or initialization vector
  • update - a group of functions that process further data by a cryptographic algorithm
  • finalize- a group of functions that terminate a cryptographic algorithm
  • reset- a group of functions that reset a cryptographic algorithm and make it ready for further processing.

The name of an op can be freely chosen. When specifying fully qualified function or method signatures in an op, parameters are typed and can be unnamed or named. Unnamed parameters are indicated by the name "_" and do not play any role in the definition of rules. Named parameters refer to MARK variables and can be used when writing rules. We recommend to name only those parameters which are required in rules, as named parameters will increase the memory cost and runtime of the analysis.

Example: The following op instantiate refers to only a single Java method, called de.example.Crypto.getInstance. Neither the return type, nor modifiers such as public, static, final etc. are given in MARK. The method signature contains one named parameter of type java.lang.String and one unnamed parameter without type restriction. Note that the name of the parameter does not relate to the parameter name in the programming language, but rather to a MARK variable.

op instantiate {
  de.example.Crypto.getInstance(
    algorithm : java.lang.String,    // Named typed parameter
    _                                // Unnamed untyped parameter
  );
}

So, this MARK op would include the following methods of a class de.example.Crypto:

  • public static Crypto getInstance(String x, String y)
  • private void getInstance(String x, byte y)

It would however not include a method void getInstance(String x) (wrong number of parameters) or getInstance(byte x, String y) (wrong type of 1st parameter).

To make use of named parameters, they must additionally be declared as entity variables using the var keyword:

entity Crypto {

  var algorithm;  // this makes parameter "algorithm" available when writing rules.

  op instantiate {
    de.example.Crypto.getInstance(
      algorithm : java.lang.String,    // Named typed parameter
      _                                // Unnamed untyped parameter
    );
  }
}

(Optionally) blacklist some Ops

In some cases, groups of functions or methods should not be used at all by a program. This applies e.g. to deprecated functions or functions that are known to be insecure. MARK provides a shortcut to mark any use of such functions as insecure: the forbidden keyword.

entity Crypto {

  op instantiate {
    de.example.Crypto.getInstance(
      algorithm : java.lang.String,    // Named typed parameter
      _                                // Unnamed untyped parameter
    );
    forbidden de.example.Crypto.getInstanceDeprecated();  // Any use of this function will be flagged
  }
}

Any occurrence of getInstanceDeprecated() in the program will be marked as insecure, without further evaluation of rules. This is not only a shortcut, removing the need to write separate rules, but also a way to cut down analysis time, as Codyze does not need to find instances of the entity, but will rather indicate the error immediately when it sees a usage of the method.

Complete Example

entity org.bouncycastle.crypto.digests.SHA512Digest 
           isa org.bouncycastle.crypto.Digest {

  var instance : org.bouncycastle.crypto.digests.SHA512Digest; // Alternative for `this`

  var encodedState : byte[];
  var copy : org.bouncycastle.crypto.digests.SHA512Digest;

  var inByte : byte;
  var inByteArray : byte[];
  var inByteArrayOff : int;
  var inByteArraylen : int;

  var outArray : byte[];
  var outArrayOff : int;
  var resultLen : int;


  op instantiate {
        this = org.bouncycastle.crypto.digests.SHA512Digest();
        this = org.bouncycastle.crypto.digests.SHA512Digest(encodedState);
        instance = org.bouncycastle.crypto.digests.SHA512Digest(copy);
    }

  op update {
    org.bouncycastle.crypto.digests.SHA512Digest.update(inByte);
    org.bouncycastle.crypto.digests.SHA512Digest.update(inByteArray, inByteArrayOff, inByteArrayLen);
  }

  op finish {
    org.bouncycastle.crypto.digests.SHA512Digest.finish();
  }

  op finalize {
    resultLen = org.bouncycastle.crypto.digests.SHA512Digest.doFinal(outArray, outArrayOff);
  }

  op reset {
    org.bouncycastle.crypto.digests.SHA512Digest.reset();
    org.bouncycastle.crypto.digests.SHA512Digest.reset(_);
  }

}

Last update: 2022-08-23
Created: 2020-01-16