3.1.5. Diagram Predicate Framework (DPF)

We now arrive at generalized sketches proper, since we use typed graphs rather than plain graphs as in the CT-example. This puts us on the same abstraction level as the mFOL-example.

DPF has been developed to describe and relate, in a uniform and precise formal way, a wide variety of diagrammatic modeling techniques in Software Engineering. Each diagrammatic modeling technique, such as database schemata, ER diagrams, class diagrams, or workflow diagrams, is characterized by a certain footprint. A sketch for such a footprint then formalizes nothing but a single software model. As an example, we outline in this paper a revised and extended version of our diagrammatic Relational Data Model (RM) [18,21].

In relational databases, we have data types and tables with rows and columns. In addition, we can declare different kinds of constraints. A table is identified by a name, and each table has a fixed non-empty set of columns. All columns in a given table are identified by a unique name; thus, the order of columns is immaterial. The same column name may be used in different tables. All values in a given column have to be of the same data type. A table is considered as a set of rows with one cell for each column. Some cells of a table may contain no value, but a row with no values at all is not allowed. Let us declare a table with name $T$, a corresponding set $C = \{cn_1, \dots, cn_n\}$ of column names, and a data type name $dn_j$ for each column name $cn_j$.

We represent this declaration by the graph shown above. To define the semantics of table $T$, we first have to fix the semantics of the data type names $dn_j$ by assigning to each data type name $dn_j$ a fixed set $D_{dn_j}$ of data values. This gives us a $C$-indexed set $D = (D_{dn_j} \mid cn_j \in C)$.

Since there may be no values in some of the cells in a row, we generalize the definitions in Section 2 and describe a row $\mathbf{r}$ in table $T$ as a partial map $\mathbf{r} : C \mathrel{\longrightarrow\!\!\!\!\circ} D$ with $\mathbf{r}(cn_j) \in D_{dn_j}$ as long as $\mathbf{r}(cn_j)$ is defined. We denote by $\bigotimes^{p}_{j \in I} D_{dn_j}$, or simply $\otimes^{p} D$, the set of all those partial maps except the completely undefined map (empty row). For any $cn_j \in C$, we obtain as projection a partial map $\pi_{cn_j} : \otimes^{p} D \mathrel{\longrightarrow\!\!\!\!\circ} D_{dn_j}$ defined for all $\mathbf{r} \in \otimes^{p} D$ by $\pi_{cn_j}(\mathbf{r}) := \mathbf{r}(cn_j)$ if $\mathbf{r}(cn_j)$ is defined. These projections turn $\otimes^{p} D$ into a categorical product of the $C$-indexed set $D = (D_{dn_j} \mid cn_j \in C)$ in the category Par of all sets and partial maps.
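
The construction can be made concrete for small finite domains. The following Python sketch is only an illustration under stated assumptions: the column names, the tiny data type domains, and the function names `partial_product` and `projection` are hypothetical, a row is modelled as a dictionary that simply omits undefined columns, and `None` is assumed not to be a data value so that it can signal an undefined projection.

```python
from itertools import product

# Hypothetical table declaration: column name -> data type name.
columns = {"eid": "Int", "name": "String"}
# Hypothetical, deliberately tiny semantics of the data type names.
domains = {"Int": {1, 2}, "String": {"a"}}

UNDEFINED = object()  # internal marker used only while enumerating


def partial_product(columns, domains):
    """Enumerate all partial maps r on the column names with r(cn) in the
    domain of cn's data type, excluding the completely undefined map."""
    names = sorted(columns)
    choices = [[UNDEFINED, *sorted(domains[columns[cn]])] for cn in names]
    rows = []
    for combo in product(*choices):
        row = {cn: v for cn, v in zip(names, combo) if v is not UNDEFINED}
        if row:  # drop the empty row
            rows.append(row)
    return rows


def projection(cn):
    """Partial projection for column cn: yields r(cn) if defined, else None."""
    return lambda row: row.get(cn)


rows = partial_product(columns, domains)
```

With three choices for `eid` (undefined, 1, 2) and two for `name` (undefined, "a"), the enumeration yields six partial maps, of which the empty one is discarded.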

Reflecting the idea of a row in a table, we can still use the tuple notation discussed in Section 2 to denote the elements of $\otimes^{p} D$. We fix a total order $cn_1 < cn_2 < \dots < cn_n$ on $C$ and represent a partial map $\mathbf{r} : C \mathrel{\longrightarrow\!\!\!\!\circ} D$ by the tuple $(r_1, \dots, r_n)$ with $r_j = \mathbf{r}(cn_j)$ if $\mathbf{r}(cn_j)$ is defined, and $r_j$ an anonymous indicator " " for *nothing* in all other cases.
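
The passage between partial maps and tuples can be sketched in a few lines of Python. This is an illustration only: the helper names `to_tuple` and `from_tuple` are hypothetical, and `None` is an assumed stand-in for the anonymous indicator for *nothing*, which presupposes that `None` is not itself a data value.

```python
NOTHING = None  # assumed stand-in for the anonymous "nothing" indicator


def to_tuple(row, column_order):
    """Represent a partial map (dict) as a tuple along a fixed column order."""
    return tuple(row.get(cn, NOTHING) for cn in column_order)


def from_tuple(values, column_order):
    """Recover the partial map from its tuple representation."""
    return {cn: v for cn, v in zip(column_order, values) if v is not NOTHING}


order = ["eid", "ssn"]        # a fixed total order on the column names
row = {"ssn": "111"}          # hypothetical row with an empty eid cell
as_tuple = to_tuple(row, order)
```

Once the order is fixed, the two helpers are mutually inverse on rows, which is why the tuple notation loses no information.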

The content of table $T$ may change. At any point in time, however, the content (semantics) of table $T$ is a finite subset of $\otimes^{p} D$, and the semantics of the edges $cn_j$ are the corresponding restrictions of the projections $\pi_{cn_j} : \otimes^{p} D \mathrel{\longrightarrow\!\!\!\!\circ} D_{dn_j}$.

To discuss constraints, let us consider a database schema declaring two data types, Int(eger) and String, and two tables, Empl(oyee) and Addr(ess), with columns as depicted in the diagram above.

Since a table is a set (!) of rows, we need a mechanism to identify rows uniquely. This is the role of the so-called primary keys (pk). For each table, one of the columns has to be declared as a primary key. In the example, we declare the primary keys *eid* (employee identity) in table *Empl* and *ssn* (social security number) in table *Addr*, indicated by underlined names. All values in a primary key have to be distinct, and empty cells are not allowed. This means that the corresponding projection has to be injective and total. To require only injectivity, we declare a unique constraint, while a not null constraint enforces a total projection. We may put both constraints on the column *ssn* in *Empl*. This will, however, not turn *ssn* into a primary key but only into a candidate key. A primary key is the one candidate key we have chosen to serve as the primary key!
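
Read as properties of the projection restricted to the current table content, these constraints are easy to check. The Python sketch below is a minimal illustration, not a database implementation: the function names and the sample rows are hypothetical, and a table is assumed to be given as a list of pairwise distinct dictionaries, each omitting its empty cells.

```python
def is_total(table, cn):
    """Not null: the projection for cn is defined on every row."""
    return all(cn in row for row in table)


def is_injective(table, cn):
    """Unique: distinct rows with a value in cn carry distinct values."""
    values = [row[cn] for row in table if cn in row]
    return len(values) == len(set(values))


def is_key(table, cn):
    """Candidate or primary key: the projection is injective and total."""
    return is_total(table, cn) and is_injective(table, cn)


# Hypothetical content of table Empl; the third row has an empty ssn cell.
empl = [
    {"eid": 1, "ssn": "111"},
    {"eid": 2, "ssn": "222"},
    {"eid": 3},
]
```

Here `eid` satisfies both conditions, whereas `ssn` is unique but not total, so on this content it could not serve as a key.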

To store and retrieve information, the tables in a database have to be connected in some way. To find, for example, the address of an employee, we have to consult table *Addr*. *Foreign key (fk)* constraints are the mechanism to connect tables. In the example, we declare a foreign key from column *ssn* in *Empl* to column *ssn* in *Addr*, indicated by a star: *ssn*∗. A column declared as a foreign key may contain empty cells, but any value appearing in this column also has to appear in the column the key refers to. In particular, this means that both columns are required to have the same data type!
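
The foreign key condition amounts to an inclusion between the sets of values that actually occur in the two columns. The following Python sketch illustrates this under stated assumptions: the function name `satisfies_foreign_key`, the sample rows, and the column `city` are hypothetical, and tables are again modelled as lists of dictionaries that omit empty cells.

```python
def satisfies_foreign_key(source, source_cn, target, target_cn):
    """Every value present in source_cn must also occur in target_cn.
    Rows with an empty source cell are ignored, as empty cells are allowed."""
    target_values = {row[target_cn] for row in target if target_cn in row}
    return all(
        row[source_cn] in target_values for row in source if source_cn in row
    )


# Hypothetical table contents; "city" is an invented column for illustration.
addr = [{"ssn": "111", "city": "Bergen"}]
empl_ok = [{"eid": 1, "ssn": "111"}, {"eid": 2}]
empl_bad = empl_ok + [{"eid": 3, "ssn": "999"}]
```

The second employee in `empl_ok` has no *ssn* value and therefore does not violate the constraint, while the dangling value "999" in `empl_bad` does.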
