The Thaumatorium:
Where the magic happens

The Papers of E.F. "The Coddfather" Codd

Edgar Frank “Ted” Codd is also known as “The Coddfather”, or more generally as “The Father of the Relational Model”.

I’ve gathered these papers here, as a) they were very hard to find, and b) there was no central place where all his Relational papers were gathered together. Some places had some papers, other places mentioned others. It was time this man got the respect he deserves, and that his famous papers became easier to find.

While reading these, do keep in mind that they are old. For example: the first paper writes about using an index to select the needed domains/columns, a concept Codd changed his mind on around 1990, when he switched to using “column names”. He had found out that some people were creating tables hundreds of columns wide, something he had not foreseen when he invented the Relational Model, and which makes referring to columns by index completely infeasible. Ain’t no one going to refer to columns by their index number.

Then also, in his 1988 “Fatal Flaws in SQL” articles, Codd writes about the lack of uniqueness in SQL. This has been mitigated (though not fixed) by the use of DISTINCT (introduced in SQL-86) and using PRIMARY KEYs (introduced in SQL-92). Little things like that make some of the papers a little outdated, even if they’re still not wrong.
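To make the “mitigated (though not fixed)” point concrete, here is a minimal sketch using Python’s built-in sqlite3 module. The table names, column names and sample rows are made up for illustration, and SQLite is just a convenient stand-in for “some SQL system”; other systems may differ in details.

```python
import sqlite3


def demo_uniqueness():
    """Show where SQL allows duplicate rows and where it prevents them."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # Hypothetical table without any key: the same row can be stored twice.
    cur.execute("CREATE TABLE supplier_bag (name TEXT, city TEXT)")
    cur.executemany(
        "INSERT INTO supplier_bag VALUES (?, ?)",
        [("Smith", "London"), ("Smith", "London"), ("Jones", "Paris")],
    )
    all_rows = cur.execute("SELECT name, city FROM supplier_bag").fetchall()

    # DISTINCT removes the duplicates from the result, not from the table.
    distinct_rows = cur.execute(
        "SELECT DISTINCT name, city FROM supplier_bag ORDER BY name"
    ).fetchall()

    # With a PRIMARY KEY the duplicate is rejected at insert time.
    cur.execute(
        "CREATE TABLE supplier_rel (name TEXT, city TEXT, PRIMARY KEY (name, city))"
    )
    cur.execute("INSERT INTO supplier_rel VALUES ('Smith', 'London')")
    try:
        cur.execute("INSERT INTO supplier_rel VALUES ('Smith', 'London')")
        duplicate_rejected = False
    except sqlite3.IntegrityError:
        duplicate_rejected = True

    # Not fixed: a plain projection of a keyed table can still contain duplicates.
    cur.execute("INSERT INTO supplier_rel VALUES ('Smith', 'Paris')")
    projected = cur.execute("SELECT name FROM supplier_rel").fetchall()

    conn.close()
    return all_rows, distinct_rows, duplicate_rejected, projected


if __name__ == "__main__":
    for part in demo_uniqueness():
        print(part)
```

The last query is the interesting one: even with a key on the table, the result of a query is only duplicate-free if you remember to ask for DISTINCT, which is the kind of thing Codd was complaining about.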

Notes

| Year | Paper/Article/Book | IBM Code | Starting page | Note |
|---|---|---|---|---|
| (1969) | Derivability, Redundancy, and Consistency of Relations Stored in Large Data Banks | RJ599 | | The first actual paper on the Relational Model |
| (1970) | A Relational Model of Data for Large Shared Data Banks | | | The first popular paper on the Relational Model. It defines what a Relation is, what primary, foreign and compound keys are, the first Normal Form, etc. |
| (1971a) | ALPHA: A Data Base Sublanguage Founded on the Relational Calculus | | | ALPHA is Codd’s not-implemented version of a potential query language. He did not want to lord over what a query language should look like, but he did want to give some handles for people to hold on to, which they promptly ignored when they created SQL 😑 - SQL can return things that are not a Relation (a set of tuples), like a table with duplicate rows, or a column, or a single cell… |
| (1971b) | Further Normalization of the Data Base Relational Model | RJ909 | | |
| (1971c) | Normalized Data Base Structure: A Brief Tutorial | | | |
| (1971d) | Relational Completeness of Data Base Sublanguages | RJ987 | | He basically proves that the Relational Algebra and Relational Calculus have the same query power as one another, which implies it’s fine to base your query language on either (or both) |
| (1973a) | Understanding Relations (Installment #1) | | page 7/9 | Installment #2 is missing |
| (1973c) | “The GAMMA-0 n-ary Relational Data Base Interface Specifications of Objects and Operations.” IBM Research Report RJ1200: (1973) | | | |
| (1974a) | Interactive Support for Non-Programmers: The Relational and Network approaches | | | First collab between E.F. Codd and C.J. Date? |
| (1974b) | Recent Investigations in Relational Data Base Systems | | | |
| (1974c) | “The Relational Approach to Data Base Management: An Overview.” Third Annual Texas Conference on Computing Systems (Austin, November 7–8). | | | |
| (1974d) | Seven steps to rendezvous with the casual user | RJ1333 | | RENDEZVOUS was a system meant to provide users a way to query in straight up English. The system would parse the question, turn it into a query, and execute it. A neat idea that never seems to have left the lab 😢 |
| (1974e) | The Relational and Network Approaches: Comparison of the Application Programming Interfaces | | | |
| (1974f) | Understanding Relations (Installment #3) | | page 3/34 | |
| (1974g) | Understanding Relations (Installment #4) | | page 1/3 | Installment #5 is missing |
| (1975a) | Data base management | | | |
| (1975b) | Understanding Relations (Installment #6) | | page 3/6 | |
| (1975c) | Understanding Relations (Installment #7) | | page 25/53 | This is the last Installment I could find. |
| (1977) | Access to Relational Data Bases for a Casual User | | | The request to work on RENDEZVOUS (see next papers) |
| (1978) | Rendezvous Version 1: An experimental English-Language Query Formulation System for casual users of Relational Data Bases | RJ2144 | | |
| (1979) | Extending the database relational model to capture more meaning | | | First paper by The Coddfather where he starts integrating semantics into the Relational Model. This introduces NULL (as an ω), as well as Aggregation and Generalization. First versioned model, named RM/T - Relational Model/Tasmania |
| (1980) | Data Models in Database Management | | | The first mention of “Data Model” - yes, The Coddfather defined “Data Model” as well. |
| (1981a) | “The Significance of the SQL/Data System Announcement.” Computerworld 15(7): 27-30 (1981) | | | |
| (1981b) | “The Capabilities of Relational Database Management Systems.” Proc. Convencio Informatica Latina (Barcelona, June 9–12). | | | |
| (1982) | Relational database: a practical foundation for productivity | | | The 1981 ACM Turing Award Lecture! Props to the man! |
| (1985a) | Is your DBMS really relational?, original source | | | This and the next article are also known as “How Relational Is Your Database Management System?” - they’re known for the infamous “12 rules”; these rules were created because a LOT of people were claiming to have a relational DB when they really didn’t. |
| (1985b) | Does your DBMS run by the rules?, original source | | | |
| (1986a) | Missing Information (Applicable and Inapplicable) in Relational Databases | | | Here’s a wild idea that still hasn’t been implemented: instead of using 2-valued (binary) logic (2VL; true and false) or 3-valued (ternary) logic (3VL; true, maybe and false), let’s use 4-valued (quaternary) logic (4VL; T for true, F for false, M for missing-but-applicable and I for missing-but-inapplicable). Tony Hoare did nothing wrong. In fact, he did not go far enough! 😤 |
| (1986b) | “The Twelve Rules for Relational DBMS.” San Jose, The Relational Institute, Technical Report EFC-6. | | | |
| (1987a) | “The Beginning of a New Era in Data Processing” (review of Tandem’s NonStop SQL). InfoWeek, May 4 | | | |
| (1987b) | “Fundamental Laws in Database Management.” San Jose, The Relational Institute, Technical Report EFC-27 | | | |
| (1987c) | More Commentary on Missing Information in Relational Databases (Applicable and Inapplicable Information) | | | We’re doubling down on 4VL, let’s gooooooo! |
| (1987d) | “Principles of Design of Database Management Systems.” San Jose, The Relational Institute, Technical Report EFC-18 | | | |
| (1987e) | “View Updatability in Relational Databases: Algorithm VU-1.” Unpublished paper | | | |
| (1988a) | “Domains, Primary Keys, Foreign Keys, and Referential Integrity.” Info DB, May. | | | |
| (1988b) | Fatal Flaws in SQL (this HTML version is combined from the two PDFs) (original source, part 1, original source, part 2) | | | |
| (1990) | The Relational Model for Database Management: Version 2 | | | As far as I care, this book (yes, book, not a paper) is the magnum opus on the Relational Model. RM/2 is the numbered successor to RM/T. |
| (1993) | Providing OLAP to User-Analysts: An IT Mandate, original source | | | Codd’s biggest miss, since this was basically a paid ad, using Codd’s name to pull in views for Arbor Software (later Hyperion, acquired by Oracle). Also, these new 12 rules feel kinda weak. |

Sources

To find these papers, this page was very helpful: dblp computer science bibliography about Edgar F. Codd (DBLP - indirectly ACM), as was this one: Edgar Frank “Ted” Codd > Publications (ACM), as was this one: Technical Paper Archive - Research Reports (IBM)

Bonus