Pufka - Databases

databases
#define databases: \
I--------------------------------------------------------------------------------------------------\
I--------------------------------------------------------------------------------------------------\
I--------------------------------------------------------------------------------------------------\
I     /$$$$$$$              /$$               /$$                                                  \
I    | $$__  $$            | $$              | $$                                                  \
I    | $$  \ $$  /$$$$$$  /$$$$$$    /$$$$$$ | $$$$$$$   /$$$$$$   /$$$$$$$  /$$$$$$   /$$$$$$$    \
I    | $$  | $$ |____  $$|_  $$_/   |____  $$| $$__  $$ |____  $$ /$$_____/ /$$__  $$ /$$_____/    \
I    | $$  | $$  /$$$$$$$  | $$      /$$$$$$$| $$  \ $$  /$$$$$$$|  $$$$$$ | $$$$$$$$|  $$$$$$     \
I    | $$  | $$ /$$__  $$  | $$ /$$ /$$__  $$| $$  | $$ /$$__  $$ \____  $$| $$_____/ \____  $$    \
I    | $$$$$$$/|  $$$$$$$  |  $$$$/|  $$$$$$$| $$$$$$$/|  $$$$$$$ /$$$$$$$/|  $$$$$$$ /$$$$$$$/    \
I    |_______/  \_______/   \___/   \_______/|_______/  \_______/|_______/  \_______/|_______/     \
I--------------------------------------------------------------------------------------------------\
I--------------------------------------------------------------------------------------------------\
I--------------------------------------------------------------------------------------------------\
I--------------------------------------------------------------------------------------------------I

    • see more AT "/SQL"
    • most database engines have adopted a client-server architecture,
      which is the right call in business settings,
      however databases are common in standalone desktop applications too
      where it makes very little sense;
      please question what you are doing before shooting yourself in the foot
      without ever even realizing there is a less painful way;
      as of 2025 your only option is sqlite

    ○ types
        • tree {filesystems}
        • graph
        • relational {sql}
        • nosql (non-relational) {JSON databases}

    •  most things here implicitly apply to relational databases;
            because that's by far the most widely used database model

    Table:    Table:
        • structure of arrays
        • a predefined list of attributes and entries
        • table names are usually placed to plural {"users", "tickets"}
        • technically just a bunch of struct-s (with some implementation details),
          but its generally better to imagine as a vertical list
        • a record is an entry inside a table
        • using a terminology for those familiar with OOP,
          a table is like a class,
          a record is like an instance
        {
                        Table "users"
            | ID |   Name    |       Email        |
            +----+-----------+--------------------+
            •  0    emil        admin@chud.cyou
            •  1    xolatile    xolatile@chud.cyou
        }

    Dependency:    Dependency:
        • when a value can be inferred from another value
        • generally a sign of poor database design,
          however rarely, its deliberate to address performance concerns
        {
            // Dependency special case example // ?!; maybe move
            /* usually emails are stored as simple strings.
             * this is normalized to the degree that such is generally
             * not considered a 1NF violation.
             * however one could have a use-case for storing them as such:
             */
            | user | domain | top_level_domain |
            •  anon     chud               cyou
            /* but we must have a wholistic view of our operations:
             *     >querying based on components is rare
             *     >fetching whole emails is common
             * keep in mind that to yield email addresses we will have to
             *  reconstruct them at some point
             * therefor, as a result of our database architecture plan,
             *  common operations would take a performance hit
             * the first thing to ask is "do i really need to store email components?"
             * and if the result is a confident "yes, absolutely",
             * then the following is your solution, even tho generally you should NOT do it:
             */
             |     email     | user | domain | top_level_domain |
             •  anon@chud.cyou anon     chud               cyou
        }
        Functional:
            • 2 attribute's values have a one-to-one correspondence
            • "egyértelmű hozzárendelés"^HU
        Transitive:
            • when a dependency has a dependency
            • indirect functional dependency
            { A->B and B->C
              therefor A->C
            }

    Relationships:
        • a record expressing belonging to another record
          (which could be from a different table)
        • some pointer will have to be stored,
          this could be simply an ID number
          or a managed pointer (called a foreign key),
          depending on the database and application in question
        one_to_one:
            • one record belongs to exactly one other record
            /* Assuming every user has to register with an email,
 *  and one email can be used only for one registration.
 */
    Table "users"                Table "emails"
| Username | Email |       | ID | Name |   Domain   |
+----------+-------+       +----+------+------------+
> emil        1      ────▶   1    admin    chud.cyou

        one_to_many:
            • the way to express lists purely with relationships
            • multiple records will be pointing to the same record
             Table "comments"          Table "posts"          
| ID | Text | Post |       | ID | Text |          
> 1    a...   1      ──┬─▶ > 1    ...             
> 2    b...   1      ──┤
> 3    c...   1      ──┘

        many_to_many:
            • lists belong to lists
            • a pivot table is a table expressing only relationships
            • a pivot table is required to express many-to-many relationships
              in relational databases
            • pivot tables are usually not displayed when creating database graphs
            • the name of a pivot table is usually the combination of the tables it joins
            "students"     "studentsCourses"        "courses"
   | ID |      | student | course |     | ID |
   > 1                 1   1            > 1
   > 2                 1   2            > 2
   > 3                 2   3            > 3
                       3   3
/* The arrows are omitted to prevent confusion.
 * The way this can be read is:
 *  > student one took course one and two
 *  > course three was taken by student two and three
 * Notice how the order it is put is irrelevant,
 *  there is no inherent one way or other.
 */

        zero_or_one_to_one:
            • "0/1-1"
            • pattern
            • null-ed foreign key dependent on another field


    Key:    Key:
        • a key is a set of attributes by which a record is uniquely identifiable
        • one or more columns of the original data set or an added id (num) row
        Superkey:
            • the set of attribute sets which can uniquely identify a record
            • cannot contain attributes with possible NULL values
            • the element of the super-key consisting of the most attributes
              is called the trivial superkey
            • all attributes excluded from the super-key form a candidate key
        Candidate_key:
            • all elements of the super-key which have no real subsets in the super-key
        Primary_key:
            •  conventionally, when a table is displayed,
                    the primary key is always the first column
            • many modern database engines always add an implicit primary if one was not specified
            • a freely chosen candidate key or a newly added id row which
              represents a freely chosen candidate key
            • a natural key is a candidate key chosen to be the primary key
            • an artifical key is a newly given, enum-erating primary key
            — adding an id row is prefered when:
                • the chosen candidate key consists of multiple attributes
                • if the candidate key is an attribute holding complex data
                 (long strings which are relatively slow to  read)
            • id rows are most often incremental, ie. each new elements
              id is the previous ones + 1


    View:
        • virtual table
        — simplifies things:
            • queries become shorter
            • joins become transparent
        • allows for row oriented security


    Normal_forms:    Normal_forms:
        0.:
            • anything that is not in 1st normal form
        1.:
            • primary keys are present
            • every field is atomic ie. at no point will have to be split into multiple
              parts by a user
            • normalizing a data structure from 0th normal form strictly
              to 1st normal form doesnt result in any new tables
              being created
            // People table; we know that parts of the address will be handled independently
   Name | Age | Address                                  |
   Anon |  18 | 1204 Budapest, Nagysándor József utca 93 |
   [...]
 // Therefor we organize the individual parts into different columns
   Name | Age | Post code | City     | Address                | Address number |
   Anon |  18 |      1204 | Budapest | Nagysándor József utca |             93 |
   [...]

        2.:
            • auto when the primary key is a single attribute
            • functional dependencies of the primary key are eliminated
        3.:
            • all candidate key related functional dependencies are eliminated
        3.5. (Boyce-Codd):
            • all functional dependencies are eliminated
            • a table in 3 normal form without multiple overlapping candidate
              keys is always also in 3.5. normal form
            • what you should generally aim for
        4.:
            pass
        5.:
            pass



    Object_diagramm:    Object_diagramm:
        Attribute:
            // its name and optionally its type in a ellipse
              .--------------.
             /                \
            (      [name]      )
             \                /
              ^--------------^
        Multi_value_attribute:
            • list/set/etc.
            // attribute with an additional ellipse
              .--------------.
             / .------------. \
            ( (    [name]    ) )
             \ ^------------^ /
              ^--------------^
        Key:
            // attribute with the inner text underlined
              .--------------.
             /                \
            (     đ[name]đ     )
             \                /
              ^--------------^
        Partial_key:
            // attribute with the inner text non-continuously underlined
              .--------------.
             /                \
            (      [name]      )
             \     ‾ ‾ ‾ ‾    /
              ^--------------^
        struct:
            • attribute made up from multiple attributes
            // 2 or more attributes connected with a continuous line
              .--------------.              .--------------.
             /                \            /                \
            (      [att1]      )──────────(      [att2]      )
             \                /            \                /
              ^--------------^              ^--------------^
        Entity:
            • connects attributes
            // rectangle
              .--------------.            +----------------+            .--------------.
             /                \           |                |           /                \
            (      [att1]      )──────────|     [name]     |──────────(      [att2]      )
             \                /           |                |           \                /
              ^--------------^            +----------------+            ^--------------^
                                                  │
                                                  │
                                           .--------------.
                                          /                \
                                         (      [att2]      )
                                          \                /
                                           ^--------------^

        Relation:
            • always connected to atleast 2 structs with continuous lines
            • the num of attributes in a relation is called its degree
            // rhombus
                                                 _-^-_                  
              .--------------.                _-^     ^-_                .--------------.
             /                \            _-^           ^-_            /                \
            (      [att1]      )──────────<     [name]      >──────────(      [att2]      )
             \                /            ^-_           _-^            \                /
              ^--------------^                ^-_     _-^                ^--------------^
                                                 ^-_-^                  
                              


    Vertical_database:    Vertical_database:
        • used for storing highly heterogeneous and or dynamic data
        • common is scientific research and electronic patient record systems
        — where you want a database, but have little to no clue about the schema:
            • the data-safety is a must
            • you wish to have external support from a big company
            • new things keep popping up
        EAV:        EAV:
            
                https://www.cedanet.com.au/antipatterns/eav.php
                https://www.red-gate.com/simple-talk/opinion/opinion-pieces/bad-carma/
            "\
            In other words, EAV gives you enough rope to hang yourself and  \
            in this industry, things should be designed to the lowest level \
            of complexity because the guy replacing you on the project will \
            likely be an idiot.                                             \
            "
            • "Entity-Attribute-Value"/"Object-Attribute-Value"
            • "implementing a database inside a database"
            • "column to row transformation" (called pivoting)
            • pattern
            • vertical databases implemented in a relational database system
            • widely frowned upon
            • no FOSS implementations at all
            • under EAV most queries will require procedural code
            • an abstraction layer must be provided to hide the complexity
            • a query generator will have to be implemented
            • comes up when the requirements are inadequate to create an application
              (due to poor communication or otherwise)
            
                • is a relational database engine
                • adding new types shifts into an administrative responsibility
            
                • hard to maintain
                • hard to comprehend
                • slow
                • requires much work up-front
                • administrators are usually not very good at adding new types
                — by-passes many builtin relational database features:
                    • constraints
                    • disk optimization
                    • query optimization
                    • views
                    • caching
                    • time-sharing
                    • trigger based validation
            entity:
                • all entities are stored in the same table
                • the non-descriptive name "entity" is used on purpose,
                  because it marks anything from living beings
                  to experiment findings
                • all entities reference a metadata (see BELOW)
                • multiple entity entries form a single coherent record
                CREATE TABLE entity (
  id INT PRIMARY KEY,
  attibute INT FOREIGN KEY(metadata),
  value string
);

                • optionally the entity could be a foreign key itself,
                  referencing a conventional table
                  {there is a users table with their rudimentary personal data,
                    but their other "belongings" are stored in EAV
                  }
            attribute:
                • metadata
                • stores the type of an entity
                • no standard way to implement it
                — rough approximation of columns:
                    • validation {type; range; regex}
                    • GUI presentation
                    • normal value (in a clinical sense)
                    • grouping (parent/sibling relationship for something like a form)
                    • description
                    • keywords (for searching)
            value:
                • the concrete data
                • unlike the rest, it would be stored in a field normally too
                • the simplest implementation is to use a string
                • could come with an extra field,
                  which marks what table the value is a foreign key to;
                  ie. you could have a value table for each SQL datatype,
                  and the value column reference one of them
            EAV_CR:
                
                    PMID: 10579606
                • "EAV with Classes and Relationships"
                • the OOP of EAV
                • allows for nesting
                • the Value becomes the foreign key of a class table

sql
#define sql:: \
I------------------------------------------------------\
I------------------------------------------------------\
I------------------------------------------------------\
I              /$$$$$$   /$$$$$$  /$$                  \
I             /$$__  $$ /$$__  $$| $$                  \
I            | $$  \__/| $$  \ $$| $$                  \
I            |  $$$$$$ | $$  | $$| $$                  \
I             \____  $$| $$  | $$| $$                  \
I             /$$  \ $$| $$/$$ $$| $$                  \
I            |  $$$$$$/|  $$$$$$/| $$$$$$$$            \
I             \______/  \____ $$$|________/            \
I                            \__/                      \
I------------------------------------------------------\
I------------------------------------------------------\
I------------------------------------------------------I

    • "Structured Query Language"
    • database management language
    • initially developed by IBM
    • ANSI standard
    • attempted to be usable by businessmen, hence the aspiration to resemble natural language;
      failed miserably as such aspirations repeatedly do
    • structurally and logically good, the language semantics are highly questionable,
      but well legible given practice
    •  SQL is split into multiple "component" languages;
            these are very similar in logic and syntax,
            but are concerned with different aspects of database management;
            each is detailed seperately

    Example_Data:    Example_Data:
        • the following describes an insane asylum
        • in this hypothetical place patients dont age after entry
        +====================================================================+
        |                               Patient                              |
        +====================================================================+
        | name           | biological_age | room | entered    | illness_type |
        +====================================================================+
        | Alice          | 17             | 2    | 1866-07-01 | 10           |
        | Kaffka Tibor   | 41             | 18   | 2020-04-05 | 6            |
        | Mia            | 21             | 2    | 2132-02-17 | 2            |
        | Mumus          | 45             | 2    | 2120-09-30 | 2            |
        | Sam Lowry      | 38             | 22   | 1985-08-30 | 2            |
        | Sitka          | 19             | 24   | 2019-09-04 | 7            |
        | Tinta Vince    | 16             | 23   | 2006-01-06 | 1            |
        | Tyler Durden   | 24             | 2    | 1999-10-15 | 9            |
        | William Dickes | 43             | 11   | 1884-04-04 | 3            |
        +----------------+----------------+------+------------+--------------+
        +===============================================================+
        I                       Illness States                          I
        +===============================================================+
        I psychopaty | delusion | obsession | ADHD | schizophrenia | id I
        +===============================================================+
        | 1          | 0        | 0         | 0    | 0             | 1  |
        | 0          | 1        | 0         | 0    | 0             | 2  |
        | 0          | 0        | 1         | 0    | 0             | 3  |
        | 0          | 0        | 0         | 1    | 0             | 4  |
        | 0          | 0        | 0         | 0    | 1             | 5  |
        | 1          | 1        | 0         | 0    | 0             | 6  |
        | 1          | 0        | 0         | 1    | 0             | 7  |
        | 1          | 1        | 0         | 1    | 0             | 8  |
        | 1          | 0        | 1         | 0    | 1             | 9  |
        | 0          | 1        | 0         | 0    | 1             | 10 |
        +------------+----------+-----------+------+---------------+----+
        +====================+
        |        Room        |
        +====================+
        | num | floor | beds |
        +====================+
        | 2   | 1     | 4    |
        | 11  | 2     | 3    |
        | 18  | 2     | 2    |
        | 22  | 2     | 1    |
        | 23  | 2     | 1    |
        | 24  | 2     | 1    |
        | 25  | 2     | 1    |
        +-----+-------+------+
        ○ you may reproduce it with the following (in MySQL)
          { CREATE DATABASE Asylum;
CREATE TABLE patient (name VARCHAR(128) DEFAULT "N/A" NOT NULL,
                        biological_age INT NOT NULL,
                        room INT,
                        entered DATE DEFAULT CURRENT_TIMESTAMP NOT NULL,
                        illness_type INT NOT NULL);
CREATE TABLE room (num INT not NULL,
                        floor INT DEFAULT 1 NOT NULL,
                        beds INT DEFAULT 1);
CREATE TABLE illness_type (psychopaty BOOLEAN DEFAULT FALSE,
                        delusion BOOLEAN DEFAULT FALSE,
                        obsession BOOLEAN DEFAULT FALSE,
                        ADHD BOOLEAN DEFAULT FALSE,
                        schizophrenia BOOLEAN DEFAULT FALSE);
ALTER TABLE patient ADD PRIMARY KEY(name);
ALTER TABLE room ADD PRIMARY KEY(num);
ALTER TABLE illness_type ADD id INT AUTO_INCREMENT PRIMARY KEY;
INSERT INTO room (num, floor, beds) VALUES (2, 1, 4),
                                            (11, 2, 3),
                                            (18, 2, 2),
                                            (22, 2, 1),
                                            (23, 2, 1),
                                            (24, 2, 1),
                                            (25, 2, 1);
INSERT INTO illness_type (psychopaty, delusion, obsession, ADHD, schizophrenia)
        VALUES (TRUE, FALSE, FALSE, FALSE, FALSE),
                (FALSE, TRUE, FALSE, FALSE, FALSE),
                (FALSE, FALSE, TRUE, FALSE, FALSE),
                (FALSE, FALSE, FALSE, TRUE, FALSE),
                (FALSE, FALSE, FALSE, FALSE, TRUE),
                (TRUE, TRUE, FALSE, FALSE, FALSE),
                (TRUE, FALSE, FALSE, TRUE, FALSE),
                (TRUE, TRUE, FALSE, TRUE, FALSE),
                (TRUE, FALSE, TRUE, FALSE, TRUE),
                (FALSE, TRUE, FALSE, FALSE, TRUE);
INSERT INTO patient (name, biological_age, room, entered, illness_type)
            VALUES ("William Dickes", 43, 11, DATE("1884-04-04"), 3),
                    ("Tinta Vince", 16, 23, DATE("2006-01-06"), 1),
                    ("Mumus", 45, 2, DATE("2120-09-30"), 2),
                    ("Kaffka Tibor", 41, 18, DATE("2020-04-05"), 6),
                    ("Alice", 17, 2, DATE("1866-07-01"), 10),
                    ("Sitka", 19, 24, DATE("2019-09-04"), 7),
                    ("Tyler Durden", 24, 2, DATE("1999-10-15"), 9),
                    ("Mia",21, 2, DATE("2132-02-17"), 2),
                    ("Sam Lowry", 38, 22, DATE("1985-08-30"), 2);
ALTER TABLE patient ADD CONSTRAINT FOREIGN KEY (illness_type) REFERENCES illness_type(id);
ALTER TABLE patient ADD CONSTRAINT FOREIGN KEY (room) REFERENCES room(num);


    PROGRAMS:    PROGRAMS:
----------------------
        sqlite:        sqlite:
----------------------

    • distributed model
    • reads/writes directly to memory
    • can use (single) files, runtime memory or be embedded into the executable

    sqlite3 [options] [FILE] [sql]  : interactive sqlite shell

    Common_commands:
        • these are special directives for the sqlite shell
        .shell [...]        : access the system shell
        .schema [table]     : get the CREATE command printed for a table
        .exit
        .quit

    
        "C/3th party libraries/sqlite3"

--------------------------
        postgresql:        postgresql:
--------------------------

    • server-client model

    psql [options] -U <username> -d <database>  : interactive postgresql shell

    Commands:
        \d <table>  : describe <table>

---------------------
        mysql:        mysql:
---------------------

    • server-client model

    
        • running mariadb
        mysql [...]

    Set_up: 
.       ### Running Mariadb ###
            1. Install the mariadb package
            2. Install mariadb      // necessery only once on a given system
                mariadb-install-db --user=mysql --basedir=/usr --datadir=/var/lib/mysql
            3. Run the daemon
                systemctl start mariadb
        #
        
        ### Using MySQL as a normal user ###    
            • since MySQL uses an internal system for identifying users a few hoops must be jumped
            • root is always preregistered
            1. Log in as root
                $ sudo mysql
            2. Create a new user
                CREATE USER '<your_name>'@'localhost' IDENTIFIED BY '<password>';
                •  <password> doesnt have to be the same as the user's system login password
            3. Grant privileges to the new user
                GRANT ALL PRIVILEGES ON *.* TO '<your_name>'@'localhost';
                FLUSH PRIVILEGES;
                •  this grants all privileges, which may not be a very bright idea on a shared system;
                        granting partial privileges is recommended and detailed BELOW
            4. Exit
                exit
            5. Log in as yourself
                mysql --user=<your_username> -p
                —  the following alias is recommended (see AT "/Bash/Builtins/alias")
                    alias mysql="mysql --user=${USER} -p"
        #

    mysql [options]     : starts (interactive) sql client
        --host=[domain]     : sets host
        --port=<int>        : sets port
        --user=<string>     : specifies username to use
        --password=<string> : specifies pass-word to use
        — p                  : prompt for pass-word; recommended over "--password" (see ABOVE)
        
            GNU Readline library    : used for all input operations (see AT "C++/Readline\/History")
        Files:
            .mysql_history  : contains the list of previously interactively entered lines
        Environment_variabes:
            ${EDITOR}       : read by the "edit" command (see BELOW)
            ${MYSQL_PS1}    : prompt string 1; the string which is displayed each time before the interpreter allows for input
        Commands:
            edit || \e      : opens a temporary file with the users set ${EDITOR}; uppon reaching the
                               next ';' the text from this temporary file is run as a script
            source [file]   : run [file] as a script
            use [database]  : selects [database] as the database to operate on
            exit            : terminate normally


    mycli [options] : mysql wrapper with auto completion and syntax highlighting
         flag -p is a pitfall, see BELOW
        — p <string>     : pass authentication password as <string>;
                           if ommited, securely asking on stdin is the default
        — u <string>     : pass user to use as <string>; the default is mysql

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        Other_implementations:        Other_implementations:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+---------------------------+---------------------------+
|      Parent Company       |          Database         |
+---------------------------+---------------------------+
| Microsoft Corporation     | Microsoft SQL Server      |
| Oracle Corporation        | Oracle Database           |
| IBM                       | IBM Db2                   |
| Amazon Web Services (AWS) | Amazon Aurora             |
| Google                    | Google Cloud Spanner      |
+---------------------------+---------------------------+

//------------------------------------
//------------------------------------

    FILES:    FILES:
        .sql        : SQL script
        .pls        : PL/SQL script
        .pks        : SQL package
        .pkb        : SQL package body
        .tab        : SQL script for table creation
        .tbl        : SQL script for table creation

    COMMENTS:    COMMENTS:
        -- <string>     : single line comment

    SEMICOLONS:    SEMICOLONS:
        • all sql statements are terminated by a single semicolon ';'
        • sql statements are not run until a semicolon is encountered
        • interpreter commands (see ABOVE) however dont take semicolons {exit}

    typenames:    typenames:
        Logical:
            BIT
            BOOLEAN
        Numeric:
            Integer:
                +-----------+---------+--------------------+----------------+
                | Name      | Storage |    Signed Range    | Unsigned Range |
                +-----------+---------+--------------------+----------------+
                | TINYINT   |       1 |          -128, 127 |         0, 255 |
                | SMALLINT  |       2 |      -32768, 32767 |       0, 65535 |
                | MEDIUMINT |       3 |  -8388608, 8388607 |    0, 16777215 |
                | INTEGER   |       4 |      -2^31, 2^31-1 |      0, 2^32-1 |
                | BIGINT    |       8 |      -2^63, 2^63-1 |      0, 2^64-1 |
                +-----------+---------+--------------------+----------------+
            DECIMAL
            NUMERIC
            FLOAT
            REAL
            DOUBLE PERCISION
        Time:
            DATE
            TIME
            DATETIME
            TIMESTAMP
            YEAR
            — zero values:
                +------------+---------------------+
                | Data Type  |    “Zero” Value     |
                +------------+---------------------+
                | DATE       |          0000-00-00 |
                | TIME       |            00:00:00 |
                | DATETIME   | 0000-00-00 00:00:00 |
                | TIMESTAMP  | 0000-00-00 00:00:00 |
                | YEAR       |                0000 |
                +------------+---------------------+
        String:
            CHAR(<int>)         : <int> long char array; uninitialized indexes are filled in with a specific value (usually spaces);
                                   used when the data is known to have a consistent length
            VARCHAR(<int>)      : "VARiable CHAR"; <int> long char array; uninitialized indexes are ignored, however preserved regardless
            BINARY
            VARBINARY
            BLOB
            TEXT
            ENUM
            SET 

    OPERATORS:    OPERATORS:
        General:
        Logical:
            [column] = [value]          : equals;  SQL uses a single '=' instead of 2 for
                                           comparisons oppose to most computer languages;
                                           however its only logical as it doesnt have an assignment operator
            [column] != [value]         : not equals
            [column] > [value]          : greater than
            [column] < [value]          : lesser than
            [column] >= [value]         : greater than or equals to
            [column] <= [value]         : lesser than or equals to
            [column] BETWEEN [value-1] AND [value-2]
            NOT [bool]
            [bool-1] AND [bool-2]
            [bool-1] OR [bool-2]
            [string-1] LIKE [string-2] : regex match
            [regex]:
                %   : match any number of chars
                _   : match exactly one char
        Numeric:
            — 
            +
            *
            /
            %   MOD
        String:
            ||      : concatenation; support is disabled by default; XXX: SET sql_mode='PIPES_AS_CONCAT';
        IN:
            pass
        ○ examples:
            -- LIKE; select patients with 'y' char in their name;
--    NOTE: normally either LOWER() or UPPER would be
--    called for humanly logical results,
--    however the example tries to keep it simple
> SELECT p.name FROM patient p WHERE p.name LIKE '%y%';
-- -----------
-- --- IN ---
> DELETE FROM patient WHERE biological_age IN (SELECT biological_age FROM patient p WHERE p.biological_age > 40);


    DDL:    DDL:
        • "Data Definition Language"
        CREATE [object]         : creates [object]
            DATABASE <string>                           : database named <string>
            TABLE <string> ([decl-1(, [decl-num])*])    : table named <string> with columns defined by [decl-uint]
                [decl]:
                    <string> <typename>( DEFAULT [value]( NOT NULL))
                    [value]:
                        <int>
                        "<string>"
                        CURRENT_TIMESTAMP

    DML:    DML:
        • "Data Manipulation Language"
        DELETE FROM [table] ([where])   :  not specifying [where] will select (ie. delete) all rows
        DROP [object] <string>
        INSERT INTO [table] ([columns]+) VALUES ([value]+)
        UPDATE [table] SET ([column] = [value])+ ([where])
        ALTER TABLE <string> [cmd]
            ADD [?!]
                [column]
                PRIMARY KEY([column name])
                CONSTRAINT [constraint]
                    [constraint]
                        <string> [type]
                            [type]
                                [column_type] [column]+
                                    PRIMARY KEY
                                    UNIQUE
                                [check_type]
                                    CHECK([logic])
                        FOREIGN KEY [column] REFERENCES [column] [on_delete]
                            [on_delete]:
                                    • the action to perform when the referenced column is delete-d
                                ON DELETE [wat_do]
                                    [wat_do]
                                        CASCADE     : remove children too
            DROP [column]
            MODIFY [column]
            RENAME COLUMN [string-1] TO [string-2]
            [var] = [value]
                AUTO_INCREMENT
        // --- CHECK constraint ---
> ALTER TABLE room ADD CONSTRAINT floor_num CHECK(floor BETWEEN -2 AND 100);


    DQL:    DQL:
        • "Data Query Langauge"
        SHOW [object]
            DATABASES
            TABLES
            COLUMNS FROM [table]
        DESCRIBE [table]
        Selecttion:
            [select] ([where] ([order]) ([limit]))
            [select] ([having] ([order]) ([limit]))
            [select]:           // note how select can select any value
                SELECT [subject] (AS <string>) FROM [table] (AS <string>)   : returns information according to [subject] from inside [table];
                                                                               see explanation of "AS" keywords BELOW as alias-ing
                    Aliasing:
                        [...] AS <string>       : gives [...] a temporary name for the span of the query; column names given by
                                                   functions are not very human readable, therefor its good practice to always alias these
                    [subject]:
                        [column]
                        ([rep])  *
                            [rep]:
                                ALL
                                DISTINCT
            [where]:
                WHERE [logic]   : specifies criteria to be met by selected data; [logic] is a sane logical
                                   expression composed of operator-s and data (see AT "../Operators")
            [having]:
                • as oppose to WHERE, its logic can contain aggregate functions
                HAVING [logic]
            [order]:
                ORDER BY [column] [aspect]
                    DESC
                    ASC
                    RAND()
            [limit]:
                LIMIT <int>         : limits the amount of records return-ed to a maximum of <int>
            GROUP BY
        View:
            CREATE VIEW [name] AS [selection];
            • a saved and named select query
            •  '*' expansion causes new columns to show up
        ○ examples
            { -- Most basic querry
> SELECT * FROM patient;
+----------------+----------------+------+------------+--------------+
| name           | biological_age | room | entered    | illness_type |
+----------------+----------------+------+------------+--------------+
| Alice          | 17             | 2    | 1866-07-01 | 10           |
| Kaffka Tibor   | 41             | 18   | 2020-04-05 | 6            |
| Mia            | 21             | 2    | 2132-02-17 | 2            |
| Mumus          | 45             | 2    | 2120-09-30 | 2            |
| Sam Lowry      | 38             | 22   | 1985-08-30 | 2            |
| Sitka          | 19             | 24   | 2019-09-04 | 7            |
| Tinta Vince    | 16             | 23   | 2006-01-06 | 1            |
| Tyler Durden   | 24             | 2    | 1999-10-15 | 9            |
| William Dickes | 43             | 11   | 1884-04-04 | 3            |
+----------------+----------------+------+------------+--------------+
-- -----------
-- DISTINT; list all rooms with someone living in it, discard duplicates
> SELECT DISTINCT p.room FROM patient p;
2
11
18
22
23
24
-- -----------
-- ORDER BY; list names in reverse alphabetical order
> SELECT p.name FROM patient p ORDER BY name DESC;
"William Dickes"
"Tyler Durden"
"Tinta Vince"
"Sitka"
"Sam Lowry"
"Mumus"
"Mia"
"Kaffka Tibor"
"Alice"
-- -------------
-- --- JOIN ---
    -- INNER
    > SELECT patient.name, room.floor FROM patient INNER JOIN room ON patient.room = room.num;
    Alice          | 1
    Mia            | 1
    Mumus          | 1
    Tyler Durden   | 1
    William Dickes | 2
    Kaffka Tibor   | 2
    Sam Lowry      | 2
    Tinta Vince    | 2
    Sitka          | 2
-- -------------
-- VIEW
> CREATE VIEW allp AS SELECT * FROM patient;
> SELECT * FROM allp;
...
-- HAVING; listing the room which have more than one occupant
> SELECT num FROM (room LEFT JOIN patient ON room.num = patient.room) HAVING COUNT(room.num) > 1;
-- -----------
-- NESTED SELECT; people who are older than the average
> SELECT name FROM patient WHERE biological_age > (SELECT AVG(biological_age) FROM patient);


    DCL:    DCL:
        • "Data Control Language"
        • deals with users and user permissions
        GRANT [what] TO [who]
            [what]:
                [role]
                [privilege] [subject]
                    [privileges]:
                        ALL PRIVILEGES
                        SELECT
                        UPDATE
                        DELETE
                    [subject]:
                        ON [database].[table]
                        ANY TABLE
            [who]:
                [user]
                [role]
        CREATE USER '[string-1]'@'[string-2]' IDENTIFIED BY '[string-3]'
        CREATE ROLE <string>

    TCL:    TCL:
        • "Transaction Control Language"
        • a transaction is group of safe SQL operations
        • if any one of the operations fail, the state of the database
          can be easily restored to the one before the transaction
        BEGIN                       : start a transaction
        SAVEPOINT <name>            : establish a state within a transaction which
                                       can also be rolled back to
        COMMIT                      : finish and finalize the transaction
        ROLLBACK (TO <savepoint>)   : restore the state to before the transaction
                                       or to a savepoint

    Functions:    Functions:
        RAND()
        NVL([value-1], [value-2])   : returns [value-1] unless it is NULL, in which case it returns [value-2]
        COALESCE([...])             : returns the first argument which is not NULL
        Column_oriented:
            COUNT([column])
            AVG([column])
            Number_specific:
                MIN([column])
                MAX([column])
                SUM([column])
        Num_oriented:
            ROUND([int-1], [int-2])     : returns [int-1] rounded to [int-2] nums of decimal places
            TRUNC([int-1], [int-2])     : returns [int-1] truncated to [int-2] nums of decimal places; [int-2] is free to me negative
            ABS([num])                  : returns the absolute value of [num]
            POWER([num], <int>)         : returns [num] raised to the <int>th power
            SQRT([num])                 : returns the square root of [num]
            Trigonometry:
                SIN([num])      : returns the sine
                COS([num])      : returns the cosine
                TAN([num])      : returns the tangent
                COT([num])      : returns the cotangent 
                ASIN([num])     : returns the arc sine
                ACOS([num])     : returns the arc cosine
                ATAN([num])     : returns the arc tangent 
        String_oriented:
            ASCII([char])                       : returns [char]s ascii code
            UPPER(<string>)                     : returns <string> converted to all upper case
            LOWER(<string>)                     : returns <string> converted to all lower case
            INITCAP(<string>)                   : returns <string> with initials capitalized
            LENGTH(<string>)                    : returns <string>s length; when operating on CHAR(<int>) the length is always equal to <int>
            TRIM(<string>)                      : returns <string> with all preceding and leading whitespaces removed
            TRIM([string-1] FROM [string-2])    : returns [string-2] with all preceding and leading chars contained in [string-1] removed
            REPLACE([string-1], [string-2])     : returns [string-1] with all occurrences of [string-2] delete-d
            REPLACE([string-1], [string-2], [string-3]) : returns [string-1] with all occurrences of [string-2] replaced with [string-3]
            SUBSTR(<string>, [int-1], [int-2])  : returns [int-2] chars from <string>s [int-1]th positions;
                                                   [int-1] is 1 indexed and negative values are counted from the back
            LPAD([string-1], <int>, [string-2]) : returns [string-1] cropped/left padded to <int> chars; [string-2] is the padding
            RPAD([string-1], <int>, [string-2]) : returns [string-1] cropped/right padded to <int> chars; [string-2] is the padding
            MD5(<string>)                       : returns md5 hash of <string>
        Date_oriented:
            STR_TO_DATE("<string>", "[format]")     : returns DATE constructed from <string> which is of format [format]
            DATE("<string>")                        : returns DATE constructed from <string> which is of format <string> is of the format "%Y-%m-%d"
            EXTRACT([dpart] FROM [DATE])            : return [dpart] value from [DATE]
            [dpart]([DATE])                         : return [dpart] value from [DATE]
            [dpart]:
                YEAR
                MONTH
                DAY
                HOUR
                MINUTE
                SECOND
            [format]:
                %a  : abbreviated weekday name (Sun to Sat)
                %b  : abbreviated month name (Jan to Dec)
                %c  : numeric month name (0 to 12)
                %D  : day of the month as a numeric value, followed by suffix (1st, 2nd, 3rd, ...)
                %d  : day of the month as a numeric value (01 to 31)
                %e  : day of the month as a numeric value (0 to 31)
                %f  : microseconds (000000 to 999999)
                %H  : hour (00 to 23)
                %h  : hour (00 to 12)
                %I  : hour (00 to 12)
                %i  : minutes (00 to 59)
                %j  : day of the year (001 to 366)
                %k  : hour (0 to 23)
                %l  : hour (1 to 12)
                %M  : month name in full (January to December)
                %m  : month name as a numeric value (01 to 12)
                %p  : aM or PM
                %r  : time in 12 hour AM or PM format (hh:mm:ss AM/PM)
                %S  : seconds (00 to 59)
                %s  : seconds (00 to 59)
                %T  : time in 24 hour format (hh:mm:ss)
                %U  : week where Sunday is the first day of the week (00 to 53)
                %u  : week where Monday is the first day of the week (00 to 53)
                %V  : week where Sunday is the first day of the week (01 to 53). Used with %X
                %v  : week where Monday is the first day of the week (01 to 53). Used with %X
                %W  : weekday name in full (Sunday to Saturday)
                %w  : day of the week where Sunday=0 and Saturday=6
                %X  : year for the week where Sunday is the first day of the week. Used with %V
                %x  : year for the week where Monday is the first day of the week. Used with %V
                %Y  : year as a numeric, 4-digit value
                %y  : year as a numeric, 2-digit value
        Case:
            CASE [logic] (THEN [value])+ (ELSE [value]) END
        ○ examples
            -- AVG
> SELECT AVG(p.biological_age) FROM patient p;
29.3333
-- -----------
-- COUNT
> SELECT COUNT(p.name) FROM patient p;
9
-- -----------
-- MIN/MAX
> SELECT MIN(p.biological_age) FROM patient p;
16
-- -----------
-- SUM
> SELECT SUM(p.biological_age) FROM patient p;
264
-- -----------
-- ROUND
> SELECT ROUND(AVG(p.biological_age), 0) FROM patient p;
29
-- -----------
-- STR_TO_DATE
> SELECT SUM(p.biological_age) FROM patient p;
"Alice"
"William Dickes"
-- -----------
-- CASE logic
> SELECT name, CASE WHEN biological_age > 35 THEN "old" ELSE "young" END AS "Youth" FROM patient;
+----------------+-------+
| name           | Youth |
+----------------+-------+
| Alice          | young |
| Kaffka Tibor   | old   |
| Mia            | young |
| Mumus          | old   |
| Sam Lowry      | old   |
| Sitka          | young |
| Tinta Vince    | young |
| Tyler Durden   | young |
| William Dickes | old   |
+----------------+-------+
-- -----------
-- New domain specific database/user
> CREATE DATABASE spring;
> CREATE USER 'spring'@'localhost' IDENTIFIED BY 'passwd';
> GRANT ALL PRIVILEGES ON spring.* TO 'spring'@'localhost';
> FLUSH PRIVILEGES;


    COMPLEX_EXAMPLES:    COMPLEX_EXAMPLES:
        -- Using the ANSII concatenation operator
    > SET sql_mode='PIPES_AS_CONCAT';
    > SELECT p.name || " is insame." AS Fact FROM patient p ORDER BY RAND() LIMIT 1;
    +------------------+
    | Fact             |
    +------------------+
    | Alice is insame. |
    +------------------+
-- Getting average of literals
    > SELECT SUM(t.s)/COUNT(t.s) FROM (SELECT 10 AS s UNION
                                        SELECT 90 AS s UNION
                                        SELECT 42 AS s) t;
    > SELECT AVG(t.s) FROM (SELECT 10 AS s UNION
                            SELECT 90 AS s UNION
                            SELECT 42 AS s) t;
-- Getting the name of the youngest
    > SELECT p.name FROM (SELECT * FROM patient ORDER BY biological_age LIMIT 1) as p;
    -- OR
    > SELECT p.name FROM patient p WHERE biological_age = (SELECT MIN(i.biological_age) FROM patient i);
    "Tinta Vince"
-- Getting the minimum and maximum biological age from patients
--  who entered durring the XXI. century
    > SELECT MIN(p.biological_age) AS "Min age", MAX(p.biological_age) AS "Max age"
        FROM patient p WHERE YEAR(p.entered) BETWEEN 2001 AND 2100;
    | Min age | Max age |
    | 16      | 41      |
-- Getting average age of floor 1
    > SELECT AVG(p.biological_age) "Average of floor 1"
        FROM (SELECT patient.biological_age, room.floor
                FROM patient INNER JOIN room ON patient.room = room.num) p
        WHERE p.floor = 1;
    26.7500
-- Getting average age by floor
    > SELECT AVG(p.biological_age) "Average age",
            p.floor
        FROM (SELECT patient.biological_age, room.floor
                FROM patient INNER JOIN room ON patient.room = room.num) p
        GROUP BY p.floor;
    26.7500  | 1
    31.4000  | 2
-- Getting a list of the psychopaths
    > SELECT p.name FROM patient p
        INNER JOIN illness_type t ON p.illness_type = t.id
        WHERE t.psychopaty = TRUE;
    "Tinta Vince"
    "Kaffka Tibor"
    "Sitka"
    "Tyler Durden"
-- Getting the number of occuped rooms
    > SELECT COUNT(DISTINCT p.room) FROM patient p;
    6


~~~~~~~~~~~~~~~~~~~~
    EMBEDDED SQL
~~~~~~~~~~~~~~~~~~~~

    "                                                               \
    When SQL was originally defined (at a time when NIH-PA Author   \
    Manuscript the concept of triggers did not exist) it lacked     \
    procedural constructs such as branching, loops and subroutines; \
    it was intended to be embedded in a host language that had      \
    these features.
    " - Guidelines for the effective use of entity–attribute–value modeling for biomedical databases

    • also made by IBM
    • the idea is to place literal SQL statements to source files
      of compiled programming languages
    • a special preprocessor generates regular source code with library calls
      and macros substituted to the place of the SQL statements

    Support:
        DB2:
            dclgen
            • "IBM Database 2"
            • embedding is part of the toolkit
            ○ languages
                • C/C++
                • FORTRAN
                • COBOL
                • REXX
        DBMS:
            • no info on how the tool chain is implemented
            ○ languages
                • C/C++
                • COBOL
                • MUMPS
        PostgreSQL:
            ecpg
            • seems to be the only FOSS implementation in existence
            ○ languages
                • C/C++
        Pro*C:
            • for Oracle Database

    
    Extensions:
        .ec(pp)     : C source with embedded SQL; refers to ESQL/C(++)
        .sqc        : C source with embedded SQL; combination of SQL and C
        .pgc        : C source with embedded SQL; PostgreSQL specific


    Syntax:
        <start_token> <statement> <end_token>
            <start_token>  -  EXEC SQL
            <statement>    -  spiced SQL statement; see BELOW
            <end_token>    -  ';' not inside a string literal

        Special_Statements:
            CONNECT TO <db> AS <connection>;
            DISCONNECT <con>
                <connection>    : one specific connection
                ALL             : every connection

        Variables:
            :<name>
            • native language variable interfering with sql

    Example:
        $ psql -U anon -d testdb
psql (15.2)
Type "help" for help.

testdb=> \d test
                  Table "public.test"
 Column |     Type     | Collation | Nullable | Default
--------+--------------+-----------+----------+---------
 i      | integer      |           |          |
 text   | character(4) |           |          |

testdb=> select * from test;
 i | text
---+------
 1 | aaaa
 3 | bbbb
(2 rows)

        // @BAKE ecpg $@
// embeded_test.sqc
#include <stdio.h>

signed main(){
    EXEC SQL BEGIN DECLARE SECTION;
    int i;
    char c[4];
    EXEC SQL END DECLARE SECTION;

    EXEC SQL CONNECT TO testdb AS myconnection;
    EXEC SQL SELECT i, text INTO :i, :c FROM test WHERE i = 1;

    puts(c);

    return i;
}


--------------
    SQLgen
--------------

    • "SQL GENerator"
    • 90's
    • library to ease creating graphical clients for databases
    • metadata itself is specified as SQL
    {
            User
             A
             |
          +--V---- Client Application -----------------+
          |                        +------------------+|
          | application specific   |  SQLgen metadata ||
          |        objects         |        |         ||
          |                     <-----> libSQLgen     ||
          |                        +--------A---------+|
          +---------------------------------|----------+
                                            |
                                            |
                                  +---------V----------+
                                  |   Database server  |
                                  +--------------------+
    }


nosql
#define nosql:: \
I----------------------------------------------------------\
I----------------------------------------------------------\
I----------------------------------------------------------\
I     /$$   /$$            /$$$$$$   /$$$$$$  /$$          \
I    | $$$ | $$           /$$__  $$ /$$__  $$| $$          \
I    | $$$$| $$  /$$$$$$ | $$  \__/| $$  \ $$| $$          \
I    | $$ $$ $$ /$$__  $$|  $$$$$$ | $$  | $$| $$          \
I    | $$  $$$$| $$  \ $$ \____  $$| $$  | $$| $$          \
I    | $$\  $$$| $$  | $$ /$$  \ $$| $$/$$ $$| $$          \
I    | $$ \  $$|  $$$$$$/|  $$$$$$/|  $$$$$$/| $$$$$$$$    \
I    |__/  \__/ \______/  \______/  \____ $$$|________/    \
I                                        \__/              \
I----------------------------------------------------------\
I----------------------------------------------------------\
I----------------------------------------------------------I

    • umbrella term for databases which do no support SQL,
      but are employed in similar environments
    • primarily vertical databases created by developers fed-up with EAV
    
        • real use-case
    
        • bad implementations

    Mongodb:
        • nudev solution for data management
        • uses OO javascript
        • each record is stored as JSON
        • no tables, only collections which are arbitrary clusters of records
        •  not recommended
        Programs:
            mongodb     : daemon
            mongo       : shell
        Language:
            • based on Js
            use <database>      : select the current database to <databases>;
                                   if no such databse exists, its created
            namespace db
                <database>
                    insertMany(list)
                    find(object)