databases

#define databases: \ I--------------------------------------------------------------------------------------------------\ I--------------------------------------------------------------------------------------------------\ I--------------------------------------------------------------------------------------------------\ I /$$$$$$$ /$$ /$$ \ I | $$__ $$ | $$ | $$ \ I | $$ \ $$ /$$$$$$ /$$$$$$ /$$$$$$ | $$$$$$$ /$$$$$$ /$$$$$$$ /$$$$$$ /$$$$$$$ \ I | $$ | $$ |____ $$|_ $$_/ |____ $$| $$__ $$ |____ $$ /$$_____/ /$$__ $$ /$$_____/ \ I | $$ | $$ /$$$$$$$ | $$ /$$$$$$$| $$ \ $$ /$$$$$$$| $$$$$$ | $$$$$$$$| $$$$$$ \ I | $$ | $$ /$$__ $$ | $$ /$$ /$$__ $$| $$ | $$ /$$__ $$ \____ $$| $$_____/ \____ $$ \ I | $$$$$$$/| $$$$$$$ | $$$$/| $$$$$$$| $$$$$$$/| $$$$$$$ /$$$$$$$/| $$$$$$$ /$$$$$$$/ \ I |_______/ \_______/ \___/ \_______/|_______/ \_______/|_______/ \_______/|_______/ \ I--------------------------------------------------------------------------------------------------\ I--------------------------------------------------------------------------------------------------\ I--------------------------------------------------------------------------------------------------\ I--------------------------------------------------------------------------------------------------I • see more AT "/SQL" • most database engines have adopted a client-server architecture, which is the right call in business settings, however databases are common in standalone desktop applications too where it makes very little sense; please question what you are doing before shooting yourself in the foot without ever even realizing there is a less painful way; as of 2025 your only option is sqlite ○ types • tree {filesystems} • graph • relational {sql} • nosql (non-relational) {JSON databases} • most things here implicitly apply to relational databases; because that's by far the most widely used database model Table: • structure of arrays • a predefined list of attributes and entries • table names are usually placed to plural {"users", "tickets"} • technically just a bunch of struct-s (with some implementation details), but its generally better to imagine as a vertical list • a record is an entry inside a table • using a terminology for those familiar with OOP, a table is like a class, a record is like an instance { Table "users" | ID | Name | Email | +----+-----------+--------------------+ • 0 emil admin@chud.cyou • 1 xolatile xolatile@chud.cyou } Dependency: • when a value can be inferred from another value • generally a sign of poor database design, however rarely, its deliberate to address performance concerns { // Dependency special case example // ?!; maybe move /* usually emails are stored as simple strings. * this is normalized to the degree that such is generally * not considered a 1NF violation. * however one could have a use-case for storing them as such: */ | user | domain | top_level_domain | • anon chud cyou /* but we must have a wholistic view of our operations: * >querying based on components is rare * >fetching whole emails is common * keep in mind that to yield email addresses we will have to * reconstruct them at some point * therefor, as a result of our database architecture plan, * common operations would take a performance hit * the first thing to ask is "do i really need to store email components?" * and if the result is a confident "yes, absolutely", * then the following is your solution, even tho generally you should NOT do it: */ | email | user | domain | top_level_domain | • anon@chud.cyou anon chud cyou } Functional: • 2 attribute's values have a one-to-one correspondence • "egyértelmű hozzárendelés"^HU Transitive: • when a dependency has a dependency • indirect functional dependency { A->B and B->C therefor A->C } Relationships: • a record expressing belonging to another record (which could be from a different table) • some pointer will have to be stored, this could be simply an ID number or a managed pointer (called a foreign key), depending on the database and application in question one_to_one: • one record belongs to exactly one other record /* Assuming every user has to register with an email, * and one email can be used only for one registration. */ Table "users" Table "emails" | Username | Email | | ID | Name | Domain | +----------+-------+ +----+------+------------+ > emil 1 ────▶ 1 admin chud.cyou one_to_many: • the way to express lists purely with relationships • multiple records will be pointing to the same record Table "comments" Table "posts" | ID | Text | Post | | ID | Text | > 1 a... 1 ──┬─▶ > 1 ... > 2 b... 1 ──┤ > 3 c... 1 ──┘ many_to_many: • lists belong to lists • a pivot table is a table expressing only relationships • a pivot table is required to express many-to-many relationships in relational databases • pivot tables are usually not displayed when creating database graphs • the name of a pivot table is usually the combination of the tables it joins "students" "studentsCourses" "courses" | ID | | student | course | | ID | > 1 1 1 > 1 > 2 1 2 > 2 > 3 2 3 > 3 3 3 /* The arrows are omitted to prevent confusion. * The way this can be read is: * > student one took course one and two * > course three was taken by student two and three * Notice how the order it is put is irrelevant, * there is no inherent one way or other. */ zero_or_one_to_one:"0/1-1" • pattern • null-ed foreign key dependent on another field Key: • a key is a set of attributes by which a record is uniquely identifiable • one or more columns of the original data set or an added id (num) row Superkey: • the set of attribute sets which can uniquely identify a record • cannot contain attributes with possible NULL values • the element of the super-key consisting of the most attributes is called the trivial superkey • all attributes excluded from the super-key form a candidate key Candidate_key: • all elements of the super-key which have no real subsets in the super-key Primary_key: conventionally, when a table is displayed, the primary key is always the first column • many modern database engines always add an implicit primary if one was not specified • a freely chosen candidate key or a newly added id row which represents a freely chosen candidate key • a natural key is a candidate key chosen to be the primary key • an artifical key is a newly given, enum-erating primary key — adding an id row is prefered when: • the chosen candidate key consists of multiple attributes • if the candidate key is an attribute holding complex data (long strings which are relatively slow to read) • id rows are most often incremental, ie. each new elements id is the previous ones + 1 View: • virtual table — simplifies things: • queries become shorter • joins become transparent • allows for row oriented security Normal_forms: 0.: • anything that is not in 1st normal form 1.: • primary keys are present • every field is atomic ie. at no point will have to be split into multiple parts by a user • normalizing a data structure from 0th normal form strictly to 1st normal form doesnt result in any new tables being created // People table; we know that parts of the address will be handled independently Name | Age | Address | Anon | 18 | 1204 Budapest, Nagysándor József utca 93 | [...] // Therefor we organize the individual parts into different columns Name | Age | Post code | City | Address | Address number | Anon | 18 | 1204 | Budapest | Nagysándor József utca | 93 | [...] 2.: • auto when the primary key is a single attribute • functional dependencies of the primary key are eliminated 3.: • all candidate key related functional dependencies are eliminated 3.5. (Boyce-Codd): • all functional dependencies are eliminated • a table in 3 normal form without multiple overlapping candidate keys is always also in 3.5. normal form • what you should generally aim for 4.: pass 5.: pass Object_diagramm: Attribute: // its name and optionally its type in a ellipse .--------------. / \ ( [name] ) \ / ^--------------^ Multi_value_attribute: • list/set/etc. // attribute with an additional ellipse .--------------. / .------------. \ ( ( [name] ) ) \ ^------------^ / ^--------------^ Key: // attribute with the inner text underlined .--------------. / \ ( đ[name]đ ) \ / ^--------------^ Partial_key: // attribute with the inner text non-continuously underlined .--------------. / \ ( [name] ) \ ‾ ‾ ‾ ‾ / ^--------------^ struct: • attribute made up from multiple attributes // 2 or more attributes connected with a continuous line .--------------. .--------------. / \ / \ ( [att1] )──────────( [att2] ) \ / \ / ^--------------^ ^--------------^ Entity: • connects attributes // rectangle .--------------. +----------------+ .--------------. / \ | | / \ ( [att1] )──────────| [name] |──────────( [att2] ) \ / | | \ / ^--------------^ +----------------+ ^--------------^ │ │ .--------------. / \ ( [att2] ) \ / ^--------------^ Relation: • always connected to atleast 2 structs with continuous lines • the num of attributes in a relation is called its degree // rhombus _-^-_ .--------------. _-^ ^-_ .--------------. / \ _-^ ^-_ / \ ( [att1] )──────────< [name] >──────────( [att2] ) \ / ^-_ _-^ \ / ^--------------^ ^-_ _-^ ^--------------^ ^-_-^ Vertical_database: • used for storing highly heterogeneous and or dynamic data • common is scientific research and electronic patient record systems — where you want a database, but have little to no clue about the schema: • the data-safety is a must • you wish to have external support from a big company • new things keep popping up EAV: https://www.cedanet.com.au/antipatterns/eav.php https://www.red-gate.com/simple-talk/opinion/opinion-pieces/bad-carma/ "\ In other words, EAV gives you enough rope to hang yourself and \ in this industry, things should be designed to the lowest level \ of complexity because the guy replacing you on the project will \ likely be an idiot. \ ""Entity-Attribute-Value"/"Object-Attribute-Value""implementing a database inside a database""column to row transformation" (called pivoting) • pattern • vertical databases implemented in a relational database system • widely frowned upon • no FOSS implementations at all • under EAV most queries will require procedural code • an abstraction layer must be provided to hide the complexity • a query generator will have to be implemented • comes up when the requirements are inadequate to create an application (due to poor communication or otherwise) • is a relational database engine • adding new types shifts into an administrative responsibility • hard to maintain • hard to comprehend • slow • requires much work up-front • administrators are usually not very good at adding new types — by-passes many builtin relational database features: • constraints • disk optimization • query optimization • views • caching • time-sharing • trigger based validation entity: • all entities are stored in the same table • the non-descriptive name "entity" is used on purpose, because it marks anything from living beings to experiment findings • all entities reference a metadata (see BELOW) • multiple entity entries form a single coherent record CREATE TABLE entity ( id INT PRIMARY KEY, attibute INT FOREIGN KEY(metadata), value string ); • optionally the entity could be a foreign key itself, referencing a conventional table {there is a users table with their rudimentary personal data, but their other "belongings" are stored in EAV } attribute: • metadata • stores the type of an entity • no standard way to implement it — rough approximation of columns: • validation {type; range; regex} • GUI presentation • normal value (in a clinical sense) • grouping (parent/sibling relationship for something like a form) • description • keywords (for searching) value: • the concrete data • unlike the rest, it would be stored in a field normally too • the simplest implementation is to use a string • could come with an extra field, which marks what table the value is a foreign key to; ie. you could have a value table for each SQL datatype, and the value column reference one of them EAV_CR: PMID: 10579606 • "EAV with Classes and Relationships" • the OOP of EAV • allows for nesting • the Value becomes the foreign key of a class table

sql

#define sql:: \ I------------------------------------------------------\ I------------------------------------------------------\ I------------------------------------------------------\ I /$$$$$$ /$$$$$$ /$$ \ I /$$__ $$ /$$__ $$| $$ \ I | $$ \__/| $$ \ $$| $$ \ I | $$$$$$ | $$ | $$| $$ \ I \____ $$| $$ | $$| $$ \ I /$$ \ $$| $$/$$ $$| $$ \ I | $$$$$$/| $$$$$$/| $$$$$$$$ \ I \______/ \____ $$$|________/ \ I \__/ \ I------------------------------------------------------\ I------------------------------------------------------\ I------------------------------------------------------I "Structured Query Language" • database management language • initially developed by IBM • ANSI standard • attempted to be usable by businessmen, hence the aspiration to resemble natural language; failed miserably as such aspirations repeatedly do • structurally and logically good, the language semantics are highly questionable, but well legible given practice • SQL is split into multiple "component" languages; these are very similar in logic and syntax, but are concerned with different aspects of database management; each is detailed seperately Example_Data: • the following describes an insane asylum • in this hypothetical place patients dont age after entry +====================================================================+ | Patient | +====================================================================+ | name | biological_age | room | entered | illness_type | +====================================================================+ | Alice | 17 | 2 | 1866-07-01 | 10 | | Kaffka Tibor | 41 | 18 | 2020-04-05 | 6 | | Mia | 21 | 2 | 2132-02-17 | 2 | | Mumus | 45 | 2 | 2120-09-30 | 2 | | Sam Lowry | 38 | 22 | 1985-08-30 | 2 | | Sitka | 19 | 24 | 2019-09-04 | 7 | | Tinta Vince | 16 | 23 | 2006-01-06 | 1 | | Tyler Durden | 24 | 2 | 1999-10-15 | 9 | | William Dickes | 43 | 11 | 1884-04-04 | 3 | +----------------+----------------+------+------------+--------------+ +===============================================================+ I Illness States I +===============================================================+ I psychopaty | delusion | obsession | ADHD | schizophrenia | id I +===============================================================+ | 1 | 0 | 0 | 0 | 0 | 1 | | 0 | 1 | 0 | 0 | 0 | 2 | | 0 | 0 | 1 | 0 | 0 | 3 | | 0 | 0 | 0 | 1 | 0 | 4 | | 0 | 0 | 0 | 0 | 1 | 5 | | 1 | 1 | 0 | 0 | 0 | 6 | | 1 | 0 | 0 | 1 | 0 | 7 | | 1 | 1 | 0 | 1 | 0 | 8 | | 1 | 0 | 1 | 0 | 1 | 9 | | 0 | 1 | 0 | 0 | 1 | 10 | +------------+----------+-----------+------+---------------+----+ +====================+ | Room | +====================+ | num | floor | beds | +====================+ | 2 | 1 | 4 | | 11 | 2 | 3 | | 18 | 2 | 2 | | 22 | 2 | 1 | | 23 | 2 | 1 | | 24 | 2 | 1 | | 25 | 2 | 1 | +-----+-------+------+ ○ you may reproduce it with the following (in MySQL) { CREATE DATABASE Asylum; CREATE TABLE patient (name VARCHAR(128) DEFAULT "N/A" NOT NULL, biological_age INT NOT NULL, room INT, entered DATE DEFAULT CURRENT_TIMESTAMP NOT NULL, illness_type INT NOT NULL); CREATE TABLE room (num INT not NULL, floor INT DEFAULT 1 NOT NULL, beds INT DEFAULT 1); CREATE TABLE illness_type (psychopaty BOOLEAN DEFAULT FALSE, delusion BOOLEAN DEFAULT FALSE, obsession BOOLEAN DEFAULT FALSE, ADHD BOOLEAN DEFAULT FALSE, schizophrenia BOOLEAN DEFAULT FALSE); ALTER TABLE patient ADD PRIMARY KEY(name); ALTER TABLE room ADD PRIMARY KEY(num); ALTER TABLE illness_type ADD id INT AUTO_INCREMENT PRIMARY KEY; INSERT INTO room (num, floor, beds) VALUES (2, 1, 4), (11, 2, 3), (18, 2, 2), (22, 2, 1), (23, 2, 1), (24, 2, 1), (25, 2, 1); INSERT INTO illness_type (psychopaty, delusion, obsession, ADHD, schizophrenia) VALUES (TRUE, FALSE, FALSE, FALSE, FALSE), (FALSE, TRUE, FALSE, FALSE, FALSE), (FALSE, FALSE, TRUE, FALSE, FALSE), (FALSE, FALSE, FALSE, TRUE, FALSE), (FALSE, FALSE, FALSE, FALSE, TRUE), (TRUE, TRUE, FALSE, FALSE, FALSE), (TRUE, FALSE, FALSE, TRUE, FALSE), (TRUE, TRUE, FALSE, TRUE, FALSE), (TRUE, FALSE, TRUE, FALSE, TRUE), (FALSE, TRUE, FALSE, FALSE, TRUE); INSERT INTO patient (name, biological_age, room, entered, illness_type) VALUES ("William Dickes", 43, 11, DATE("1884-04-04"), 3), ("Tinta Vince", 16, 23, DATE("2006-01-06"), 1), ("Mumus", 45, 2, DATE("2120-09-30"), 2), ("Kaffka Tibor", 41, 18, DATE("2020-04-05"), 6), ("Alice", 17, 2, DATE("1866-07-01"), 10), ("Sitka", 19, 24, DATE("2019-09-04"), 7), ("Tyler Durden", 24, 2, DATE("1999-10-15"), 9), ("Mia",21, 2, DATE("2132-02-17"), 2), ("Sam Lowry", 38, 22, DATE("1985-08-30"), 2); ALTER TABLE patient ADD CONSTRAINT FOREIGN KEY (illness_type) REFERENCES illness_type(id); ALTER TABLE patient ADD CONSTRAINT FOREIGN KEY (room) REFERENCES room(num); PROGRAMS: ---------------------- sqlite: ---------------------- • distributed model • reads/writes directly to memory • can use (single) files, runtime memory or be embedded into the executable sqlite3 [options] [FILE] [sql] : interactive sqlite shell Common_commands: • these are special directives for the sqlite shell .shell [...] : access the system shell .schema [table] : get the CREATE command printed for a table .exit .quit "C/3th party libraries/sqlite3" -------------------------- postgresql: -------------------------- • server-client model psql [options] -U <username> -d <database> : interactive postgresql shell Commands: \d <table> : describe <table> --------------------- mysql: --------------------- • server-client model • running mariadb mysql [...] Set_up: . ### Running Mariadb ### 1. Install the mariadb package 2. Install mariadb // necessery only once on a given system mariadb-install-db --user=mysql --basedir=/usr --datadir=/var/lib/mysql 3. Run the daemon systemctl start mariadb # ### Using MySQL as a normal user ### • since MySQL uses an internal system for identifying users a few hoops must be jumped • root is always preregistered 1. Log in as root $ sudo mysql 2. Create a new user CREATE USER '<your_name>'@'localhost' IDENTIFIED BY '<password>'; • <password> doesnt have to be the same as the user's system login password 3. Grant privileges to the new user GRANT ALL PRIVILEGES ON *.* TO '<your_name>'@'localhost'; FLUSH PRIVILEGES; • this grants all privileges, which may not be a very bright idea on a shared system; granting partial privileges is recommended and detailed BELOW 4. Exit exit 5. Log in as yourself mysql --user=<your_username> -p — the following alias is recommended (see AT "/Bash/Builtins/alias") alias mysql="mysql --user=${USER} -p" # mysql [options] : starts (interactive) sql client --host=[domain] : sets host --port=<int> : sets port --user=<string> : specifies username to use --password=<string> : specifies pass-word to use — p : prompt for pass-word; recommended over "--password" (see ABOVE) GNU Readline library : used for all input operations (see AT "C++/Readline\/History") Files: .mysql_history : contains the list of previously interactively entered lines Environment_variabes: ${EDITOR} : read by the "edit" command (see BELOW) ${MYSQL_PS1} : prompt string 1; the string which is displayed each time before the interpreter allows for input Commands: edit || \e : opens a temporary file with the users set ${EDITOR}; uppon reaching the next ';' the text from this temporary file is run as a script source [file] : run [file] as a script use [database] : selects [database] as the database to operate on exit : terminate normally mycli [options] : mysql wrapper with auto completion and syntax highlighting flag -p is a pitfall, see BELOW — p <string> : pass authentication password as <string>; if ommited, securely asking on stdin is the default — u <string> : pass user to use as <string>; the default is mysql ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Other_implementations: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +---------------------------+---------------------------+ | Parent Company | Database | +---------------------------+---------------------------+ | Microsoft Corporation | Microsoft SQL Server | | Oracle Corporation | Oracle Database | | IBM | IBM Db2 | | Amazon Web Services (AWS) | Amazon Aurora | | Google | Google Cloud Spanner | +---------------------------+---------------------------+ //------------------------------------ //------------------------------------ FILES: .sql : SQL script .pls : PL/SQL script .pks : SQL package .pkb : SQL package body .tab : SQL script for table creation .tbl : SQL script for table creation COMMENTS: -- <string> : single line comment SEMICOLONS: • all sql statements are terminated by a single semicolon ';' • sql statements are not run until a semicolon is encountered • interpreter commands (see ABOVE) however dont take semicolons {exit} typenames: Logical: BIT BOOLEAN Numeric: Integer: +-----------+---------+--------------------+----------------+ | Name | Storage | Signed Range | Unsigned Range | +-----------+---------+--------------------+----------------+ | TINYINT | 1 | -128, 127 | 0, 255 | | SMALLINT | 2 | -32768, 32767 | 0, 65535 | | MEDIUMINT | 3 | -8388608, 8388607 | 0, 16777215 | | INTEGER | 4 | -2^31, 2^31-1 | 0, 2^32-1 | | BIGINT | 8 | -2^63, 2^63-1 | 0, 2^64-1 | +-----------+---------+--------------------+----------------+ DECIMAL NUMERIC FLOAT REAL DOUBLE PERCISION Time: DATE TIME DATETIME TIMESTAMP YEAR — zero values: +------------+---------------------+ | Data Type | “Zero” Value | +------------+---------------------+ | DATE | 0000-00-00 | | TIME | 00:00:00 | | DATETIME | 0000-00-00 00:00:00 | | TIMESTAMP | 0000-00-00 00:00:00 | | YEAR | 0000 | +------------+---------------------+ String: CHAR(<int>) : <int> long char array; uninitialized indexes are filled in with a specific value (usually spaces); used when the data is known to have a consistent length VARCHAR(<int>) : "VARiable CHAR"; <int> long char array; uninitialized indexes are ignored, however preserved regardless BINARY VARBINARY BLOB TEXT ENUM SET OPERATORS: General: Logical: [column] = [value] : equals; SQL uses a single '=' instead of 2 for comparisons oppose to most computer languages; however its only logical as it doesnt have an assignment operator [column] != [value] : not equals [column] > [value] : greater than [column] < [value] : lesser than [column] >= [value] : greater than or equals to [column] <= [value] : lesser than or equals to [column] BETWEEN [value-1] AND [value-2] NOT [bool] [bool-1] AND [bool-2] [bool-1] OR [bool-2] [string-1] LIKE [string-2] : regex match [regex]: % : match any number of chars _ : match exactly one char Numeric: — + * / % MOD String: || : concatenation; support is disabled by default; XXX: SET sql_mode='PIPES_AS_CONCAT'; IN: pass ○ examples: -- LIKE; select patients with 'y' char in their name; -- NOTE: normally either LOWER() or UPPER would be -- called for humanly logical results, -- however the example tries to keep it simple > SELECT p.name FROM patient p WHERE p.name LIKE '%y%'; -- ----------- -- --- IN --- > DELETE FROM patient WHERE biological_age IN (SELECT biological_age FROM patient p WHERE p.biological_age > 40); DDL:"Data Definition Language" CREATE [object] : creates [object] DATABASE <string> : database named <string> TABLE <string> ([decl-1(, [decl-num])*]) : table named <string> with columns defined by [decl-uint] [decl]: <string> <typename>( DEFAULT [value]( NOT NULL)) [value]: <int> "<string>" CURRENT_TIMESTAMP DML:"Data Manipulation Language" DELETE FROM [table] ([where]) : not specifying [where] will select (ie. delete) all rows DROP [object] <string> INSERT INTO [table] ([columns]+) VALUES ([value]+) UPDATE [table] SET ([column] = [value])+ ([where]) ALTER TABLE <string> [cmd] ADD [?!] [column] PRIMARY KEY([column name]) CONSTRAINT [constraint] [constraint] <string> [type] [type] [column_type] [column]+ PRIMARY KEY UNIQUE [check_type] CHECK([logic]) FOREIGN KEY [column] REFERENCES [column] [on_delete] [on_delete]: • the action to perform when the referenced column is delete-d ON DELETE [wat_do] [wat_do] CASCADE : remove children too DROP [column] MODIFY [column] RENAME COLUMN [string-1] TO [string-2] [var] = [value] AUTO_INCREMENT // --- CHECK constraint --- > ALTER TABLE room ADD CONSTRAINT floor_num CHECK(floor BETWEEN -2 AND 100); DQL:"Data Query Langauge" SHOW [object] DATABASES TABLES COLUMNS FROM [table] DESCRIBE [table] Selecttion: [select] ([where] ([order]) ([limit])) [select] ([having] ([order]) ([limit])) [select]: // note how select can select any value SELECT [subject] (AS <string>) FROM [table] (AS <string>) : returns information according to [subject] from inside [table]; see explanation of "AS" keywords BELOW as alias-ing Aliasing: [...] AS <string> : gives [...] a temporary name for the span of the query; column names given by functions are not very human readable, therefor its good practice to always alias these [subject]: [column] ([rep]) * [rep]: ALL DISTINCT [where]: WHERE [logic] : specifies criteria to be met by selected data; [logic] is a sane logical expression composed of operator-s and data (see AT "../Operators") [having]: • as oppose to WHERE, its logic can contain aggregate functions HAVING [logic] [order]: ORDER BY [column] [aspect] DESC ASC RAND() [limit]: LIMIT <int> : limits the amount of records return-ed to a maximum of <int> GROUP BY View: CREATE VIEW [name] AS [selection]; • a saved and named select query • '*' expansion causes new columns to show up ○ examples { -- Most basic querry > SELECT * FROM patient; +----------------+----------------+------+------------+--------------+ | name | biological_age | room | entered | illness_type | +----------------+----------------+------+------------+--------------+ | Alice | 17 | 2 | 1866-07-01 | 10 | | Kaffka Tibor | 41 | 18 | 2020-04-05 | 6 | | Mia | 21 | 2 | 2132-02-17 | 2 | | Mumus | 45 | 2 | 2120-09-30 | 2 | | Sam Lowry | 38 | 22 | 1985-08-30 | 2 | | Sitka | 19 | 24 | 2019-09-04 | 7 | | Tinta Vince | 16 | 23 | 2006-01-06 | 1 | | Tyler Durden | 24 | 2 | 1999-10-15 | 9 | | William Dickes | 43 | 11 | 1884-04-04 | 3 | +----------------+----------------+------+------------+--------------+ -- ----------- -- DISTINT; list all rooms with someone living in it, discard duplicates > SELECT DISTINCT p.room FROM patient p; 2 11 18 22 23 24 -- ----------- -- ORDER BY; list names in reverse alphabetical order > SELECT p.name FROM patient p ORDER BY name DESC; "William Dickes" "Tyler Durden" "Tinta Vince" "Sitka" "Sam Lowry" "Mumus" "Mia" "Kaffka Tibor" "Alice" -- ------------- -- --- JOIN --- -- INNER > SELECT patient.name, room.floor FROM patient INNER JOIN room ON patient.room = room.num; Alice | 1 Mia | 1 Mumus | 1 Tyler Durden | 1 William Dickes | 2 Kaffka Tibor | 2 Sam Lowry | 2 Tinta Vince | 2 Sitka | 2 -- ------------- -- VIEW > CREATE VIEW allp AS SELECT * FROM patient; > SELECT * FROM allp; ... -- HAVING; listing the room which have more than one occupant > SELECT num FROM (room LEFT JOIN patient ON room.num = patient.room) HAVING COUNT(room.num) > 1; -- ----------- -- NESTED SELECT; people who are older than the average > SELECT name FROM patient WHERE biological_age > (SELECT AVG(biological_age) FROM patient); DCL:"Data Control Language" • deals with users and user permissions GRANT [what] TO [who] [what]: [role] [privilege] [subject] [privileges]: ALL PRIVILEGES SELECT UPDATE DELETE [subject]: ON [database].[table] ANY TABLE [who]: [user] [role] CREATE USER '[string-1]'@'[string-2]' IDENTIFIED BY '[string-3]' CREATE ROLE <string> TCL:"Transaction Control Language" • a transaction is group of safe SQL operations • if any one of the operations fail, the state of the database can be easily restored to the one before the transaction BEGIN : start a transaction SAVEPOINT <name> : establish a state within a transaction which can also be rolled back to COMMIT : finish and finalize the transaction ROLLBACK (TO <savepoint>) : restore the state to before the transaction or to a savepoint Functions: RAND() NVL([value-1], [value-2]) : returns [value-1] unless it is NULL, in which case it returns [value-2] COALESCE([...]) : returns the first argument which is not NULL Column_oriented: COUNT([column]) AVG([column]) Number_specific: MIN([column]) MAX([column]) SUM([column]) Num_oriented: ROUND([int-1], [int-2]) : returns [int-1] rounded to [int-2] nums of decimal places TRUNC([int-1], [int-2]) : returns [int-1] truncated to [int-2] nums of decimal places; [int-2] is free to me negative ABS([num]) : returns the absolute value of [num] POWER([num], <int>) : returns [num] raised to the <int>th power SQRT([num]) : returns the square root of [num] Trigonometry: SIN([num]) : returns the sine COS([num]) : returns the cosine TAN([num]) : returns the tangent COT([num]) : returns the cotangent ASIN([num]) : returns the arc sine ACOS([num]) : returns the arc cosine ATAN([num]) : returns the arc tangent String_oriented: ASCII([char]) : returns [char]s ascii code UPPER(<string>) : returns <string> converted to all upper case LOWER(<string>) : returns <string> converted to all lower case INITCAP(<string>) : returns <string> with initials capitalized LENGTH(<string>) : returns <string>s length; when operating on CHAR(<int>) the length is always equal to <int> TRIM(<string>) : returns <string> with all preceding and leading whitespaces removed TRIM([string-1] FROM [string-2]) : returns [string-2] with all preceding and leading chars contained in [string-1] removed REPLACE([string-1], [string-2]) : returns [string-1] with all occurrences of [string-2] delete-d REPLACE([string-1], [string-2], [string-3]) : returns [string-1] with all occurrences of [string-2] replaced with [string-3] SUBSTR(<string>, [int-1], [int-2]) : returns [int-2] chars from <string>s [int-1]th positions; [int-1] is 1 indexed and negative values are counted from the back LPAD([string-1], <int>, [string-2]) : returns [string-1] cropped/left padded to <int> chars; [string-2] is the padding RPAD([string-1], <int>, [string-2]) : returns [string-1] cropped/right padded to <int> chars; [string-2] is the padding MD5(<string>) : returns md5 hash of <string> Date_oriented: STR_TO_DATE("<string>", "[format]") : returns DATE constructed from <string> which is of format [format] DATE("<string>") : returns DATE constructed from <string> which is of format <string> is of the format "%Y-%m-%d" EXTRACT([dpart] FROM [DATE]) : return [dpart] value from [DATE] [dpart]([DATE]) : return [dpart] value from [DATE] [dpart]: YEAR MONTH DAY HOUR MINUTE SECOND [format]: %a : abbreviated weekday name (Sun to Sat) %b : abbreviated month name (Jan to Dec) %c : numeric month name (0 to 12) %D : day of the month as a numeric value, followed by suffix (1st, 2nd, 3rd, ...) %d : day of the month as a numeric value (01 to 31) %e : day of the month as a numeric value (0 to 31) %f : microseconds (000000 to 999999) %H : hour (00 to 23) %h : hour (00 to 12) %I : hour (00 to 12) %i : minutes (00 to 59) %j : day of the year (001 to 366) %k : hour (0 to 23) %l : hour (1 to 12) %M : month name in full (January to December) %m : month name as a numeric value (01 to 12) %p : aM or PM %r : time in 12 hour AM or PM format (hh:mm:ss AM/PM) %S : seconds (00 to 59) %s : seconds (00 to 59) %T : time in 24 hour format (hh:mm:ss) %U : week where Sunday is the first day of the week (00 to 53) %u : week where Monday is the first day of the week (00 to 53) %V : week where Sunday is the first day of the week (01 to 53). Used with %X %v : week where Monday is the first day of the week (01 to 53). Used with %X %W : weekday name in full (Sunday to Saturday) %w : day of the week where Sunday=0 and Saturday=6 %X : year for the week where Sunday is the first day of the week. Used with %V %x : year for the week where Monday is the first day of the week. Used with %V %Y : year as a numeric, 4-digit value %y : year as a numeric, 2-digit value Case: CASE [logic] (THEN [value])+ (ELSE [value]) END ○ examples -- AVG > SELECT AVG(p.biological_age) FROM patient p; 29.3333 -- ----------- -- COUNT > SELECT COUNT(p.name) FROM patient p; 9 -- ----------- -- MIN/MAX > SELECT MIN(p.biological_age) FROM patient p; 16 -- ----------- -- SUM > SELECT SUM(p.biological_age) FROM patient p; 264 -- ----------- -- ROUND > SELECT ROUND(AVG(p.biological_age), 0) FROM patient p; 29 -- ----------- -- STR_TO_DATE > SELECT SUM(p.biological_age) FROM patient p; "Alice" "William Dickes" -- ----------- -- CASE logic > SELECT name, CASE WHEN biological_age > 35 THEN "old" ELSE "young" END AS "Youth" FROM patient; +----------------+-------+ | name | Youth | +----------------+-------+ | Alice | young | | Kaffka Tibor | old | | Mia | young | | Mumus | old | | Sam Lowry | old | | Sitka | young | | Tinta Vince | young | | Tyler Durden | young | | William Dickes | old | +----------------+-------+ -- ----------- -- New domain specific database/user > CREATE DATABASE spring; > CREATE USER 'spring'@'localhost' IDENTIFIED BY 'passwd'; > GRANT ALL PRIVILEGES ON spring.* TO 'spring'@'localhost'; > FLUSH PRIVILEGES; COMPLEX_EXAMPLES: -- Using the ANSII concatenation operator > SET sql_mode='PIPES_AS_CONCAT'; > SELECT p.name || " is insame." AS Fact FROM patient p ORDER BY RAND() LIMIT 1; +------------------+ | Fact | +------------------+ | Alice is insame. | +------------------+ -- Getting average of literals > SELECT SUM(t.s)/COUNT(t.s) FROM (SELECT 10 AS s UNION SELECT 90 AS s UNION SELECT 42 AS s) t; > SELECT AVG(t.s) FROM (SELECT 10 AS s UNION SELECT 90 AS s UNION SELECT 42 AS s) t; -- Getting the name of the youngest > SELECT p.name FROM (SELECT * FROM patient ORDER BY biological_age LIMIT 1) as p; -- OR > SELECT p.name FROM patient p WHERE biological_age = (SELECT MIN(i.biological_age) FROM patient i); "Tinta Vince" -- Getting the minimum and maximum biological age from patients -- who entered durring the XXI. century > SELECT MIN(p.biological_age) AS "Min age", MAX(p.biological_age) AS "Max age" FROM patient p WHERE YEAR(p.entered) BETWEEN 2001 AND 2100; | Min age | Max age | | 16 | 41 | -- Getting average age of floor 1 > SELECT AVG(p.biological_age) "Average of floor 1" FROM (SELECT patient.biological_age, room.floor FROM patient INNER JOIN room ON patient.room = room.num) p WHERE p.floor = 1; 26.7500 -- Getting average age by floor > SELECT AVG(p.biological_age) "Average age", p.floor FROM (SELECT patient.biological_age, room.floor FROM patient INNER JOIN room ON patient.room = room.num) p GROUP BY p.floor; 26.7500 | 1 31.4000 | 2 -- Getting a list of the psychopaths > SELECT p.name FROM patient p INNER JOIN illness_type t ON p.illness_type = t.id WHERE t.psychopaty = TRUE; "Tinta Vince" "Kaffka Tibor" "Sitka" "Tyler Durden" -- Getting the number of occuped rooms > SELECT COUNT(DISTINCT p.room) FROM patient p; 6 ~~~~~~~~~~~~~~~~~~~~ EMBEDDED SQL ~~~~~~~~~~~~~~~~~~~~ " \ When SQL was originally defined (at a time when NIH-PA Author \ Manuscript the concept of triggers did not exist) it lacked \ procedural constructs such as branching, loops and subroutines; \ it was intended to be embedded in a host language that had \ these features. " - Guidelines for the effective use of entity–attribute–value modeling for biomedical databases • also made by IBM • the idea is to place literal SQL statements to source files of compiled programming languages • a special preprocessor generates regular source code with library calls and macros substituted to the place of the SQL statements Support: DB2: dclgen • "IBM Database 2" • embedding is part of the toolkit ○ languages • C/C++ • FORTRAN • COBOL • REXX DBMS: • no info on how the tool chain is implemented ○ languages • C/C++ • COBOL • MUMPS PostgreSQL: ecpg • seems to be the only FOSS implementation in existence ○ languages • C/C++ Pro*C: • for Oracle Database Extensions: .ec(pp) : C source with embedded SQL; refers to ESQL/C(++) .sqc : C source with embedded SQL; combination of SQL and C .pgc : C source with embedded SQL; PostgreSQL specific Syntax: <start_token> <statement> <end_token> <start_token> - EXEC SQL <statement> - spiced SQL statement; see BELOW <end_token> - ';' not inside a string literal Special_Statements: CONNECT TO <db> AS <connection>; DISCONNECT <con> <connection> : one specific connection ALL : every connection Variables: :<name> • native language variable interfering with sql Example: $ psql -U anon -d testdb psql (15.2) Type "help" for help. testdb=> \d test Table "public.test" Column | Type | Collation | Nullable | Default --------+--------------+-----------+----------+--------- i | integer | | | text | character(4) | | | testdb=> select * from test; i | text ---+------ 1 | aaaa 3 | bbbb (2 rows) // @BAKE ecpg $@ // embeded_test.sqc #include <stdio.h> signed main(){ EXEC SQL BEGIN DECLARE SECTION; int i; char c[4]; EXEC SQL END DECLARE SECTION; EXEC SQL CONNECT TO testdb AS myconnection; EXEC SQL SELECT i, text INTO :i, :c FROM test WHERE i = 1; puts(c); return i; } -------------- SQLgen -------------- • "SQL GENerator" • 90's • library to ease creating graphical clients for databases • metadata itself is specified as SQL { User A | +--V---- Client Application -----------------+ | +------------------+| | application specific | SQLgen metadata || | objects | | || | <-----> libSQLgen || | +--------A---------+| +---------------------------------|----------+ | | +---------V----------+ | Database server | +--------------------+ }

nosql

#define nosql:: \ I----------------------------------------------------------\ I----------------------------------------------------------\ I----------------------------------------------------------\ I /$$ /$$ /$$$$$$ /$$$$$$ /$$ \ I | $$$ | $$ /$$__ $$ /$$__ $$| $$ \ I | $$$$| $$ /$$$$$$ | $$ \__/| $$ \ $$| $$ \ I | $$ $$ $$ /$$__ $$| $$$$$$ | $$ | $$| $$ \ I | $$ $$$$| $$ \ $$ \____ $$| $$ | $$| $$ \ I | $$\ $$$| $$ | $$ /$$ \ $$| $$/$$ $$| $$ \ I | $$ \ $$| $$$$$$/| $$$$$$/| $$$$$$/| $$$$$$$$ \ I |__/ \__/ \______/ \______/ \____ $$$|________/ \ I \__/ \ I----------------------------------------------------------\ I----------------------------------------------------------\ I----------------------------------------------------------I • umbrella term for databases which do no support SQL, but are employed in similar environments • primarily vertical databases created by developers fed-up with EAV • real use-case • bad implementations Mongodb: • nudev solution for data management • uses OO javascript • each record is stored as JSON • no tables, only collections which are arbitrary clusters of records • not recommended Programs: mongodb : daemon mongo : shell Language: • based on Js use <database> : select the current database to <databases>; if no such databse exists, its created namespace db <database> insertMany(list) find(object)