databases
#define databases: \
I--------------------------------------------------------------------------------------------------\
I--------------------------------------------------------------------------------------------------\
I--------------------------------------------------------------------------------------------------\
I /$$$$$$$ /$$ /$$ \
I | $$__ $$ | $$ | $$ \
I | $$ \ $$ /$$$$$$ /$$$$$$ /$$$$$$ | $$$$$$$ /$$$$$$ /$$$$$$$ /$$$$$$ /$$$$$$$ \
I | $$ | $$ |____ $$|_ $$_/ |____ $$| $$__ $$ |____ $$ /$$_____/ /$$__ $$ /$$_____/ \
I | $$ | $$ /$$$$$$$ | $$ /$$$$$$$| $$ \ $$ /$$$$$$$| $$$$$$ | $$$$$$$$| $$$$$$ \
I | $$ | $$ /$$__ $$ | $$ /$$ /$$__ $$| $$ | $$ /$$__ $$ \____ $$| $$_____/ \____ $$ \
I | $$$$$$$/| $$$$$$$ | $$$$/| $$$$$$$| $$$$$$$/| $$$$$$$ /$$$$$$$/| $$$$$$$ /$$$$$$$/ \
I |_______/ \_______/ \___/ \_______/|_______/ \_______/|_______/ \_______/|_______/ \
I--------------------------------------------------------------------------------------------------\
I--------------------------------------------------------------------------------------------------\
I--------------------------------------------------------------------------------------------------\
I--------------------------------------------------------------------------------------------------I
• see more AT "/SQL"
• most database engines have adopted a client-server architecture,
which is the right call in business settings,
however databases are common in standalone desktop applications too
where it makes very little sense;
please question what you are doing before shooting yourself in the foot
without ever even realizing there is a less painful way;
as of 2025 your only option is sqlite
○ types
• tree {filesystems}
• graph
• relational {sql}
• nosql (non-relational) {JSON databases}
• most things here implicitly apply to relational databases;
because that's by far the most widely used database model
Table: Table:
• structure of arrays
• a predefined list of attributes and entries
• table names are usually placed to plural {"users", "tickets"}
• technically just a bunch of struct-s (with some implementation details),
but its generally better to imagine as a vertical list
• a record is an entry inside a table
• using a terminology for those familiar with OOP,
a table is like a class,
a record is like an instance
{
Table "users"
| ID | Name | Email |
+----+-----------+--------------------+
• 0 emil admin@chud.cyou
• 1 xolatile xolatile@chud.cyou
}
Dependency: Dependency:
• when a value can be inferred from another value
• generally a sign of poor database design,
however rarely, its deliberate to address performance concerns
{
| user | domain | top_level_domain |
• anon chud cyou
| email | user | domain | top_level_domain |
• anon@chud.cyou anon chud cyou
}
Functional:
• 2 attribute's values have a one-to-one correspondence
• "egyértelmű hozzárendelés"^HU
Transitive:
• when a dependency has a dependency
• indirect functional dependency
{ A->B and B->C
therefor A->C
}
Relationships:
• a record expressing belonging to another record
(which could be from a different table)
• some pointer will have to be stored,
this could be simply an ID number
or a managed pointer (called a foreign key),
depending on the database and application in question
one_to_one:
• one record belongs to exactly one other record
/* Assuming every user has to register with an email,
* and one email can be used only for one registration.
*/
Table "users" Table "emails"
| Username | Email | | ID | Name | Domain |
+----------+-------+ +----+------+------------+
> emil 1 ────▶ 1 admin chud.cyou
one_to_many:
• the way to express lists purely with relationships
• multiple records will be pointing to the same record
Table "comments" Table "posts"
| ID | Text | Post | | ID | Text |
> 1 a... 1 ──┬─▶ > 1 ...
> 2 b... 1 ──┤
> 3 c... 1 ──┘
many_to_many:
• lists belong to lists
• a pivot table is a table expressing only relationships
• a pivot table is required to express many-to-many relationships
in relational databases
• pivot tables are usually not displayed when creating database graphs
• the name of a pivot table is usually the combination of the tables it joins
"students" "studentsCourses" "courses"
| ID | | student | course | | ID |
> 1 1 1 > 1
> 2 1 2 > 2
> 3 2 3 > 3
3 3
/* The arrows are omitted to prevent confusion.
* The way this can be read is:
* > student one took course one and two
* > course three was taken by student two and three
* Notice how the order it is put is irrelevant,
* there is no inherent one way or other.
*/
zero_or_one_to_one:
• "0/1-1"
• pattern
• null-ed foreign key dependent on another field
Key: Key:
• a key is a set of attributes by which a record is uniquely identifiable
• one or more columns of the original data set or an added id (num) row
Superkey:
• the set of attribute sets which can uniquely identify a record
• cannot contain attributes with possible NULL values
• the element of the super-key consisting of the most attributes
is called the trivial superkey
• all attributes excluded from the super-key form a candidate key
Candidate_key:
• all elements of the super-key which have no real subsets in the super-key
Primary_key:
• conventionally, when a table is displayed,
the primary key is always the first column
• many modern database engines always add an implicit primary if one was not specified
• a freely chosen candidate key or a newly added id row which
represents a freely chosen candidate key
• a natural key is a candidate key chosen to be the primary key
• an artifical key is a newly given, enum-erating primary key
— adding an id row is prefered when:
• the chosen candidate key consists of multiple attributes
• if the candidate key is an attribute holding complex data
(long strings which are relatively slow to read)
• id rows are most often incremental, ie. each new elements
id is the previous ones + 1
View:
• virtual table
— simplifies things:
• queries become shorter
• joins become transparent
• allows for row oriented security
Normal_forms: Normal_forms:
0.:
• anything that is not in 1st normal form
1.:
• primary keys are present
• every field is atomic ie. at no point will have to be split into multiple
parts by a user
• normalizing a data structure from 0th normal form strictly
to 1st normal form doesnt result in any new tables
being created
// People table; we know that parts of the address will be handled independently
Name | Age | Address |
Anon | 18 | 1204 Budapest, Nagysándor József utca 93 |
[...]
// Therefor we organize the individual parts into different columns
Name | Age | Post code | City | Address | Address number |
Anon | 18 | 1204 | Budapest | Nagysándor József utca | 93 |
[...]
2.:
• auto when the primary key is a single attribute
• functional dependencies of the primary key are eliminated
3.:
• all candidate key related functional dependencies are eliminated
3.5. (Boyce-Codd):
• all functional dependencies are eliminated
• a table in 3 normal form without multiple overlapping candidate
keys is always also in 3.5. normal form
• what you should generally aim for
4.:
pass
5.:
pass
Object_diagramm: Object_diagramm:
Attribute:
.--------------.
/ \
( [name] )
\ /
^--------------^
Multi_value_attribute:
• list/set/etc.
.--------------.
/ .------------. \
( ( [name] ) )
\ ^------------^ /
^--------------^
Key:
.--------------.
/ \
( đ[name]đ )
\ /
^--------------^
Partial_key:
.--------------.
/ \
( [name] )
\ ‾ ‾ ‾ ‾ /
^--------------^
struct:
• attribute made up from multiple attributes
.--------------. .--------------.
/ \ / \
( [att1] )──────────( [att2] )
\ / \ /
^--------------^ ^--------------^
Entity:
• connects attributes
.--------------. +----------------+ .--------------.
/ \ | | / \
( [att1] )──────────| [name] |──────────( [att2] )
\ / | | \ /
^--------------^ +----------------+ ^--------------^
│
│
.--------------.
/ \
( [att2] )
\ /
^--------------^
Relation:
• always connected to atleast 2 structs with continuous lines
• the num of attributes in a relation is called its degree
_-^-_
.--------------. _-^ ^-_ .--------------.
/ \ _-^ ^-_ / \
( [att1] )──────────< [name] >──────────( [att2] )
\ / ^-_ _-^ \ /
^--------------^ ^-_ _-^ ^--------------^
^-_-^
Vertical_database: Vertical_database:
• used for storing highly heterogeneous and or dynamic data
• common is scientific research and electronic patient record systems
— where you want a database, but have little to no clue about the schema:
• the data-safety is a must
• you wish to have external support from a big company
• new things keep popping up
EAV: EAV:
https://www.cedanet.com.au/antipatterns/eav.php
https://www.red-gate.com/simple-talk/opinion/opinion-pieces/bad-carma/
"\
In other words, EAV gives you enough rope to hang yourself and \
in this industry, things should be designed to the lowest level \
of complexity because the guy replacing you on the project will \
likely be an idiot. \
"
• "Entity-Attribute-Value"/"Object-Attribute-Value"
• "implementing a database inside a database"
• "column to row transformation" (called pivoting)
• pattern
• vertical databases implemented in a relational database system
• widely frowned upon
• no FOSS implementations at all
• under EAV most queries will require procedural code
• an abstraction layer must be provided to hide the complexity
• a query generator will have to be implemented
• comes up when the requirements are inadequate to create an application
(due to poor communication or otherwise)
• is a relational database engine
• adding new types shifts into an administrative responsibility
• hard to maintain
• hard to comprehend
• slow
• requires much work up-front
• administrators are usually not very good at adding new types
— by-passes many builtin relational database features:
• constraints
• disk optimization
• query optimization
• views
• caching
• time-sharing
• trigger based validation
entity:
• all entities are stored in the same table
• the non-descriptive name "entity" is used on purpose,
because it marks anything from living beings
to experiment findings
• all entities reference a metadata (see BELOW)
• multiple entity entries form a single coherent record
CREATE TABLE entity (
id INT PRIMARY KEY,
attibute INT FOREIGN KEY(metadata),
value string
);
• optionally the entity could be a foreign key itself,
referencing a conventional table
{there is a users table with their rudimentary personal data,
but their other "belongings" are stored in EAV
}
attribute:
• metadata
• stores the type of an entity
• no standard way to implement it
— rough approximation of columns:
• validation {type; range; regex}
• GUI presentation
• normal value (in a clinical sense)
• grouping (parent/sibling relationship for something like a form)
• description
• keywords (for searching)
value:
• the concrete data
• unlike the rest, it would be stored in a field normally too
• the simplest implementation is to use a string
• could come with an extra field,
which marks what table the value is a foreign key to;
ie. you could have a value table for each SQL datatype,
and the value column reference one of them
EAV_CR:
PMID: 10579606
• "EAV with Classes and Relationships"
• the OOP of EAV
• allows for nesting
• the Value becomes the foreign key of a class table
sql
#define sql:: \
I------------------------------------------------------\
I------------------------------------------------------\
I------------------------------------------------------\
I /$$$$$$ /$$$$$$ /$$ \
I /$$__ $$ /$$__ $$| $$ \
I | $$ \__/| $$ \ $$| $$ \
I | $$$$$$ | $$ | $$| $$ \
I \____ $$| $$ | $$| $$ \
I /$$ \ $$| $$/$$ $$| $$ \
I | $$$$$$/| $$$$$$/| $$$$$$$$ \
I \______/ \____ $$$|________/ \
I \__/ \
I------------------------------------------------------\
I------------------------------------------------------\
I------------------------------------------------------I
• "Structured Query Language"
• database management language
• initially developed by IBM
• ANSI standard
• attempted to be usable by businessmen, hence the aspiration to resemble natural language;
failed miserably as such aspirations repeatedly do
• structurally and logically good, the language semantics are highly questionable,
but well legible given practice
• SQL is split into multiple "component" languages;
these are very similar in logic and syntax,
but are concerned with different aspects of database management;
each is detailed seperately
Example_Data: Example_Data:
• the following describes an insane asylum
• in this hypothetical place patients dont age after entry
+====================================================================+
| Patient |
+====================================================================+
| name | biological_age | room | entered | illness_type |
+====================================================================+
| Alice | 17 | 2 | 1866-07-01 | 10 |
| Kaffka Tibor | 41 | 18 | 2020-04-05 | 6 |
| Mia | 21 | 2 | 2132-02-17 | 2 |
| Mumus | 45 | 2 | 2120-09-30 | 2 |
| Sam Lowry | 38 | 22 | 1985-08-30 | 2 |
| Sitka | 19 | 24 | 2019-09-04 | 7 |
| Tinta Vince | 16 | 23 | 2006-01-06 | 1 |
| Tyler Durden | 24 | 2 | 1999-10-15 | 9 |
| William Dickes | 43 | 11 | 1884-04-04 | 3 |
+----------------+----------------+------+------------+--------------+
+===============================================================+
I Illness States I
+===============================================================+
I psychopaty | delusion | obsession | ADHD | schizophrenia | id I
+===============================================================+
| 1 | 0 | 0 | 0 | 0 | 1 |
| 0 | 1 | 0 | 0 | 0 | 2 |
| 0 | 0 | 1 | 0 | 0 | 3 |
| 0 | 0 | 0 | 1 | 0 | 4 |
| 0 | 0 | 0 | 0 | 1 | 5 |
| 1 | 1 | 0 | 0 | 0 | 6 |
| 1 | 0 | 0 | 1 | 0 | 7 |
| 1 | 1 | 0 | 1 | 0 | 8 |
| 1 | 0 | 1 | 0 | 1 | 9 |
| 0 | 1 | 0 | 0 | 1 | 10 |
+------------+----------+-----------+------+---------------+----+
+====================+
| Room |
+====================+
| num | floor | beds |
+====================+
| 2 | 1 | 4 |
| 11 | 2 | 3 |
| 18 | 2 | 2 |
| 22 | 2 | 1 |
| 23 | 2 | 1 |
| 24 | 2 | 1 |
| 25 | 2 | 1 |
+-----+-------+------+
○ you may reproduce it with the following (in MySQL)
{ CREATE DATABASE Asylum;
CREATE TABLE patient (name VARCHAR(128) DEFAULT "N/A" NOT NULL,
biological_age INT NOT NULL,
room INT,
entered DATE DEFAULT CURRENT_TIMESTAMP NOT NULL,
illness_type INT NOT NULL);
CREATE TABLE room (num INT not NULL,
floor INT DEFAULT 1 NOT NULL,
beds INT DEFAULT 1);
CREATE TABLE illness_type (psychopaty BOOLEAN DEFAULT FALSE,
delusion BOOLEAN DEFAULT FALSE,
obsession BOOLEAN DEFAULT FALSE,
ADHD BOOLEAN DEFAULT FALSE,
schizophrenia BOOLEAN DEFAULT FALSE);
ALTER TABLE patient ADD PRIMARY KEY(name);
ALTER TABLE room ADD PRIMARY KEY(num);
ALTER TABLE illness_type ADD id INT AUTO_INCREMENT PRIMARY KEY;
INSERT INTO room (num, floor, beds) VALUES (2, 1, 4),
(11, 2, 3),
(18, 2, 2),
(22, 2, 1),
(23, 2, 1),
(24, 2, 1),
(25, 2, 1);
INSERT INTO illness_type (psychopaty, delusion, obsession, ADHD, schizophrenia)
VALUES (TRUE, FALSE, FALSE, FALSE, FALSE),
(FALSE, TRUE, FALSE, FALSE, FALSE),
(FALSE, FALSE, TRUE, FALSE, FALSE),
(FALSE, FALSE, FALSE, TRUE, FALSE),
(FALSE, FALSE, FALSE, FALSE, TRUE),
(TRUE, TRUE, FALSE, FALSE, FALSE),
(TRUE, FALSE, FALSE, TRUE, FALSE),
(TRUE, TRUE, FALSE, TRUE, FALSE),
(TRUE, FALSE, TRUE, FALSE, TRUE),
(FALSE, TRUE, FALSE, FALSE, TRUE);
INSERT INTO patient (name, biological_age, room, entered, illness_type)
VALUES ("William Dickes", 43, 11, DATE("1884-04-04"), 3),
("Tinta Vince", 16, 23, DATE("2006-01-06"), 1),
("Mumus", 45, 2, DATE("2120-09-30"), 2),
("Kaffka Tibor", 41, 18, DATE("2020-04-05"), 6),
("Alice", 17, 2, DATE("1866-07-01"), 10),
("Sitka", 19, 24, DATE("2019-09-04"), 7),
("Tyler Durden", 24, 2, DATE("1999-10-15"), 9),
("Mia",21, 2, DATE("2132-02-17"), 2),
("Sam Lowry", 38, 22, DATE("1985-08-30"), 2);
ALTER TABLE patient ADD CONSTRAINT FOREIGN KEY (illness_type) REFERENCES illness_type(id);
ALTER TABLE patient ADD CONSTRAINT FOREIGN KEY (room) REFERENCES room(num);
PROGRAMS: PROGRAMS:
----------------------
sqlite: sqlite:
----------------------
• distributed model
• reads/writes directly to memory
• can use (single) files, runtime memory or be embedded into the executable
sqlite3 [options] [FILE] [sql] : interactive sqlite shell
Common_commands:
• these are special directives for the sqlite shell
.shell [...] : access the system shell
.schema [table] : get the CREATE command printed for a table
.exit
.quit
"C/3th party libraries/sqlite3"
--------------------------
postgresql: postgresql:
--------------------------
• server-client model
psql [options] -U <username> -d <database> : interactive postgresql shell
Commands:
\d <table> : describe <table>
---------------------
mysql: mysql:
---------------------
• server-client model
• running mariadb
mysql [...]
Set_up:
. ### Running Mariadb ###
1. Install the mariadb package
2. Install mariadb
mariadb-install-db --user=mysql --basedir=/usr --datadir=/var/lib/mysql
3. Run the daemon
systemctl start mariadb
#
### Using MySQL as a normal user ###
• since MySQL uses an internal system for identifying users a few hoops must be jumped
• root is always preregistered
1. Log in as root
$ sudo mysql
2. Create a new user
CREATE USER '<your_name>'@'localhost' IDENTIFIED BY '<password>';
• <password> doesnt have to be the same as the user's system login password
3. Grant privileges to the new user
GRANT ALL PRIVILEGES ON *.* TO '<your_name>'@'localhost';
FLUSH PRIVILEGES;
• this grants all privileges, which may not be a very bright idea on a shared system;
granting partial privileges is recommended and detailed BELOW
4. Exit
exit
5. Log in as yourself
mysql --user=<your_username> -p
— the following alias is recommended (see AT "/Bash/Builtins/alias")
alias mysql="mysql --user=${USER} -p"
#
mysql [options] : starts (interactive) sql client
--host=[domain] : sets host
--port=<int> : sets port
--user=<string> : specifies username to use
--password=<string> : specifies pass-word to use
— p : prompt for pass-word; recommended over "--password" (see ABOVE)
GNU Readline library : used for all input operations (see AT "C++/Readline\/History")
Files:
.mysql_history : contains the list of previously interactively entered lines
Environment_variabes:
${EDITOR} : read by the "edit" command (see BELOW)
${MYSQL_PS1} : prompt string 1; the string which is displayed each time before the interpreter allows for input
Commands:
edit || \e : opens a temporary file with the users set ${EDITOR}; uppon reaching the
next ';' the text from this temporary file is run as a script
source [file] : run [file] as a script
use [database] : selects [database] as the database to operate on
exit : terminate normally
mycli [options] : mysql wrapper with auto completion and syntax highlighting
flag -p is a pitfall, see BELOW
— p <string> : pass authentication password as <string>;
if ommited, securely asking on stdin is the default
— u <string> : pass user to use as <string>; the default is mysql
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Other_implementations: Other_implementations:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+---------------------------+---------------------------+
| Parent Company | Database |
+---------------------------+---------------------------+
| Microsoft Corporation | Microsoft SQL Server |
| Oracle Corporation | Oracle Database |
| IBM | IBM Db2 |
| Amazon Web Services (AWS) | Amazon Aurora |
| Google | Google Cloud Spanner |
+---------------------------+---------------------------+
FILES: FILES:
.sql : SQL script
.pls : PL/SQL script
.pks : SQL package
.pkb : SQL package body
.tab : SQL script for table creation
.tbl : SQL script for table creation
COMMENTS: COMMENTS:
-- <string> : single line comment
SEMICOLONS: SEMICOLONS:
• all sql statements are terminated by a single semicolon ';'
• sql statements are not run until a semicolon is encountered
• interpreter commands (see ABOVE) however dont take semicolons {exit}
typenames: typenames:
Logical:
BIT
BOOLEAN
Numeric:
Integer:
+-----------+---------+--------------------+----------------+
| Name | Storage | Signed Range | Unsigned Range |
+-----------+---------+--------------------+----------------+
| TINYINT | 1 | -128, 127 | 0, 255 |
| SMALLINT | 2 | -32768, 32767 | 0, 65535 |
| MEDIUMINT | 3 | -8388608, 8388607 | 0, 16777215 |
| INTEGER | 4 | -2^31, 2^31-1 | 0, 2^32-1 |
| BIGINT | 8 | -2^63, 2^63-1 | 0, 2^64-1 |
+-----------+---------+--------------------+----------------+
DECIMAL
NUMERIC
FLOAT
REAL
DOUBLE PERCISION
Time:
DATE
TIME
DATETIME
TIMESTAMP
YEAR
— zero values:
+------------+---------------------+
| Data Type | “Zero” Value |
+------------+---------------------+
| DATE | 0000-00-00 |
| TIME | 00:00:00 |
| DATETIME | 0000-00-00 00:00:00 |
| TIMESTAMP | 0000-00-00 00:00:00 |
| YEAR | 0000 |
+------------+---------------------+
String:
CHAR(<int>) : <int> long char array; uninitialized indexes are filled in with a specific value (usually spaces);
used when the data is known to have a consistent length
VARCHAR(<int>) : "VARiable CHAR"; <int> long char array; uninitialized indexes are ignored, however preserved regardless
BINARY
VARBINARY
BLOB
TEXT
ENUM
SET
OPERATORS: OPERATORS:
General:
Logical:
[column] = [value] : equals; SQL uses a single '=' instead of 2 for
comparisons oppose to most computer languages;
however its only logical as it doesnt have an assignment operator
[column] != [value] : not equals
[column] > [value] : greater than
[column] < [value] : lesser than
[column] >= [value] : greater than or equals to
[column] <= [value] : lesser than or equals to
[column] BETWEEN [value-1] AND [value-2]
NOT [bool]
[bool-1] AND [bool-2]
[bool-1] OR [bool-2]
[string-1] LIKE [string-2] : regex match
[regex]:
% : match any number of chars
_ : match exactly one char
Numeric:
—
+
*
/
% MOD
String:
|| : concatenation; support is disabled by default; XXX: SET sql_mode='PIPES_AS_CONCAT';
IN:
pass
○ examples:
-- LIKE; select patients with 'y' char in their name;
-- NOTE: normally either LOWER() or UPPER would be
-- called for humanly logical results,
-- however the example tries to keep it simple
> SELECT p.name FROM patient p WHERE p.name LIKE '%y%';
-- -----------
-- --- IN ---
> DELETE FROM patient WHERE biological_age IN (SELECT biological_age FROM patient p WHERE p.biological_age > 40);
DDL: DDL:
• "Data Definition Language"
CREATE [object] : creates [object]
DATABASE <string> : database named <string>
TABLE <string> ([decl-1(, [decl-num])*]) : table named <string> with columns defined by [decl-uint]
[decl]:
<string> <typename>( DEFAULT [value]( NOT NULL))
[value]:
<int>
"<string>"
CURRENT_TIMESTAMP
DML: DML:
• "Data Manipulation Language"
DELETE FROM [table] ([where]) : not specifying [where] will select (ie. delete) all rows
DROP [object] <string>
INSERT INTO [table] ([columns]+) VALUES ([value]+)
UPDATE [table] SET ([column] = [value])+ ([where])
ALTER TABLE <string> [cmd]
ADD [?!]
[column]
PRIMARY KEY([column name])
CONSTRAINT [constraint]
[constraint]
<string> [type]
[type]
[column_type] [column]+
PRIMARY KEY
UNIQUE
[check_type]
CHECK([logic])
FOREIGN KEY [column] REFERENCES [column] [on_delete]
[on_delete]:
• the action to perform when the referenced column is delete-d
ON DELETE [wat_do]
[wat_do]
CASCADE : remove children too
DROP [column]
MODIFY [column]
RENAME COLUMN [string-1] TO [string-2]
[var] = [value]
AUTO_INCREMENT
// --- CHECK constraint ---
> ALTER TABLE room ADD CONSTRAINT floor_num CHECK(floor BETWEEN -2 AND 100);
DQL: DQL:
• "Data Query Langauge"
SHOW [object]
DATABASES
TABLES
COLUMNS FROM [table]
DESCRIBE [table]
Selecttion:
[select] ([where] ([order]) ([limit]))
[select] ([having] ([order]) ([limit]))
[select]:
SELECT [subject] (AS <string>) FROM [table] (AS <string>) : returns information according to [subject] from inside [table];
see explanation of "AS" keywords BELOW as alias-ing
Aliasing:
[...] AS <string> : gives [...] a temporary name for the span of the query; column names given by
functions are not very human readable, therefor its good practice to always alias these
[subject]:
[column]
([rep]) *
[rep]:
ALL
DISTINCT
[where]:
WHERE [logic] : specifies criteria to be met by selected data; [logic] is a sane logical
expression composed of operator-s and data (see AT "../Operators")
[having]:
• as oppose to WHERE, its logic can contain aggregate functions
HAVING [logic]
[order]:
ORDER BY [column] [aspect]
DESC
ASC
RAND()
[limit]:
LIMIT <int> : limits the amount of records return-ed to a maximum of <int>
GROUP BY
View:
CREATE VIEW [name] AS [selection];
• a saved and named select query
• '*' expansion causes new columns to show up
○ examples
{ -- Most basic querry
> SELECT * FROM patient;
+----------------+----------------+------+------------+--------------+
| name | biological_age | room | entered | illness_type |
+----------------+----------------+------+------------+--------------+
| Alice | 17 | 2 | 1866-07-01 | 10 |
| Kaffka Tibor | 41 | 18 | 2020-04-05 | 6 |
| Mia | 21 | 2 | 2132-02-17 | 2 |
| Mumus | 45 | 2 | 2120-09-30 | 2 |
| Sam Lowry | 38 | 22 | 1985-08-30 | 2 |
| Sitka | 19 | 24 | 2019-09-04 | 7 |
| Tinta Vince | 16 | 23 | 2006-01-06 | 1 |
| Tyler Durden | 24 | 2 | 1999-10-15 | 9 |
| William Dickes | 43 | 11 | 1884-04-04 | 3 |
+----------------+----------------+------+------------+--------------+
-- -----------
-- DISTINT; list all rooms with someone living in it, discard duplicates
> SELECT DISTINCT p.room FROM patient p;
2
11
18
22
23
24
-- -----------
-- ORDER BY; list names in reverse alphabetical order
> SELECT p.name FROM patient p ORDER BY name DESC;
"William Dickes"
"Tyler Durden"
"Tinta Vince"
"Sitka"
"Sam Lowry"
"Mumus"
"Mia"
"Kaffka Tibor"
"Alice"
-- -------------
-- --- JOIN ---
-- INNER
> SELECT patient.name, room.floor FROM patient INNER JOIN room ON patient.room = room.num;
Alice | 1
Mia | 1
Mumus | 1
Tyler Durden | 1
William Dickes | 2
Kaffka Tibor | 2
Sam Lowry | 2
Tinta Vince | 2
Sitka | 2
-- -------------
-- VIEW
> CREATE VIEW allp AS SELECT * FROM patient;
> SELECT * FROM allp;
...
-- HAVING; listing the room which have more than one occupant
> SELECT num FROM (room LEFT JOIN patient ON room.num = patient.room) HAVING COUNT(room.num) > 1;
-- -----------
-- NESTED SELECT; people who are older than the average
> SELECT name FROM patient WHERE biological_age > (SELECT AVG(biological_age) FROM patient);
DCL: DCL:
• "Data Control Language"
• deals with users and user permissions
GRANT [what] TO [who]
[what]:
[role]
[privilege] [subject]
[privileges]:
ALL PRIVILEGES
SELECT
UPDATE
DELETE
[subject]:
ON [database].[table]
ANY TABLE
[who]:
[user]
[role]
CREATE USER '[string-1]'@'[string-2]' IDENTIFIED BY '[string-3]'
CREATE ROLE <string>
TCL: TCL:
• "Transaction Control Language"
• a transaction is group of safe SQL operations
• if any one of the operations fail, the state of the database
can be easily restored to the one before the transaction
BEGIN : start a transaction
SAVEPOINT <name> : establish a state within a transaction which
can also be rolled back to
COMMIT : finish and finalize the transaction
ROLLBACK (TO <savepoint>) : restore the state to before the transaction
or to a savepoint
Functions: Functions:
RAND()
NVL([value-1], [value-2]) : returns [value-1] unless it is NULL, in which case it returns [value-2]
COALESCE([...]) : returns the first argument which is not NULL
Column_oriented:
COUNT([column])
AVG([column])
Number_specific:
MIN([column])
MAX([column])
SUM([column])
Num_oriented:
ROUND([int-1], [int-2]) : returns [int-1] rounded to [int-2] nums of decimal places
TRUNC([int-1], [int-2]) : returns [int-1] truncated to [int-2] nums of decimal places; [int-2] is free to me negative
ABS([num]) : returns the absolute value of [num]
POWER([num], <int>) : returns [num] raised to the <int>th power
SQRT([num]) : returns the square root of [num]
Trigonometry:
SIN([num]) : returns the sine
COS([num]) : returns the cosine
TAN([num]) : returns the tangent
COT([num]) : returns the cotangent
ASIN([num]) : returns the arc sine
ACOS([num]) : returns the arc cosine
ATAN([num]) : returns the arc tangent
String_oriented:
ASCII([char]) : returns [char]s ascii code
UPPER(<string>) : returns <string> converted to all upper case
LOWER(<string>) : returns <string> converted to all lower case
INITCAP(<string>) : returns <string> with initials capitalized
LENGTH(<string>) : returns <string>s length; when operating on CHAR(<int>) the length is always equal to <int>
TRIM(<string>) : returns <string> with all preceding and leading whitespaces removed
TRIM([string-1] FROM [string-2]) : returns [string-2] with all preceding and leading chars contained in [string-1] removed
REPLACE([string-1], [string-2]) : returns [string-1] with all occurrences of [string-2] delete-d
REPLACE([string-1], [string-2], [string-3]) : returns [string-1] with all occurrences of [string-2] replaced with [string-3]
SUBSTR(<string>, [int-1], [int-2]) : returns [int-2] chars from <string>s [int-1]th positions;
[int-1] is 1 indexed and negative values are counted from the back
LPAD([string-1], <int>, [string-2]) : returns [string-1] cropped/left padded to <int> chars; [string-2] is the padding
RPAD([string-1], <int>, [string-2]) : returns [string-1] cropped/right padded to <int> chars; [string-2] is the padding
MD5(<string>) : returns md5 hash of <string>
Date_oriented:
STR_TO_DATE("<string>", "[format]") : returns DATE constructed from <string> which is of format [format]
DATE("<string>") : returns DATE constructed from <string> which is of format <string> is of the format "%Y-%m-%d"
EXTRACT([dpart] FROM [DATE]) : return [dpart] value from [DATE]
[dpart]([DATE]) : return [dpart] value from [DATE]
[dpart]:
YEAR
MONTH
DAY
HOUR
MINUTE
SECOND
[format]:
%a : abbreviated weekday name (Sun to Sat)
%b : abbreviated month name (Jan to Dec)
%c : numeric month name (0 to 12)
%D : day of the month as a numeric value, followed by suffix (1st, 2nd, 3rd, ...)
%d : day of the month as a numeric value (01 to 31)
%e : day of the month as a numeric value (0 to 31)
%f : microseconds (000000 to 999999)
%H : hour (00 to 23)
%h : hour (00 to 12)
%I : hour (00 to 12)
%i : minutes (00 to 59)
%j : day of the year (001 to 366)
%k : hour (0 to 23)
%l : hour (1 to 12)
%M : month name in full (January to December)
%m : month name as a numeric value (01 to 12)
%p : aM or PM
%r : time in 12 hour AM or PM format (hh:mm:ss AM/PM)
%S : seconds (00 to 59)
%s : seconds (00 to 59)
%T : time in 24 hour format (hh:mm:ss)
%U : week where Sunday is the first day of the week (00 to 53)
%u : week where Monday is the first day of the week (00 to 53)
%V : week where Sunday is the first day of the week (01 to 53). Used with %X
%v : week where Monday is the first day of the week (01 to 53). Used with %X
%W : weekday name in full (Sunday to Saturday)
%w : day of the week where Sunday=0 and Saturday=6
%X : year for the week where Sunday is the first day of the week. Used with %V
%x : year for the week where Monday is the first day of the week. Used with %V
%Y : year as a numeric, 4-digit value
%y : year as a numeric, 2-digit value
Case:
CASE [logic] (THEN [value])+ (ELSE [value]) END
○ examples
-- AVG
> SELECT AVG(p.biological_age) FROM patient p;
29.3333
-- -----------
-- COUNT
> SELECT COUNT(p.name) FROM patient p;
9
-- -----------
-- MIN/MAX
> SELECT MIN(p.biological_age) FROM patient p;
16
-- -----------
-- SUM
> SELECT SUM(p.biological_age) FROM patient p;
264
-- -----------
-- ROUND
> SELECT ROUND(AVG(p.biological_age), 0) FROM patient p;
29
-- -----------
-- STR_TO_DATE
> SELECT SUM(p.biological_age) FROM patient p;
"Alice"
"William Dickes"
-- -----------
-- CASE logic
> SELECT name, CASE WHEN biological_age > 35 THEN "old" ELSE "young" END AS "Youth" FROM patient;
+----------------+-------+
| name | Youth |
+----------------+-------+
| Alice | young |
| Kaffka Tibor | old |
| Mia | young |
| Mumus | old |
| Sam Lowry | old |
| Sitka | young |
| Tinta Vince | young |
| Tyler Durden | young |
| William Dickes | old |
+----------------+-------+
-- -----------
-- New domain specific database/user
> CREATE DATABASE spring;
> CREATE USER 'spring'@'localhost' IDENTIFIED BY 'passwd';
> GRANT ALL PRIVILEGES ON spring.* TO 'spring'@'localhost';
> FLUSH PRIVILEGES;
COMPLEX_EXAMPLES: COMPLEX_EXAMPLES:
-- Using the ANSII concatenation operator
> SET sql_mode='PIPES_AS_CONCAT';
> SELECT p.name || " is insame." AS Fact FROM patient p ORDER BY RAND() LIMIT 1;
+------------------+
| Fact |
+------------------+
| Alice is insame. |
+------------------+
-- Getting average of literals
> SELECT SUM(t.s)/COUNT(t.s) FROM (SELECT 10 AS s UNION
SELECT 90 AS s UNION
SELECT 42 AS s) t;
> SELECT AVG(t.s) FROM (SELECT 10 AS s UNION
SELECT 90 AS s UNION
SELECT 42 AS s) t;
-- Getting the name of the youngest
> SELECT p.name FROM (SELECT * FROM patient ORDER BY biological_age LIMIT 1) as p;
-- OR
> SELECT p.name FROM patient p WHERE biological_age = (SELECT MIN(i.biological_age) FROM patient i);
"Tinta Vince"
-- Getting the minimum and maximum biological age from patients
-- who entered durring the XXI. century
> SELECT MIN(p.biological_age) AS "Min age", MAX(p.biological_age) AS "Max age"
FROM patient p WHERE YEAR(p.entered) BETWEEN 2001 AND 2100;
| Min age | Max age |
| 16 | 41 |
-- Getting average age of floor 1
> SELECT AVG(p.biological_age) "Average of floor 1"
FROM (SELECT patient.biological_age, room.floor
FROM patient INNER JOIN room ON patient.room = room.num) p
WHERE p.floor = 1;
26.7500
-- Getting average age by floor
> SELECT AVG(p.biological_age) "Average age",
p.floor
FROM (SELECT patient.biological_age, room.floor
FROM patient INNER JOIN room ON patient.room = room.num) p
GROUP BY p.floor;
26.7500 | 1
31.4000 | 2
-- Getting a list of the psychopaths
> SELECT p.name FROM patient p
INNER JOIN illness_type t ON p.illness_type = t.id
WHERE t.psychopaty = TRUE;
"Tinta Vince"
"Kaffka Tibor"
"Sitka"
"Tyler Durden"
-- Getting the number of occuped rooms
> SELECT COUNT(DISTINCT p.room) FROM patient p;
6
~~~~~~~~~~~~~~~~~~~~
EMBEDDED SQL
~~~~~~~~~~~~~~~~~~~~
" \
When SQL was originally defined (at a time when NIH-PA Author \
Manuscript the concept of triggers did not exist) it lacked \
procedural constructs such as branching, loops and subroutines; \
it was intended to be embedded in a host language that had \
these features.
" - Guidelines for the effective use of entity–attribute–value modeling for biomedical databases
• also made by IBM
• the idea is to place literal SQL statements to source files
of compiled programming languages
• a special preprocessor generates regular source code with library calls
and macros substituted to the place of the SQL statements
Support:
DB2:
dclgen
• "IBM Database 2"
• embedding is part of the toolkit
○ languages
• C/C++
• FORTRAN
• COBOL
• REXX
DBMS:
• no info on how the tool chain is implemented
○ languages
• C/C++
• COBOL
• MUMPS
PostgreSQL:
ecpg
• seems to be the only FOSS implementation in existence
○ languages
• C/C++
Pro*C:
• for Oracle Database
Extensions:
.ec(pp) : C source with embedded SQL; refers to ESQL/C(++)
.sqc : C source with embedded SQL; combination of SQL and C
.pgc : C source with embedded SQL; PostgreSQL specific
Syntax:
<start_token> <statement> <end_token>
<start_token> - EXEC SQL
<statement> - spiced SQL statement; see BELOW
<end_token> - ';' not inside a string literal
Special_Statements:
CONNECT TO <db> AS <connection>;
DISCONNECT <con>
<connection> : one specific connection
ALL : every connection
Variables:
:<name>
• native language variable interfering with sql
Example:
$ psql -U anon -d testdb
psql (15.2)
Type "help" for help.
testdb=> \d test
Table "public.test"
Column | Type | Collation | Nullable | Default
--------+--------------+-----------+----------+---------
i | integer | | |
text | character(4) | | |
testdb=> select * from test;
i | text
---+------
1 | aaaa
3 | bbbb
(2 rows)
// @BAKE ecpg $@
// embeded_test.sqc
#include <stdio.h>
signed main(){
EXEC SQL BEGIN DECLARE SECTION;
int i;
char c[4];
EXEC SQL END DECLARE SECTION;
EXEC SQL CONNECT TO testdb AS myconnection;
EXEC SQL SELECT i, text INTO :i, :c FROM test WHERE i = 1;
puts(c);
return i;
}
--------------
SQLgen
--------------
• "SQL GENerator"
• 90's
• library to ease creating graphical clients for databases
• metadata itself is specified as SQL
{
User
A
|
+--V---- Client Application -----------------+
| +------------------+|
| application specific | SQLgen metadata ||
| objects | | ||
| <-----> libSQLgen ||
| +--------A---------+|
+---------------------------------|----------+
|
|
+---------V----------+
| Database server |
+--------------------+
}
nosql
#define nosql:: \
I----------------------------------------------------------\
I----------------------------------------------------------\
I----------------------------------------------------------\
I /$$ /$$ /$$$$$$ /$$$$$$ /$$ \
I | $$$ | $$ /$$__ $$ /$$__ $$| $$ \
I | $$$$| $$ /$$$$$$ | $$ \__/| $$ \ $$| $$ \
I | $$ $$ $$ /$$__ $$| $$$$$$ | $$ | $$| $$ \
I | $$ $$$$| $$ \ $$ \____ $$| $$ | $$| $$ \
I | $$\ $$$| $$ | $$ /$$ \ $$| $$/$$ $$| $$ \
I | $$ \ $$| $$$$$$/| $$$$$$/| $$$$$$/| $$$$$$$$ \
I |__/ \__/ \______/ \______/ \____ $$$|________/ \
I \__/ \
I----------------------------------------------------------\
I----------------------------------------------------------\
I----------------------------------------------------------I
• umbrella term for databases which do no support SQL,
but are employed in similar environments
• primarily vertical databases created by developers fed-up with EAV
• real use-case
• bad implementations
Mongodb:
• nudev solution for data management
• uses OO javascript
• each record is stored as JSON
• no tables, only collections which are arbitrary clusters of records
• not recommended
Programs:
mongodb : daemon
mongo : shell
Language:
• based on Js
use <database> : select the current database to <databases>;
if no such databse exists, its created
namespace db
<database>
insertMany(list)
find(object)