filetypes

#define filetypes: \ I-------------------------------------------------------------------------------------------------------\ I-------------------------------------------------------------------------------------------------------\ I-------------------------------------------------------------------------------------------------------\ I /$$$$$$$$ /$$ /$$ /$$$$$$$$ \ I | $$_____/|__/| $$ |__ $$__/ \ I | $$ /$$| $$ /$$$$$$ | $$ /$$ /$$ /$$$$$$ /$$$$$$ /$$$$$$$ \ I | $$$$$ | $$| $$ /$$__ $$ | $$| $$ | $$ /$$__ $$ /$$__ $$ /$$_____/ \ I | $$__/ | $$| $$| $$$$$$$$ | $$| $$ | $$| $$ \ $$| $$$$$$$$| $$$$$$ \ I | $$ | $$| $$| $$_____/ | $$| $$ | $$| $$ | $$| $$_____/ \____ $$ \ I | $$ | $$| $$| $$$$$$$ | $$| $$$$$$$| $$$$$$$/| $$$$$$$ /$$$$$$$/ \ I |__/ |__/|__/ \_______/ |__/ \____ $$| $$____/ \_______/|_______/ \ I /$$ | $$| $$ \ I | $$$$$$/| $$ \ I \______/ |__/ \ I-------------------------------------------------------------------------------------------------------\ I-------------------------------------------------------------------------------------------------------\ I-------------------------------------------------------------------------------------------------------I
"file formats"
EXTENSIONS:
    name.extension
    • the traditional way to identify filetypes
    • in a file name, everything after the last dot is the extension
    • extensions are usually abbreviations of file types
    {
        script.sh -> SHell script
    }
MIMETYPE:
    type/subtype
    • "media type" // "MIME" stands for "Multipurpose Internet Mail Extensions"
    • IANA defined file type classification system
    • most binary mimetypes have magic bytes associated with them, which go at the start of the file for identification purposes
    {
        text/plain
        video/mp4
        application/octet-stream
    }
Binary: pass
Executable:
    ELF: #include <elf.h>
        • "Executable and Linkable Format"
        • de facto standard on *nix
        • cross architecture
        — header:
            0x7F 'E' 'L' 'F'
            word size byte
        ○ extensions
            .elf .o .out .prx .puff .ko .mod .so
        Tools:
            readelf <flag>+ <file>+
    Fat_binary:
        • contains the machine code for multiple architectures
        • the header or an all-compatible jmp is responsible for selecting the right section for the current architecture
        • used for convenience of distribution
Plain_text:
    Value:
        { <value> }
        • the file contains only a single value
        • the "key" (identifier) it belongs to is deduced from the file name
        • UNIX-like systems heavily use this format for their virtual files
        { // $ bat /proc/sys/kernel/panic_on_oops
            ───────┬──────────────────────────────────────────
                   │ File: /proc/sys/kernel/panic_on_oops
            ───────┼──────────────────────────────────────────
               1   │ 0
            ───────┴──────────────────────────────────────────
        }
    List:
        { <value>(<separator><value>)+ }
        • some number of values separated by a special token
        • <separator> is most commonly a new line ('\n')
        { // Python requirements.txt file
            numpy
            pandas
            matplotlib
            requests
            scikit-learn
            tensorflow
            beautifulsoup4
            flask
            django
            torch
        }
    CSV: "Comma Separated Values"
        • used for storing spreadsheets and databases
        • 2 dimensional list files
        • all fields are separated by commas
        • the first line may store the column names (the <int>th field names the <int>th column)
        {
            [Field_name1](,[...])
            [Field_value1](,[...])
        }
    cfg:
        { ( <key>=<value> ) }
        • may or may not be whitespace sensitive around the '='
    ini:
        { ( <label> <assignment>+ )+ }
        ○ <label>
            [<string>]
            • signals in what context the assignments BELOW should be interpreted
            • serves as a sort of namespace
        ○ <assignment>
            <string-key> = <string-value>
        • eliminates redundancy which would be caused by key (name) prefixing
        • statements are context dependent -> harder and less flexible to parse
        # Qterminal ini file (partial)
        [General]
        AskOnExit=false
        fontSize=10
        [MainWindow]
        ApplicationTransparency=5
        pos=@Point(200 100)
        size=@Size(640 480)
    JSON: "JavaScript Object Notation"
        • serialization format designed for storing Javascript objects as strings
        • has no standard way to handle methods/inheritance
        • has no hard ties to Javascript
        • extremely versatile and flexible
        • adopted for many languages
        ○ extensions
            .json
        Values:
            • may be preceded and followed by white space
            { 3 }
            — string
                • "[...]"
            — number
                • int
                • float
                • scientific representation
            — bool
                • true
                • false
            — null
            — array
                • [[white_space][value](,[white_space][value](...))[white_space]]
            — object
                • key-value pairs
                • {[white_space]<string>[white_space]:[value](,[white_space]<string>[white_space]:[value](,[...]))}
    yaml: https://web.archive.org/web/20240309094004/https://ruudvanasseldonk.com/2023/01/11/the-yaml-document-from-hell
        "Yet Another Markup Language"/"Yaml Ain't Markup Language"
        • absolute cancer, TOTAL YAML DEATH (see AT the link ABOVE)
        ○ extensions
            .yml .yaml
        • superset of JSON (as of YAML 1.2)
        ○ special symbols
            #   : comment
            --- : document start
            ... : document end
        ○ supports 2 syntax styles
            — block:
                • indentation determines the document structure
                • tabs are not allowed
                • a single space (difference) is enough
                • list items are prefixed by "- "
            — flow:
                • guarantees JSON compatibility
Extension_reference_table:
    c    : C/C++ file (see AT "/C++/Files")
    cc   : C/C++ file (see AT "/C++/Files")
    C    : C/C++ file (see AT "/C++/Files")
    cs   : C# File (see AT "/C#/Files")
    cp   : C/C++ file (see AT "/C++/Files")
    cpp  : C/C++ file (see AT "/C++/Files")
    cxx  : C/C++ file (see AT "/C++/Files")
    c++  : C/C++ file (see AT "/C++/Files")
    i    : C/C++ file (see AT "/C++/Files")
    ii   : C/C++ file (see AT "/C++/Files")
    o    : C/C++ object file (see AT "/C++/Files")
    h    : C/C++ file (see AT "/C++/Files")
    hpp  : C/C++ file (see AT "/C++/Files")
    py   : Python script file (see AT "/Python")
    pyc  : Compiled Python script (see AT "/Python")
    sh   : Shell script file (see AT "/Bash")
    md   : Markdown file (see AT "/Documentation/Markdown")
    tex  : LaTeX document (see AT "/Latex")
    html : HTML file (see AT "/HTML")
    css  : CSS file (see AT "/CSS")
    js   : JavaScript script file (see AT "/JavaScript")
    raw  : Raw 3D Mesh; ascii plain text
    obj  : Wavefront 3D object; ascii plain text
    ply  : Stanford University polygon object file
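The ELF header noted earlier in this section (0x7F 'E' 'L' 'F' followed by a word size byte) can be checked with a few lines of Python. This is only a minimal sketch: the function name `describe_elf_ident` and the synthetic `sample` bytes are made up for illustration, and the byte after the word size byte is read as the data encoding (endianness) field, as laid out in `<elf.h>`.

```python
ELF_MAGIC = b"\x7fELF"  # 0x7F 'E' 'L' 'F'

def describe_elf_ident(data: bytes):
    """Return (word_size_bits, endianness), or None if the magic bytes do not match."""
    if len(data) < 6 or data[:4] != ELF_MAGIC:
        return None
    word_size = {1: 32, 2: 64}.get(data[4])             # EI_CLASS: the word size byte
    endianness = {1: "little", 2: "big"}.get(data[5])   # EI_DATA: byte order
    return word_size, endianness

# Synthetic 16-byte identification block (not a real executable):
# magic, 64-bit class, little-endian, version 1, then padding.
sample = ELF_MAGIC + bytes([2, 1, 1]) + bytes(9)

print(describe_elf_ident(sample))         # (64, 'little')
print(describe_elf_ident(b"plain text"))  # None
```

On a real system the first bytes of a file such as /bin/ls could be passed in instead of `sample`.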

metadata

#define metadata:: \ __ __ _ _ _ \ | \/ | | | | | | | \ | \ / | ___| |_ __ _ __| | __ _| |_ __ _ \ | |\/| |/ _ \ __/ _` |/ _` |/ _` | __/ _` | \ | | | | __/ || (_| | (_| | (_| | || (_| | \ |_| |_|\___|\__\__,_|\__,_|\__,_|\__\__,_| I
— get fucked buddy:
    • the tooling is terrible
    • arbitrary limitations that do not even fit the task
— seriously now, it should be this easy:
    • magic-byte + plaintext dictionary
    • reserve single chars of 7-bit ascii for special (encoded/tokenized) keys
    • non-alpha order is undefined
    • (perhaps) align pairs
The above should be reasonably fast and storage efficient, and would take an afternoon to implement, but big corpo chose sloppy self-deprecating headers instead for some reason? i dont get it.
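To make the "magic-byte + plaintext dictionary" idea concrete, here is one possible reading of it as a rough sketch. Everything in it is an assumption rather than an existing format: the `MAGIC` value, the `key=value` line layout, and the function names are hypothetical, and the reserved single-char keys and pair alignment from the list above are left out.

```python
MAGIC = b"\x01META\n"  # hypothetical magic bytes, not taken from any real format

def encode_metadata(fields: dict) -> bytes:
    """Serialize string pairs as sorted 'key=value' lines after the magic bytes."""
    for key, value in fields.items():
        if "=" in key or "\n" in key or "\n" in value:
            raise ValueError("keys may not contain '=' and nothing may contain a newline")
    lines = [f"{key}={fields[key]}" for key in sorted(fields)]  # alphabetical key order
    return MAGIC + "".join(line + "\n" for line in lines).encode("ascii")

def decode_metadata(blob: bytes) -> dict:
    """Parse a blob produced by encode_metadata back into a dictionary."""
    if not blob.startswith(MAGIC):
        raise ValueError("missing magic bytes")
    body = blob[len(MAGIC):].decode("ascii")
    return dict(line.split("=", 1) for line in body.splitlines() if line)

blob = encode_metadata({"title": "Some Song", "artist": "Some Band"})
print(blob)
print(decode_metadata(blob))
```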

id

#define id3::: \ ___ ___ ____ \ |_ _| \__ / \ | || || |_ \ \ |___|___/___/ I
• the support is shit
○ fields
    +---------------------+--------------------+
    | Field               | Bytes              |
    +---------------------+--------------------+
    | header              | 3                  |
    | title               | 30                 |
    | artist              | 30                 |
    | album               | 30                 |
    | year                | 4                  |
    | comment             | 28 (ID3v1.1) or 30 |
    | zero-byte (ID3v1.1) | 1                  |
    | track (ID3v1.1)     | 1                  |
    | genre               | 1                  |
    +---------------------+--------------------+
○ genre
    • stored as an int
    +---------+------------------------+
    | Number  | Genre                  |
    +---------+------------------------+
    | 00      | Blues                  |
    | 01      | Classic rock           |
    | 02      | Country                |
    | 03      | Dance                  |
    | 04      | Disco                  |
    | 05      | Funk                   |
    | 06      | Grunge                 |
    | 07      | Hip-Hop                |
    | 08      | Jazz                   |
    | 09      | Metal                  |
    | 10      | New Age                |
    | 11      | Oldies                 |
    | 12      | Other                  |
    | 13      | Pop                    |
    | 14      | Rhythm and Blues       |
    | 15      | Rap                    |
    | 16      | Reggae                 |
    | 17      | Rock                   |
    | 18      | Techno                 |
    | 19      | Industrial             |
    | 20      | Alternative            |
    | 21      | Ska                    |
    | 22      | Death metal            |
    | 23      | Pranks                 |
    | 24      | Soundtrack             |
    | 25      | Euro-Techno            |
    | 26      | Ambient                |
    | 27      | Trip-Hop               |
    | 28      | Vocal                  |
    | 29      | Jazz & Funk            |
    | 30      | Fusion                 |
    | 31      | Trance                 |
    | 32      | Classical              |
    | 33      | Instrumental           |
    | 34      | Acid                   |
    | 35      | House                  |
    | 36      | Game                   |
    | 37      | Sound clip             |
    | 38      | Gospel                 |
    | 39      | Noise                  |
    | 40      | Alternative Rock       |
    | 41      | Bass                   |
    | 42      | Soul                   |
    | 43      | Punk                   |
    | 44      | Space                  |
    | 45      | Meditative             |
    | 46      | Instrumental Pop       |
    | 47      | Instrumental Rock      |
    | 48      | Ethnic                 |
    | 49      | Gothic                 |
    | 50      | Darkwave               |
    | 51      | Techno-Industrial      |
    | 52      | Electronic             |
    | 53      | Pop-Folk               |
    | 54      | Eurodance              |
    | 55      | Dream                  |
    | 56      | Southern Rock          |
    | 57      | Comedy                 |
    | 58      | Cult                   |
    | 59      | Gangsta                |
    | 60      | Top 40                 |
    | 61      | Christian Rap          |
    | 62      | Pop/Funk               |
    | 63      | Jungle                 |
    | 64      | Native US              |
    | 65      | Cabaret                |
    | 66      | New Wave               |
    | 67      | Psychedelic            |
    | 68      | Rave                   |
    | 69      | Show tunes             |
    | 70      | Trailer                |
    | 71      | Lo-Fi                  |
    | 72      | Tribal                 |
    | 73      | Acid Punk              |
    | 74      | Acid Jazz              |
    | 75      | Polka                  |
    | 76      | Retro                  |
    | 77      | Musical                |
    | 78      | Rock ’n’ Roll          |
    | 79      | Hard rock              |
    | 80      | Folk                   |
    | 81      | Folk-Rock              |
    | 82      | National Folk          |
    | 83      | Swing                  |
    | 84      | Fast Fusion            |
    | 85      | Bebop                  |
    | 86      | Latin                  |
    | 87      | Revival                |
    | 88      | Celtic                 |
    | 89      | Bluegrass              |
    | 90      | Avantgarde             |
    | 91      | Gothic Rock            |
    | 92      | Progressive Rock       |
    | 93      | Psychedelic Rock       |
    | 94      | Symphonic Rock         |
    | 95      | Slow rock              |
    | 96      | Big Band               |
    | 97      | Chorus                 |
    | 98      | Easy Listening         |
    | 99      | Acoustic               |
    | 100     | Humour                 |
    | 101     | Speech                 |
    | 102     | Chanson                |
    | 103     | Opera                  |
    | 104     | Chamber music          |
    | 105     | Sonata                 |
    | 106     | Symphony               |
    | 107     | Booty bass             |
    | 108     | Primus                 |
    | 109     | Porn groove            |
    | 110     | Satire                 |
    | 111     | Slow jam               |
    | 112     | Club                   |
    | 113     | Tango                  |
    | 114     | Samba                  |
    | 115     | Folklore               |
    | 116     | Ballad                 |
    | 117     | Power ballad           |
    | 118     | Rhythmic Soul          |
    | 119     | Freestyle              |
    | 120     | Duet                   |
    | 121     | Punk Rock              |
    | 122     | Drum solo              |
    | 123     | A cappella             |
    | 124     | Euro-House             |
    | 125     | Dancehall              |
    | 126     | Goa                    |
    | 127     | Drum & Bass            |
    | 128     | Club-House             |
    | 129     | Hardcore Techno        |
    | 130     | Terror                 |
    | 131     | Indie                  |
    | 132     | BritPop                |
    | 133     | Negerpunk              |
    | 134     | Polsk Punk             |
    | 135     | Beat                   |
    | 136     | Christian Gangsta Rap  |
    | 137     | Heavy Metal            |
    | 138     | Black Metal            |
    | 139     | Crossover              |
    | 140     | Contemporary Christian |
    | 141     | Christian rock         |
    | 142     | Merengue               |
    | 143     | Salsa                  |
    | 144     | Thrash Metal           |
    | 145     | Anime                  |
    | 146     | Jpop                   |
    | 147     | Synthpop               |
    | 148     | Abstract               |
    | 149     | Art Rock               |
    | 150     | Baroque                |
    | 151     | Bhangra                |
    | 152     | Big beat               |
    | 153     | Breakbeat              |
    | 154     | Chillout               |
    | 155     | Downtempo              |
    | 156     | Dub                    |
    | 157     | EBM                    |
    | 158     | Eclectic               |
    | 159     | Electro                |
    | 160     | Electroclash           |
    | 161     | Emo                    |
    | 162     | Experimental           |
    | 163     | Garage                 |
    | 164     | Global                 |
    | 165     | IDM                    |
    | 166     | Illbient               |
    | 167     | Industro-Goth          |
    | 168     | Jam Band               |
    | 169     | Krautrock              |
    | 170     | Leftfield              |
    | 171     | Lounge                 |
    | 172     | Math Rock              |
    | 173     | New Romantic           |
    | 174     | Nu-Breakz              |
    | 175     | Post-Punk              |
    | 176     | Post-Rock              |
    | 177     | Psytrance              |
    | 178     | Shoegaze               |
    | 179     | Space Rock             |
    | 180     | Trop Rock              |
    | 181     | World Music            |
    | 182     | Neoclassical           |
    | 183     | Audiobook              |
    | 184     | Audio theatre          |
    | 185     | Neue Deutsche Welle    |
    | 186     | Podcast                |
    | 187     | Indie-Rock             |
    | 188     | G-Funk                 |
    | 189     | Dubstep                |
    | 190     | Garage Rock            |
    | 191     | Psybient               |
    +---------+------------------------+
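The field table above adds up to a fixed 128-byte record, which maps directly onto a `struct` format string. A minimal sketch, assuming the 3-byte header holds the ASCII text "TAG" and that text fields are NUL or space padded Latin-1; the function names and the `sample` tag are invented for the example.

```python
import struct

ID3V1_FORMAT = "<3s30s30s30s4s30sB"  # header, title, artist, album, year, comment, genre

def _text(raw: bytes) -> str:
    """Cut at the first NUL byte and drop trailing padding spaces."""
    return raw.split(b"\x00", 1)[0].decode("latin-1").rstrip()

def parse_id3v1(tag: bytes):
    """Parse a 128-byte ID3v1 / ID3v1.1 tag; return None if it does not look like one."""
    if len(tag) != 128 or tag[:3] != b"TAG":
        return None
    _, title, artist, album, year, comment, genre = struct.unpack(ID3V1_FORMAT, tag)
    track = None
    # ID3v1.1: a zero byte followed by a non-zero byte at the end of the comment
    # means the last byte is the track number and the comment is only 28 bytes.
    if comment[28] == 0 and comment[29] != 0:
        track = comment[29]
        comment = comment[:28]
    return {
        "title": _text(title), "artist": _text(artist), "album": _text(album),
        "year": _text(year), "comment": _text(comment), "track": track, "genre": genre,
    }

# Synthetic ID3v1.1 tag: track 7, genre 17 (Rock in the table above).
sample = struct.pack("<3s30s30s30s4s28sBBB",
                     b"TAG", b"Song", b"Band", b"Record", b"1999", b"hi", 0, 7, 17)
print(parse_id3v1(sample))
```

In an actual MP3 file the tag, when present, is the last 128 bytes of the file, so it can be read with a seek of -128 from the end.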

exif

#define exif::: \ I _____ _____ ___ \ I | __\ \/ /_ _| __| \ I | _| > < | || _| \ I |___/_/\_\___|_| I
"EXchangeable Image File format"
• widely used
— used for storing:
    • camera information and settings
    • date
    • location
    • thumbnail
    • notes
    • copyright
— only defined to be applicable to a handful of formats:
    • jpeg
    • png
    • webp
    • tiff
    • wav (and variations)
• tools (usually) refuse to operate on filetypes the standard does not cover
Programs:
    exif
    exiftool
    ImageMagick

audio

#define audio:: \ I---------------------------------\ I _ _ \ I /\ | (_) \ I / \ _ _ __| |_ ___ \ I / /\ \| | | |/ _` | |/ _ \ \ I / ____ \ |_| | (_| | | (_) | \ I /_/ \_\__,_|\__,_|_|\___/ \ I---------------------------------I

mp

#define mp3::: \ __ __ ___ ____ \ | \/ | _ \__ / \ | |\/| | _/|_ \ \ |_| |_|_| |___/ I
• compressed (lossy)
• quality is largely dependent on bit rate
— common bit rates (kbit/s):
    128 160 192 256
• metadata: ID3
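Since the bit rate fixes how many bits are spent per second of audio, it also fixes the file size. A back-of-the-envelope sketch, assuming a constant bit rate, 1 kbit = 1000 bits, and ignoring metadata and frame overhead; the function name and the 4 minute track length are just for illustration.

```python
def estimated_size_bytes(bitrate_kbps: int, duration_s: float) -> int:
    """Approximate audio payload size for a constant bit rate stream."""
    return int(bitrate_kbps * 1000 * duration_s / 8)  # bits per second -> bytes

# A 4 minute (240 s) track at the common bit rates listed above.
for rate in (128, 160, 192, 256):
    print(rate, estimated_size_bytes(rate, 240))
# 128 3840000
# 160 4800000
# 192 5760000
# 256 7680000
```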

flac

#define flac\ ___ _ _ ___ \ | __| | /_\ / __| \ | _|| |__ / _ \ (__ \ |_| |____/_/ \_\___| I "Free Lossless Audio Codec" • lossless

m

#define m4a\ __ __ _ _ _ \ | \/ | | | /_\ \ | |\/| |_ _/ _ \ \ |_| |_| |_/_/ \_\ I

wav

#define wav\ __ _____ __ \ \ \ / /_\ \ / / \ \ \/\/ / _ \ V / \ \_/\_/_/ \_\_/ I "WAVeform audio file"

wma

#define wma\ __ ____ __ _ \ \ \ / / \/ | /_\ \ \ \/\/ /| |\/| |/ _ \ \ \_/\_/ |_| |_/_/ \_\ I "Windows Media Audio"

aac

#define aac\ _ _ ___ \ /_\ /_\ / __| \ / _ \ / _ \ (__ \ /_/ \_\/_/ \_\___| I "Advanced Audio Coding"

image

#define image:: \ I--------------------------------------\ I _____ \ I |_ _| \ I | | _ __ ___ __ _ __ _ ___ \ I | | | '_ ` _ \ / _` |/ _` |/ _ \ \ I _| |_| | | | | | (_| | (_| | __/ \ I |_____|_| |_| |_|\__,_|\__, |\___| \ I __/ | \ I |___/ \ I--------------------------------------I

bmp

#define bmp::: \ ___ __ __ ___ \ | _ ) \/ | _ \ \ | _ \ |\/| | _/ \ |___/_| |_|_| I
"BitMaP"
• stores images as an uncompressed array of pixel values
• alpha capable
• lossless
• very widely supported
• implementations are simple
• no decompression -> speed
• ideal for performance sensitive tasks (games) (exception: ultra HD graphics where even the compressed textures will take 100s of GiBs to store)
• very large file sizes
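Because the pixel array is stored uncompressed, the file size follows from the image dimensions alone. A rough sketch, assuming the common layout of a 14-byte file header plus a 40-byte BITMAPINFOHEADER, no palette, and rows padded to a multiple of 4 bytes; the function name is made up for the example.

```python
def bmp_file_size(width: int, height: int, bits_per_pixel: int = 24) -> int:
    """Size in bytes of an uncompressed BMP (14 + 40 byte headers, no palette)."""
    row_bytes = (bits_per_pixel * width + 31) // 32 * 4  # each row is padded to 4 bytes
    return 14 + 40 + row_bytes * height

print(bmp_file_size(1920, 1080))  # 6220854
print(bmp_file_size(3, 1))        # 66 (9 bytes of pixels + 3 bytes of row padding + 54)
```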

gif

#define gif::: \ ___ ___ ___ \ / __|_ _| __| \ | (_ || || _| \ \___|___|_| I
"Graphics Interchange Format"
• each pixel value is stored as a reference (index value) to a color table
• supports animation
• known as THE animated image format
• only 256 different colors are allowed in any frame
• it's technically lossless, but the 256 color limit fucks up the colors
○ processing
    quantize colors to a palette (max 256 entries) -> replace pixels with palette indices -> LZW compression
• widely supported
• reasonable implementation complexity
• reasonable sizes
• reasonable decoding times
• ideal for transferring unimportant images (memes) over slow connections (early internet) for slow machines (early PCs)
• the mangled colors ruin the image quality
• not ideal for long term storage
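The color table idea (every pixel is an index into a table of at most 256 colors) can be shown without touching the actual GIF container. A minimal sketch: `to_indexed` is a hypothetical helper, pixels are plain (r, g, b) tuples, and no color quantization or compression is attempted.

```python
def to_indexed(pixels):
    """Split a list of (r, g, b) pixels into a color table and per-pixel indices."""
    palette = []   # the color table
    lookup = {}    # color -> position in the table
    indices = []   # one table reference per pixel
    for color in pixels:
        if color not in lookup:
            if len(palette) == 256:
                raise ValueError("more than 256 distinct colors; quantize first")
            lookup[color] = len(palette)
            palette.append(color)
        indices.append(lookup[color])
    return palette, indices

pixels = [(255, 0, 0), (0, 0, 255), (255, 0, 0), (255, 255, 255)]
palette, indices = to_indexed(pixels)
print(palette)  # [(255, 0, 0), (0, 0, 255), (255, 255, 255)]
print(indices)  # [0, 1, 0, 2]
```

An image with more than 256 distinct colors has to be reduced to a palette first, which is where the color damage comes from.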

png

#define png::: \ ___ _ _ ___ \ | _ \ \| |/ __| \ | _/ .` | (_ | \ |_| |_|\_|\___| I
"Portable Network Graphics"/"PNG is Not GIF"
• RGB(A), grayscale and indexed color only (no CMYK)
• lossless
• mildly difficult implementation
• mildly expensive encoding

jpg

#define jpg\ #define jpeg::: \ _ ___ ___ ___ \ _ | | _ \ __/ __| \ | || | _/ _| (_ | \ \__/|_| |___\___| I
"Joint (P)hotographic Experts Group"
• 8x8 blocks
• 3 channels
• YCbCr (YUV)
○ processing
    convert to YCbCr -> segment into 8x8 blocks -> subtract 128 -> DCT -> quantization -> huffman
• compresses well
• lossy (lossless variant is available)
JP2: "JPeg 2000"
    • no blocks
    • exif metadata swapped out for XML
    • 256 channels
    • EBCOT compression
    • it is possible to store different parts of the same picture using different quality
    • decodable at multiple resolutions (saving computational time {for thumbnails})
    • high implementation complexity
    • high decoding times
    • terrible support (i dont think i saw one in my life) // 2024
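Baseline JPEG applies a discrete cosine transform (DCT) to each 8x8 block after shifting the 8-bit samples by 128. A slow but readable sketch of that single step, in plain Python and without the quantization or entropy coding stages; the function names and the flat test block are invented for the example.

```python
import math

N = 8  # JPEG block edge length

def level_shift(block):
    """Center 8-bit samples around zero by subtracting 128."""
    return [[value - 128 for value in row] for row in block]

def dct_2d(block):
    """Orthonormal 2D DCT-II of an N x N block (direct O(N^4) evaluation)."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out

# A flat block carries all of its energy in the single DC coefficient.
flat = [[200] * N for _ in range(N)]
coeffs = dct_2d(level_shift(flat))
print(round(coeffs[0][0], 6))  # 576.0
```

Real encoders use fast factored DCTs; the direct sum here is only meant to show what is being computed.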